Word Count/Character Count in Editor Using Scripting (JS)

Hi @Sebastian,

Is there a simple way to use JS scripting to compute a readTime value? The idea is to count the characters or words of a schema field of type string (edited with the vanilla or a custom editor) and save the result in a number field in Squidex. The script would rerun on every content update, which would give us simple read-time functionality at the Squidex level.

Hi @Sebastian, could you please help with this query?

Editor:

We have a custom editor included in Squidex.

The editor shows a word count in its footer section.

Requirement:

Our requirement is to copy the word count from the custom editor into the input field placed next to it, which is named wordCount.

In the example, the word count of 2 needs to be written to the wordCount input field.

Could you please help us achieve this with a JS script?

I do not see a direct fix, but you could use a json field with a custom editor.

Instead of just saving the string you save your content as a json object.

{
  "text": "<p>Hello World</p>",
  "wordCount": 2
}

The only disadvantage is that json fields cannot be searched.
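A minimal sketch of how a custom editor could pack and unpack such a value. The helper names toFieldValue and fromFieldValue are hypothetical, and the word count is assumed to come from the editor itself (for example from its footer). How the value is handed back to Squidex depends on your editor integration and is not shown.

```javascript
// Hypothetical helpers for a custom editor that stores a JSON object
// instead of a plain string.
function toFieldValue(html, wordCount) {
  return { text: html, wordCount: wordCount };
}

// Accepts either the JSON object or an older plain string value,
// so content saved before the change can still be loaded.
function fromFieldValue(value) {
  if (value && typeof value === 'object') {
    return { text: value.text || '', wordCount: value.wordCount || 0 };
  }
  return { text: typeof value === 'string' ? value : '', wordCount: 0 };
}

var example = toFieldValue('<p>Hello World</p>', 2);
```

The fallback for plain strings is only a convenience for content that was saved before switching to the JSON shape.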

You could also extend this approach and keep two more fields next to the JSON field:

  • textAndWordCount (JSON)
  • text (String)
  • wordCount (Number)

Then you use a normal script to extract both values from the JSON field:

ctx.data.text.iv = ctx.data.textAndWordCount.iv.text;
ctx.data.wordCount.iv = ctx.data.textAndWordCount.iv.wordCount;
replace();

Another alternative is a script, but this is complicated.

First you have to extract the text from the HTML. This is the hard part, and you would have to look for a regex solution or something similar.

After that, the word count should be easy, as long as you do not have to deal with Chinese characters or similar.
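A rough sketch of that regex idea in plain JavaScript. The function names stripHtml and countWords are made up for illustration, and the approach is deliberately naive: it does not decode entities other than &nbsp;, and it treats every tag as a word boundary, so inline markup inside a word would split it in two.

```javascript
// Naive tag stripping: replace every tag with a space, then
// collapse runs of whitespace. Not a real HTML parser.
function stripHtml(html) {
  return html
    .replace(/<[^>]*>/g, ' ')
    .replace(/&nbsp;/g, ' ')
    .replace(/\s+/g, ' ')
    .trim();
}

// Counts whitespace-separated tokens. Only meaningful for languages
// that separate words with spaces.
function countWords(text) {
  var trimmed = text.trim();
  return trimmed === '' ? 0 : trimmed.split(/\s+/).length;
}

var plain = stripHtml('<p>Hello World</p>');
var words = countWords(plain);
```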

I have added a few helpers to the scripting:

  • html2Text: Converts html to plain text
  • markdown2Text: Converts markdown to plain text
  • characterCount: Counts the number of characters in a text.
  • wordCount: Counts the number of words in a text.

In your case you could do something like

ctx.data.wordCount.iv = wordCount(html2Text(ctx.data.text.iv));
replace();

ATTENTION: It is not deployed yet.
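For the read time asked about at the top of the thread, the word count could be turned into minutes with a small helper. This is only a sketch: readTimeMinutes is a hypothetical function, the readTime field is assumed to exist in your schema, and 200 words per minute is an assumed reading speed, not something Squidex defines.

```javascript
// Converts a word count into whole minutes, rounding up.
// wordsPerMinute is optional and defaults to an assumed 200.
function readTimeMinutes(words, wordsPerMinute) {
  var wpm = wordsPerMinute || 200;
  if (!words || words <= 0) {
    return 0;
  }
  return Math.ceil(words / wpm);
}

// Possible use inside a Squidex script (field names are assumptions):
// ctx.data.readTime.iv = readTimeMinutes(wordCount(html2Text(ctx.data.text.iv)));
// replace();
```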

Hi @Sebastian, we have implemented the last approach you mentioned.

When I type some words in the editor of the article and click save, I get the following error:

Have the new helpers you mentioned not been deployed yet?

I have added a big ATTENTION to my latest post.

@Sebastian Is this deployed? Can we use it to auto-calculate the word count and save it in a field in Squidex?

I think so, just try it out please.

@Sebastian need help on this one

(i) wordCount works :grin: but currently only for English :sweat_smile: (the fields for the other languages are not updated). Please note that I have created a localized field of type number called words in the cloud app 'vannadevd'. Only the English field gets updated; the other languages stay empty.

(ii) characterCount is not working.

(iii) I think I might be making a mistake with the language codes.

:zipper_mouth_face::zipper_mouth_face: By analogy with e.g. ctx.data.words.en :zipper_mouth_face::zipper_mouth_face:
(I am struggling with the language codes in scripting for everything except English; maybe that is why it is not working.)

i) For zh-TW, should I use

ctx.data.words.zh-TW or ctx.data.words.tw

ii) For zh-HK, should I use

ctx.data.words.zh-HK or ctx.data.words.hk

Script Code

// Word Count in English
ctx.data.words.en = wordCount(html2Text(ctx.data.body.en));
replace();

// Word Count in Hong Kong Traditional Chinese (zh-HK)
ctx.data.words.hk = wordCount(html2Text(ctx.data.body.hk));
replace();

// Word Count in Taiwanese Traditional Chinese (zh-TW)
ctx.data.words.tw = wordCount(html2Text(ctx.data.body.tw));
replace();

// Character Count in English
ctx.data.chars.en = characterCount(html2Text(ctx.data.body.en));
replace();

Just call replace() once at the end of the script.

@Sebastian my bad… that solved one of the issues

:no_mouth: :no_mouth: :no_mouth: :no_mouth: :no_mouth:
The language problem is not solved. Which codes should I use for zh-TW and zh-HK in the script instead of en?

ctx.data.body.en

:smiley: :smiley: :smiley: :smiley: :smiley: :smiley: :smiley: :smiley:

Now characterCount works fine Thanks!!!

PFA screenshot for reference

:no_mouth: :no_mouth: Other Languages not updating wordCount:no_mouth: :no_mouth:

:smiley: :smiley:English wordCount Works:smiley: :smiley:

Updated Script Code
// Word Count
ctx.data.words.en = wordCount(html2Text(ctx.data.body.en));

// Word Count
(ctx.data.words.zh) = wordCount(html2Text(ctx.data.body.zh));

// Word Count
(ctx.data.words.tw) = wordCount(html2Text(ctx.data.body.tw));

// Character Count
ctx.data.chars.en = characterCount(html2Text(ctx.data.body.en));

// Required to be called only once at the end of the script
replace();

Exactly the codes that you use. The problem is that zh-CN is not a valid JavaScript identifier, so it cannot be used with dot notation (the hyphen is parsed as a minus sign). Therefore you have to use bracket notation:

ctx.data.words['zh-CN']
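A small standalone illustration of the difference, using a plain object as a stand-in for a localized field such as ctx.data.words (the values are made up):

```javascript
// Stand-in for a localized field value such as ctx.data.words.
var words = { en: 12, 'zh-CN': 34 };

// Dot notation works only when the key is a valid identifier.
var english = words.en;

// A hyphenated key needs bracket notation with a quoted string.
var chinese = words['zh-CN'];

// Without brackets, words.zh-CN is parsed as (words.zh) - CN, which
// throws a ReferenceError here because no variable named CN exists.
var dotNotationError = null;
try {
  words.zh-CN;
} catch (e) {
  dotNotationError = e.name;
}
```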


@Sebastian All works like a charm :grin::grin::grin::grin::grin::grin::grin::grin::grin::grin:

FYI Screenshots to prove when things work
Great Support

Taiwanese Word and Character Count

Hongkong Chinese Word and Character Count

Script Code
// Word Count (English)
ctx.data.words.en = wordCount(html2Text(ctx.data.body.en));

// Word Count (Taiwan)
ctx.data.words['zh-TW'] = wordCount(html2Text(ctx.data.body['zh-TW']));

// Word Count (HongKong)
ctx.data.words['zh-HK'] = wordCount(html2Text(ctx.data.body['zh-HK']));

// Word Count (China)
ctx.data.words['zh-CN'] = wordCount(html2Text(ctx.data.body['zh-CN']));

// Character Count (English)
ctx.data.chars.en = characterCount(html2Text(ctx.data.body.en));

// Character Count (Taiwan)
ctx.data.chars['zh-TW'] = characterCount(html2Text(ctx.data.body['zh-TW']));

// Character Count (HongKong)
ctx.data.chars['zh-HK'] = characterCount(html2Text(ctx.data.body['zh-HK']));

// Character Count (China)
ctx.data.chars['zh-CN'] = characterCount(html2Text(ctx.data.body['zh-CN']));

// Required to be called only once at the end of the script
replace();
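The repetition in the script above could probably be collapsed into a loop over the language codes. This is an untested sketch: updateCounts is a hypothetical helper, the Squidex helpers are passed in as parameters so the function can be tried outside Squidex, and it assumes that the body field contains one key per language.

```javascript
// Fills data.words and data.chars for every language key in data.body.
// helpers is expected to provide html2Text, wordCount and characterCount.
function updateCounts(data, helpers) {
  data.words = data.words || {};
  data.chars = data.chars || {};

  var languages = Object.keys(data.body || {});
  for (var i = 0; i < languages.length; i++) {
    var lang = languages[i];
    var text = helpers.html2Text(data.body[lang] || '');

    // Bracket notation works for 'en' as well as for 'zh-TW'.
    data.words[lang] = helpers.wordCount(text);
    data.chars[lang] = helpers.characterCount(text);
  }
  return data;
}

// Possible use inside a Squidex script:
// updateCounts(ctx.data, {
//   html2Text: html2Text,
//   wordCount: wordCount,
//   characterCount: characterCount
// });
// replace();
```

A new language added to the app would then be picked up without touching the script, as long as the body field has a value for it.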

@Sebastian
:sob:
However, I noticed that the word count is off by roughly a factor of 10 for Taiwanese and Hong Kong Chinese (see the reference figures in the screenshot above).

I have no idea how you would calculate the word count for Asian languages. I assumed that it is more or less a character count.

Word count does not understand language semantics.

@Sebastian
I have highlighted the mismatch between Tiny Editor's word count and our word count function (previous screenshot). The character count does not match either. I am trying to work out the best way to count.

How would you calculate the word count for Chinese?
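One common heuristic, sketched here as an assumption and not as what Tiny or the Squidex wordCount helper actually does, is to count every Han ideograph as one word and every whitespace-separated run of other letters or digits as one word. The function name countWordsCjkAware is made up, and the Unicode property escapes it relies on may not be available in every scripting engine.

```javascript
// Heuristic only: each Han ideograph counts as one word, and each
// whitespace-separated token that contains a letter or digit counts
// as one word. Pure punctuation tokens are ignored.
function countWordsCjkAware(text) {
  var han = /\p{Script=Han}/gu;
  var hanCount = (text.match(han) || []).length;

  var otherWords = text
    .replace(han, ' ')
    .split(/\s+/)
    .filter(function (token) {
      return /[\p{L}\p{N}]/u.test(token);
    });

  return hanCount + otherWords.length;
}

var mixed = countWordsCjkAware('Hello 世界');
```

This does no real word segmentation, so it will still differ from editors that follow Unicode word-boundary rules or use a dictionary.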

@Sebastian I guess I would try to dig into how Tiny does it, as the English and Chinese word counts of the same translated articles seem to match.

Yes, I have already done this. Seems to be very sophisticated…

Not sure when I have time for this…

But there are tests :slight_smile:

More for my reference