Build Your Own Chrome Extension: a Google Documents Word Count Tool, Part 2

This entry is part 2 of 3 in the series Build a Google Documents Word Count Tool

Build a Google Documents Word Count Tool

Hello, and welcome to Part 2 of our Build your own Chrome Extension tutorial!

In the previous instalment of this series, we created a simple Google Chrome extension that adds a persistent word counter to your open Google Documents. The extension detects when it is in the right context to activate, and periodically runs the routine that counts the words in the document. While not very accurate, it was a decent, usable first version that served its original purpose well.

Now let’s have a look at how we can improve it slightly. In this part of the series, we’ll update the manifest file, add a page counter, add some precision to the word counter and, last but not least, completely remove jQuery and replace it with vanilla JavaScript.

Let’s get started!

Updating the Manifest File

As pointed out by Michael in the previous instalment’s comments section, and Mohamed Mansour in my Google+ post, Google is gradually phasing out support for manifest v1 files.

The manifest files, as previously mentioned, are files that describe to Chrome what an extension does, how it does it, and whether or not it should be allowed to do something. The directives in these files often change with new versions as the web develops and Google Chrome adds new developer features, so in order to keep our extension “marketable” on the Chrome Web Store, we need to add the following line anywhere in our manifest file:

 "manifest_version" : 2, 

I added it immediately under the “version”. While we’re at it, let’s bump up the version value to 0.2.

The new version of the manifest, however, has some additional requirements. We now need to list all the resources we’ll be loading “on the side” in a web_accessible_resources key. Without it, our statusbar.html won’t load and will throw the error “Denying load of chrome-extension://…/statusbar.html. Resources must be listed in the web_accessible_resources manifest key in order to be loaded by web pages.” To avoid this, we simply add the following content to our manifest file:

 "web_accessible_resources" : ["statusbar.html"] 

That’s it! If you try reloading the extension now (as per part one), everything should go as expected and no warnings should be shown.

Adding a Page Counter

We learned last time that the “row” element in a Google document is a span with the class “kix-lineview-text-block”. Upon further inspection, we learn that the element that contains the actual page is, predictably, a div with the class “kix-page”. As such, it should be no trouble at all adding a page counter to our word counter.

Change the content of the countWords() method in main.js to the following:

var pageCount = $('div.kix-page').length; 
var wordCount = 0; 
$('span.kix-lineview-text-block').each(function(i, obj){ 
  wordCount += $(obj).text().split(/\s+/).length; 
}); 
$('span#GDWC_wordsTotal').text(pageCount + ' pages, ' + wordCount + ' total words'); 
timeout = setTimeout(countWords, 5000); 

As you can see, we’ve added a new variable, pageCount. Since there’s nothing to break apart and the elements are already defined as pages, all we have to do is count them by using the length property. We then simply prepend the “page number” message to our “total words” message, and we’re set. Feel free to reload the extension and give it a go.

Adding Precision to the Word Counter

You may have noticed that our word counter simply splits each line on whitespace to figure out word counts. Let’s make it slightly more precise by changing this line of the countWords() function:

wordCount += $(obj).text().split(/\s+/).length; 

to

var words = $(obj).text().match(/\S+/g); 
wordCount += words && words.length || 0; 

Instead of splitting, which overcounts whenever a line is empty or begins or ends with whitespace (each of those produces an extra empty string in the result), we now globally match every series of non-space characters. This means every unbroken run of non-whitespace characters is interpreted as a word, which is a little bit closer to the human definition of “word” as well. Since match() returns null when it finds nothing, we fall back to 0 in that case.
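
To see the difference concretely, here is a small stand-alone sketch that runs both approaches on a few sample strings. The helper names countBySplit and countByMatch are made up for this illustration and aren’t part of the extension.

```javascript
// Hypothetical helpers for illustration only; not part of main.js.

// The old approach: split on runs of whitespace and count the pieces.
function countBySplit(text) {
  return text.split(/\s+/).length;
}

// The new approach: count runs of non-whitespace characters.
// match() returns null when nothing matches, hence the fallback to 0.
function countByMatch(text) {
  var words = text.match(/\S+/g);
  return words ? words.length : 0;
}

var samples = ['two words', 'two words ', ''];
samples.forEach(function (s) {
  console.log(JSON.stringify(s), countBySplit(s), countByMatch(s));
});
```

The two functions agree on the first sample, but the trailing space and the empty string each give the split version one word too many.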

It is important to note that Google Docs loads content dynamically: that is, only on request. Thus, when you open a document that already has some content in it, first scroll all the way through it and return to the top, so that the browser receives the entire document’s data.

But what if we wanted to exclude punctuation and other symbols from triggering a word count increment as well? All those “…”, commas, periods and runaway apostrophes might offset the proper count and we’d be better off without them. Let’s replace the line

var words = $(obj).text().match(/\S+/g); 

with

var words = $(obj).text().replace(/\W+/g, ' ').match(/\S+/g); 

What we did there was replace every run of one or more non-word characters (anything other than a letter from A to Z, a digit or an underscore) with a single space. This means “…” and “###” become a single space, just like commas, periods and other symbols, and so no longer count as words. While this does add precision by removing trash characters, it removes some precision in counting strings such as dates. For example, 1998.03.05 will become 1998 03 05, thus counting as three words. This introduces some new difficulties which we might tackle in the next instalment. For now, let’s leave it at this.
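
The trade-off is easy to check outside the extension. The sketch below wraps the same expression in a helper called countWordsIgnoringSymbols, a name invented here purely for illustration, and feeds it a few sample strings.

```javascript
// Hypothetical helper for illustration; it mirrors the expression used in countWords().
function countWordsIgnoringSymbols(text) {
  // Collapse every run of non-word characters into one space,
  // then count the remaining runs of non-whitespace characters.
  var words = text.replace(/\W+/g, ' ').match(/\S+/g);
  return words ? words.length : 0;
}

console.log(countWordsIgnoringSymbols('Wait ... what ###'));  // 2: the symbols no longer count
console.log(countWordsIgnoringSymbols('Born on 1998.03.05')); // 5: the date is split into three
console.log(countWordsIgnoringSymbols("don't"));              // 2: the apostrophe splits the word
```

The last case shows that apostrophes inside a word are affected in the same way as the dots in a date.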

Removing jQuery

Removing jQuery matters less for Chrome extensions than for websites, since all the files are downloaded to the client once and kept there (jQuery isn’t fetched remotely every time you run the extension). Still, removing it will slightly decrease our file size and memory footprint, and it lets us look at some alternative JavaScript syntax. And since we’re building this extension only for Google Chrome, we don’t need jQuery’s backwards compatibility and cross-browser support, so shipping the whole library is overkill.

Since main.js is our only file that contains jQuery, let’s open it now and start with the very first command — the ajax call to our statusbar.html file. Change

 $.get(chrome.extension.getURL("statusbar.html"), {}, function(data) {$('body').append(data);}, 'html'); 

to

var xhReq = new XMLHttpRequest(); 
xhReq.onreadystatechange = function onSumResponse() { 
  // readyState 4 (DONE) means the whole response has arrived
  if (xhReq.readyState === 4) { 
    var serverResponse = xhReq.responseText; 
    var body = document.getElementsByTagName("body")[0]; 
    // wrap the status bar markup in a div and append it to the page
    var div = document.createElement('div'); 
    div.innerHTML = serverResponse; 
    body.appendChild(div); 
  } 
}; 
xhReq.open("GET", chrome.extension.getURL("statusbar.html"), true); 
xhReq.send(null);

Hmm, we turned an extremely simple line of code into a mess. Or did we? This is basically what our previous code did: jQuery merely served as a wrapper for a call identical to this one. So while this is a little more code to write, it actually causes less code to run, because there’s no overhead of calling jQuery, which in turn needs to decide which XHR wrapper to call next, and so on.

So what does this mess do? First, it instantiates an XMLHttpRequest, which is a JS object “used to send HTTP or HTTPS requests directly to a web server and load the server response data directly back into the script”. Essentially, it’s the thing that performs the Ajax call. We then make sure that when its readyState property changes to 4 (DONE, meaning the request has finished), it fetches the text of our response (our statusbar), injects it into an empty div and appends this div to the end of “body”. Finally, we start the request with open() and send().
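
If the boilerplate bothers you, the same steps can be tucked into a small helper. This is only a sketch: the name fetchText and its optional third parameter are assumptions made for this example, and the third parameter exists solely so the function can be tried outside a browser with a stand-in for XMLHttpRequest.

```javascript
// Hypothetical helper for illustration; the article keeps this code inline in main.js.
// url    - the address to request
// onDone - callback that receives the response text
// Xhr    - optional constructor to use instead of XMLHttpRequest (an assumption
//          added here so the helper can be exercised without a browser)
function fetchText(url, onDone, Xhr) {
  var req = new (Xhr || XMLHttpRequest)();
  req.onreadystatechange = function () {
    // 4 (DONE): the request has finished and responseText is complete
    if (req.readyState === 4) {
      onDone(req.responseText);
    }
  };
  req.open('GET', url, true);
  req.send(null);
}

// In the extension, usage might look like this (left commented out,
// because it needs the browser's document and chrome objects):
// fetchText(chrome.extension.getURL('statusbar.html'), function (html) {
//   var div = document.createElement('div');
//   div.innerHTML = html;
//   document.body.appendChild(div);
// });
```

For a single request like ours, the inline version is perfectly fine; a helper only starts to pay off once there are several such calls.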

Let’s turn our focus to checking if the document is ready for use now. Replace

$(document).ready(function(){ 
  countWords(); 
});

with

var readyStateCheckInterval = setInterval(function() { 
  if (document.readyState === "complete") { 
    countWords(); 
    clearInterval(readyStateCheckInterval); 
  } 
}, 10);

This snippet replaces jQuery’s way of checking whether the document is ready for manipulation with an interval that tests document.readyState every 10ms. Once the state is “complete”, it calls countWords() and clears the interval, so the checking stops.

Now, let’s see what we can do about the pageCount variable. Replace

var pageCount = $('div.kix-page').length; 

with

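var pageCount = 0; 
var divs = document.getElementsByTagName('div'); 
// a plain indexed loop visits only the elements, not the collection's other properties
for (var i = 0, len = divs.length; i < len; i++) { 
  if((" " + divs[i].className + " ").indexOf(" kix-page ") > -1) { pageCount++; } 
}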

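This fetches all the divs on the page and checks whether each one’s class attribute contains ours. Padding both the class string and our class name with spaces makes sure we only match the whole class name, not a fragment of a longer one.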
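
The class check is easier to reason about when pulled out into a function of its own. The sketch below does that with a helper named hasClass, which is invented for this example, as are the class names other than kix-page.

```javascript
// Hypothetical helper for illustration; the article inlines this check instead.
function hasClass(className, target) {
  // Padding both sides with spaces means the target only matches
  // a complete class name, never a fragment of a longer one.
  return (' ' + className + ' ').indexOf(' ' + target + ' ') > -1;
}

console.log(hasClass('kix-page', 'kix-page'));               // true
console.log(hasClass('kix-page some-other', 'kix-page'));    // true
console.log(hasClass('kix-page-content', 'kix-page'));       // false

// Without the padding, a plain substring check is fooled:
console.log('kix-page-content'.indexOf('kix-page') > -1);    // true
```

Note that this assumes class names are separated by single spaces, which is how they normally appear in className.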

Now let’s replace the jQuery span loop which word-counted the lines with a home-made one. Replace

$('span.kix-lineview-text-block').each(function(i, obj){ 
  var words = $(obj).text().replace(/\W+/g, ' ').match(/\S+/g); 
  wordCount += words && words.length || 0; 
});

with

var spans = document.getElementsByTagName('span'); 
// a plain indexed loop visits only the elements, not the collection's other properties
for (var i = 0, len = spans.length; i < len; i++) { 
  if((" " + spans[i].className + " ").indexOf(" kix-lineview-text-block ") > -1) { 
    var words = spans[i].innerText.replace(/\W+/g, ' ').match(/\S+/g); 
    wordCount += words && words.length || 0; 
  } 
}

Finally, we can replace

$('span#GDWC_wordsTotal').text(pageCount + ' pages, ' + wordCount + ' total words');

with

document.getElementById('GDWC_wordsTotal').innerText = pageCount + ' pages, ' + wordCount + ' total words';

… to actually display the message without jQuery. Of course, we also need to remove the loading of jQuery from the extension manifest, so change

"js": ["jq.js","main.js"],

into

"js": ["main.js"],

and feel free to delete the jq.js file.

Conclusion

In this, the second part of a three-part series on creating a Google Chrome extension, we modified our extension slightly to make it perform faster and to bring it up to Google’s newest development standards. We added some precision to our word counter, implemented a page counter alongside the word count, and brought the manifest file up to date with a version declaration and the newly required directives. We also undertook the gargantuan task of converting our jQuery code to vanilla JavaScript, gaining speed and reducing memory usage and file size. In the next and last instalment of this series, we’ll further upgrade the performance of our extension and add some more helpful functionality to the statusbar itself. Stay tuned!


  • http://crakken.com Nick Rameau

    Cool 8)

  • http://javascriptphp.com Yitz Meirovich

    Thanks. This was really helpful. I’ve wanted to create my own extensions for a while and this gets me started. One suggestion though. You’re using a “for in” loop instead of the for / while loop structures. I know you’re enumerating objects, but I’m concerned that on a large document you are going to have a significant performance hit. If you check out http://jsperf.com/orderedloop you’ll see that “for in” has the worst performance in relation to other structures. In researching loop optimization further I’ve found three key characteristics for optimization: 1) use of a “for” or “while” loop structure, 2) use of --i over i++ or even i--, 3) cache the length of the loop and don’t run it in the test itself as in

    var values = [1,2,3,4,5];
    var length = values.length;

    for (var i=length; i--;) {
    console.log(values[i]);
    }

    Or

    var values = [1,2,3,4,5];
    var i = values.length;

    /* i is 1st evaluated and then decremented, when i is 1 the code inside the loop
    is then processed for the last time with i = 0. */
    while(i--)
    {
    //1st time in here i is (length - 1) so it’s ok!
    console.log(values[i]);
    }

    Or even

    var values = [1,2,3,4,5];

    for (var i=0, len=values.length; i<len; i++) {}

    also check out:

    http://jonraasch.com/blog/10-javascript-performance-boosting-tips-from-nicholas-zakas
    https://blogs.oracle.com/greimer/entry/best_way_to_code_a

    • http://about.me/bruno.skvorc Bruno Skvorc

      This is brilliant advice, thank you very much! I’ll make edits to the article as soon as possible, very valuable resources you’ve posted there