Counting with an Arbitrary Character Set

    James Edwards

    Something small and uncontroversial this week, as we look at a simple yet flexible technique for counting with an arbitrary character set. It’s not something you’re likely to need very often; but when you do, you’ll find that none of JavaScript’s built-in functions are quite designed to handle it.

    JavaScript does have built-in functions for parsing and converting numbers between different numerical bases. For example, the parseInt method can work with any radix (numerical base) from 2 to 36, and is commonly used for number conversion and counting in non-decimal bases. The Number.toString method can reciprocate, converting decimal numbers back to non-decimal number-strings:

    var character = "2F";
    alert(parseInt(character, 16));    //alerts 47
    
    var number = 47;
    alert(number.toString(16));        //alerts "2f" (toString returns lower-case letters)
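    To see where the built-ins stop being useful, here is a minimal sketch of a round trip through hexadecimal, followed by what happens with a radix of 37. The variable names are purely illustrative:

```javascript
// Round-trip a value through hexadecimal with the built-ins.
var hex = (47).toString(16);          // lower-case digits: "2f"
var back = parseInt(hex, 16);         // parseInt accepts either case: 47

// Outside the 2 to 36 range the two built-ins fail in different ways.
var parsed = parseInt("2F", 37);      // NaN, with no exception
var threw = false;
try {
    (47).toString(37);
} catch (e) {
    threw = e instanceof RangeError;  // toString throws a RangeError
}
```

    Neither function lets you choose which symbols stand for which digits, and that's the gap the rest of this article fills.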

    But what if you wanted to count using Klingon numerals? Or more likely perhaps, using Greek or Cyrillic letters, hieroglyphics, or some kind of runes? The technique I’m going to demonstrate can do exactly that, in any numerical base; and to illustrate this fully, I’ll show you some examples of working with upper-case Greek letters in hexadecimal (base 16).

    It’s All in the Lexicon

    So the very first thing we need to do is define a lexicon, which is a dictionary of the characters we’ll be using, defined as a single string of Unicode escape sequences. In this case, we have 16 upper-case Greek letters, from Alpha to Pi. Each digit is represented by a letter, and the length of the overall string determines the numerical base:

    var lexicon = "\u0391\u0392\u0393\u0394\u0395\u0396\u0397\u0398\u0399\u039a\u039b\u039c\u039d\u039e\u039f\u03a0";

    An Escape Sequence is One Character

    It’s worth noting that, even though it takes six typed characters to define a Unicode escape sequence, it still counts as only one character in the string, and therefore the lexicon is 16 characters long.
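    A quick way to convince yourself of this is to check the string's length and a couple of its characters. This is only a sketch; the lexicon is split across two lines purely for readability, and the extra variable names are illustrative:

```javascript
// Sixteen upper-case Greek letters, Alpha to Pi, written as escape sequences.
var lexicon = "\u0391\u0392\u0393\u0394\u0395\u0396\u0397\u0398"
            + "\u0399\u039a\u039b\u039c\u039d\u039e\u039f\u03a0";

var base = lexicon.length;                 // 16, one per escape sequence
var sameAsLiteral = ("\u0398" === "Θ");    // true: the escape is just another spelling
var first = lexicon.charAt(0);             // Greek capital Alpha
var last = lexicon.charAt(base - 1);       // Greek capital Pi
```

    One caveat: this holds for \uXXXX escapes, which cover characters up to U+FFFF. Characters beyond that range (Egyptian hieroglyphs, for instance) occupy two code units in a JavaScript string, so length and charAt would no longer line up one-to-one with the symbols.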

    Once we have the lexicon, we can refer to a character by numerical index using String.charAt, and conversely, get the numerical index of a character using String.indexOf:

    var number = lexicon.indexOf("\u0398");    //the decimal equivalent of "Θ", which is 7
    
    var character = lexicon.charAt(7);         //the character equivalent of 7

    So any computations we do will be based on those two methods. For example, let’s define a for-loop that runs for "Κ" iterations (nine, since Kappa sits at index 9), and lists each character from Alpha up to Iota:

    var str = "";
    for(var i=0; i<lexicon.indexOf("\u039a"); i++)
    {
        str += lexicon.charAt(i) + "\n";
    }
    alert(str);

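    But what about larger numbers, say, displaying the character equivalent of 23? We simply have to extract the individual decimal digits, and then grab the character equivalent of each, in this case 2 and 3. Note that this is a digit-for-digit transliteration, so the result still reads as a base-10 number written with Greek symbols: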

    var target = 23;
    
    var conversion = lexicon.charAt(Math.floor(target / 10))
                   + lexicon.charAt(target % 10);
    
    alert(conversion);

    Just to make things really interesting, what if the number we want to convert contains letters as well as digits, such as the hex number "2F"? In that case we’d have to convert each digit individually, because we can’t refer to a character by hexadecimal index (i.e. lexicon.charAt("F") would have to become lexicon.charAt(15)):

    var target = "2F";
    
    var conversion = lexicon.charAt(parseInt(target.charAt(0), 16))
                   + lexicon.charAt(parseInt(target.charAt(1), 16));
    
    alert(conversion);

    Of course, the last two examples are fairly simplistic, because the number of digits is known; but it wouldn’t be difficult to adapt the process to iterate through as many digits as the number contains. All the components you need are here; it’s just a case of adapting them for your precise requirements.
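    As one possible adaptation, here is a minimal sketch that handles any number of digits. It differs slightly from the examples above: rather than transliterating digits that are already written out, it does a true positional conversion, dividing repeatedly by the lexicon's length. The function names toLexicon and fromLexicon are hypothetical helpers invented for this illustration, and they assume non-negative integers:

```javascript
var lexicon = "\u0391\u0392\u0393\u0394\u0395\u0396\u0397\u0398"
            + "\u0399\u039a\u039b\u039c\u039d\u039e\u039f\u03a0";

// Hypothetical helper: convert a non-negative integer into lexicon characters.
function toLexicon(number, lexicon) {
    var base = lexicon.length;
    var result = "";
    do {
        // The remainder picks the lowest digit; prepend it and move up a place.
        result = lexicon.charAt(number % base) + result;
        number = Math.floor(number / base);
    } while (number > 0);
    return result;
}

// Hypothetical helper: the inverse. Returns NaN for characters not in the lexicon.
function fromLexicon(string, lexicon) {
    var base = lexicon.length;
    var number = 0;
    for (var i = 0; i < string.length; i++) {
        var digit = lexicon.indexOf(string.charAt(i));
        if (digit < 0) { return NaN; }
        number = number * base + digit;
    }
    return number;
}

var greek = toLexicon(47, lexicon);          // "ΓΠ", the Greek spelling of hex 2F
var decimal = fromLexicon(greek, lexicon);   // 47
```

    Because the base is read from lexicon.length, the same two functions should work unchanged with a lexicon of any size.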

    It’s the Data That Counts!

    As it happens, you can use exactly the same approach to count using normal Latin numerals and letters, should the need arise. And the extensible nature of the lexicon means you can use it to extend JavaScript’s native abilities into radixes greater than 36, with whatever symbols seem appropriate at the time.
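    To illustrate the point about larger radixes, here is a hedged sketch of counting in base 62. The ordering of the symbols (digits, then upper-case, then lower-case letters) is an arbitrary choice made for this example, and encode is a hypothetical helper rather than anything built in:

```javascript
// Assumed 62-symbol lexicon: digits, then upper-case, then lower-case letters.
var lexicon62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

// Hypothetical helper: repeated division by the lexicon length.
function encode(number, lexicon) {
    var base = lexicon.length;
    var result = "";
    do {
        result = lexicon.charAt(number % base) + result;
        number = Math.floor(number / base);
    } while (number > 0);
    return result;
}

var small = encode(61, lexicon62);     // "z", the highest single digit
var rolled = encode(62, lexicon62);    // "10", one sixty-two and no units
var big = encode(3843, lexicon62);     // "zz", that is 61 * 62 + 61
```

    Number.toString would throw for a radix of 62, whereas here the only limit is how many distinct symbols you care to put in the string.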

    Or maybe just to develop some funky clocks!
