Need help lowercasing a text with some exceptions

I want to make a function that can lowercase any text, which in itself is easy to do. However, I can’t wrap my head around how to lowercase a text and still keep a capital first letter on words that come after these signs: . ! ?

What I need to do is use conditionals and for-loops for this.

An example would be transforming this text: “the Little Girl! She Is Annoyed. She Does Not Know Why” into: “the little girl! She is annoyed. She does not know why”

I’m really lost as to how to solve this :( so any help would be appreciated.

Sorry if this is too vague.

This seems a little restricted. It would be way easier without those restrictions. (because then you’d just use toLowerCase() and replace() )

It’s only required that “conditionals” and “for-loops” be included in the solution. However, this does not mean I can’t use other functionality as well.

I would probably convert all to lowercase, then run a for-loop for each character. When you hit a “!” (or any other character that forces the next to be uppercase) set a Boolean flag to true. When you next hit a letter character, if that Boolean flag is true, then uppercase it and set the flag to false.

Pseudo-code:

string = lowercase(orig-string)
flag = false; // or true, if you always want to upcase the first letter
for (a=0; a<len(string); a++) { 
  if (mid(string,a,1) == "!" || mid(string,a,1) == "." || mid(string,a,1)=="?" ) flag = true
  if (flag == true) { 
    if (mid(string, a,1) >= "a" && mid(string,a,1) <= "z") { // use a better check to see if it's a letter
      mid(string,a,1) = uppercase(mid(string,a,1))
      flag = false
      }
   }
}

Seems like a great solution. I just don’t know what the “mid” and the “(string,a,1)” do in this case.

that’s the part I’d leave to a RegExp:

function test(str)
{
  return str
    .toLowerCase()
    .replace(/([.!?]\s*)(\w)/g, (m, a, b) => a + b.toUpperCase())
}

That’s just pseudo-code from some random language, in this case BASIC. Mid is a function common in that language to extract a part (the “middle”) of a string, starting at position a and going on for n (in this case 1) characters. string in the above code is the lower-case version of the original string that I called orig-string in the first line. And I’ve used the same function for the replacement as well (by assigning to it), though not many dialects support using it that way around.
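
For anyone following along in JavaScript, here is a rough sketch of what mid(string, a, 1) does. The helper name mid is made up to mirror the pseudo-code (it is not a built-in), and it assumes positions count from 0, as the loop in the pseudo-code does.

```javascript
// Hypothetical helper mirroring the pseudo-code's mid(string, start, length).
// Assumes 0-based positions, like the loop in the pseudo-code.
function mid(str, start, length) {
  return str.substring(start, start + length);
}

const word = "hello";
const viaMid = mid(word, 1, 1);    // one character starting at position 1
const viaIndex = word[1];          // same character, using index access
const viaCharAt = word.charAt(1);  // same character, using charAt
console.log(viaMid, viaIndex, viaCharAt);
```

Reading a character this way is fine, but JavaScript strings cannot be changed in place, so the assignment form from the pseudo-code has to become "build up a new string" instead.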

I’ve never managed to get my head around regular expressions, so I didn’t consider that. In any case the OP said they had a need for a for-loop in the finished code.

Here’s the regex example.

The regex part itself is ([.!?]\s*)(\w) where the parentheses mark two capture groups, which will be a and b.
The first capture group is [.!?]\s* which matches any single character inside of the square brackets, followed by any whitespace after it (zero or more characters).
The second capture group is (\w) which matches the next word character.

All of that is inside of /.../g where the slashes are the delimiters of a regular expression.
Normally the regular expression is done only to the first thing that matches, but the g says to do it globally over the whole thing.
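
To see what the g flag changes, here is a small sketch. The sample text and the variable names are made up for illustration; the regular expression and the replacement function are the same as in the post above.

```javascript
const text = "hi! there. you";
// Same replacement as above: keep group a, upper-case group b.
const upcase = (m, a, b) => a + b.toUpperCase();

// Without g, only the first match is replaced.
const firstOnly = text.replace(/([.!?]\s*)(\w)/, upcase);
// With g, every match is replaced.
const everyMatch = text.replace(/([.!?]\s*)(\w)/g, upcase);

console.log(firstOnly);  // hi! There. you
console.log(everyMatch); // hi! There. You
```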

I was aware of that, but I wanted to show how much less code it takes when using a RegExp compared to explicit loops & conditions.

i did figure it out in the end by doing this:


function smallCapiltalLetters(text) {
  var lowered = text.toLowerCase();
  var result = "";
  for (var i = 0; i < lowered.length; i++) {
    if (lowered[i] == "." || lowered[i] == "!") {
      if (i < lowered.length - 2) {
        result += lowered[i];
        result += lowered[i + 1];
        result += ("" + lowered[i + 2]).toUpperCase();
        i++;
        i++;
      }
    }
    else {
      result = result + lowered[i];
    }
  }
  return result;
}

smallCapiltalLetters ("the little Girl. snow. tomorrow! there will be. a meeting! ok!");

Although the only problem with this is that it will delete the ! after ok!

And if I delete the -2 after lowered.length, the output will be UNDEFINEDUNDEFINED
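
A small sketch of where that UNDEFINED text comes from (the variable names here are made up for illustration). Reading a string index past the end gives undefined rather than an error, and "" + undefined turns it into the text "undefined", which then gets upper-cased. The missing ! is the other side of the same guard: when i < lowered.length - 2 is false, the inner if appends nothing, and the else branch belongs to the outer if, so it never runs for that character.

```javascript
const s = "ok!";
const pastEnd = s[s.length];          // no character here, so this is undefined
const asText = "" + pastEnd;          // concatenation turns it into "undefined"
const shouted = asText.toUpperCase(); // which upper-cases to "UNDEFINED"
console.log(pastEnd, asText, shouted);
```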

And what if the string doesn’t have a space after the punctuation symbol, or has two?

You’re totally right about that, and it has been a problem, so the function is limited in that respect.

However, I also wanted this function to mainly use for-loops and conditionals, as it is intended to showcase what we have learned and put it to use in a simple manner (without regex and functions I haven’t learned yet).

I did figure out how to include the ! or . at the end of a sentence by inserting another else branch at the end, identical to the other else branch.

That brings up the question: “What have you learned?”
conditional control structure?
control logic?
data types?

Would something like this not do the trick, as per the pseudo-code I posted earlier?

function smallCapiltalLetters(text) {
  var lowered = text.toLowerCase();
  var result = "";
  var convert = false;
  // loop through the string
  for (var i = 0; i < lowered.length; i++) {
    var ch = lowered[i];
    if (ch == "." || ch == "!" || ch == "?") {
      // found one of our triggers, next letter is to be upper-cased
      convert = true;
    }
    if (convert == true && ch >= "a" && ch <= "z") {
      // the trigger is set and this is a letter: upcase it and reset the flag
      ch = ch.toUpperCase();
      convert = false;
    }
    // JavaScript strings are immutable, so assigning to lowered[i] would
    // silently do nothing; build up a new string instead
    result += ch;
  } // for-loop
  return result;
}   // function

The only thing I’m not sure about (as I’m a bit shaky on javascript) is whether I can do that in-line conversion of a single character to uppercase.

There’s also an issue with whether the string "It was the end! 1 of them would have to leave." should have a capital on the “Of” - my function above would put one there, but it really shouldn’t. Easily fixed, though, by just resetting the flag as soon as a non-space character is found. That said, it’s not really a proper way to write things (it should read “one of them”, not “1 of them”) so maybe it shouldn’t be considered.
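
A minimal sketch of that flag-reset idea, using only a for-loop and conditionals. The function name capitalizeSentences and the variable names are made up for illustration, and it assumes only plain a-z letters need upper-casing and that only the space character should be skipped while waiting for the next letter.

```javascript
// Hypothetical name; a sketch of the "reset on first non-space" idea.
function capitalizeSentences(text) {
  const lowered = text.toLowerCase();
  let result = "";
  let pending = false; // true while waiting for the first character of a sentence
  for (let i = 0; i < lowered.length; i++) {
    let ch = lowered[i];
    if (ch === "." || ch === "!" || ch === "?") {
      pending = true;
    } else if (pending && ch !== " ") {
      // First non-space character after the punctuation:
      // upper-case it if it is a letter, and reset the flag either way.
      if (ch >= "a" && ch <= "z") {
        ch = ch.toUpperCase();
      }
      pending = false;
    }
    result += ch; // strings are immutable, so build a new one
  }
  return result;
}

console.log(capitalizeSentences("It was the END! 1 of them would have to leave. ok"));
// it was the end! 1 of them would have to leave. Ok
```

Because the flag is reset by the "1", the "of" after it stays lower-case, and any number of spaces (or none) after the punctuation is handled the same way.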

I find that using regular expressions helps to simplify things dramatically.
Using “The little girl. Snow. Tomorrow! There will be. A meeting! Ok! It was the end! 1 of them would have to leave.” as the example, we can get a sentence by selecting all of the text that ends with a full stop, an exclamation mark or a question mark.

A first approach is to use a regular expression /.../ to search for all characters .* up until one of [...] the punctuation marks .!? is found, followed by a space \s, and doing it globally g over the whole string.

var str = "the little Girl. snow. tomorrow! there will be. a meeting? ok! it was the end!";
var sentenceRx = /.*[.!?]\s/g;
str.replace(sentenceRx, function (sentence) {
  console.log(sentence);
});

The output from this though shows almost the whole string as a single match. We expected one sentence at a time. The .* is the problem here, because the asterisk is greedy. We can tell it to not be greedy and only take the minimum text by adding a question mark, making it .*?

var sentenceRx = /.*?[.!?]\s/g;

Each sentence that is followed by a space is now output on its own.
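
A quick way to compare the greedy and non-greedy versions side by side is match(), which returns the matches as an array. This is just a sketch; the sample string and variable names are made up for illustration.

```javascript
const sample = "one. two! three? ";

// Greedy: .* grabs as much as it can, so everything ends up in one match.
const greedy = sample.match(/.*[.!?]\s/g);
// Non-greedy: .*? stops at the first punctuation mark followed by a space.
const lazy = sample.match(/.*?[.!?]\s/g);

console.log(greedy); // [ 'one. two! three? ' ]
console.log(lazy);   // [ 'one. ', 'two! ', 'three? ' ]
```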

Eventually we will return the updated sentence from the function, but for now we can examine the output to find out if what we want to occur is working.

While it’s now possible to make the first character of sentence a capital, and the rest of sentence lowercase, there is an easier way from here.

We can use parentheses in the regular expression to indicate what are called capture groups. Each capture group is passed in to the function as a separate argument.

One capture group can be placed around the first character (.), and a second capture group around the rest of the sentence (.*?[.!?]\s).

var sentenceRx = /(.)(.*?[.!?]\s)/g;

We can expect the above code to still output each sentence, which it does.

When we now return the combined sentence from the function, we can assign the result to a new variable called initialCapSentence

var str = "the little Girl. snow. tomorrow! there will be. a meeting? ok! it was the end!";
var sentenceRx = /(.)(.*?[.!?]\s)/g;
var initialCapSentence = str.replace(sentenceRx, function (sentence, firstChar, restOfSentence) {
  return firstChar + restOfSentence;
});
console.log(initialCapSentence);

The sentence variable is now not used, so we can just rename it to its default name of match instead.

var initialCapSentence = str.replace(sentenceRx, function (match, firstChar, restOfSentence) {

The last thing to do now is to capitalise the first character, and lowercase the rest.

  return firstChar.toUpperCase() + restOfSentence.toLowerCase();

All of the sentences were done except for the last one. Why not? It doesn’t have a trailing space.
We can tell the regular expression that the trailing space is optional \s? and it will include that last sentence.

var sentenceRx = /(.)(.*?[.!?]\s?)/g;
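
A small sketch of the difference the optional space makes, again using match() on a made-up sample string (the variable names are for illustration only):

```javascript
const twoSentences = "one. two!";

// Trailing space required: the last sentence has none, so it is skipped.
const spaceRequired = twoSentences.match(/(.)(.*?[.!?]\s)/g);
// Trailing space optional: the last sentence is matched too.
const spaceOptional = twoSentences.match(/(.)(.*?[.!?]\s?)/g);

console.log(spaceRequired); // [ 'one. ' ]
console.log(spaceOptional); // [ 'one. ', 'two!' ]
```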

Are there any other examples of a sentence that might cause trouble? Possibly, and the code can be easily updated to help deal with any when we come across them.

By breaking up the string into bite-sized sentences, processing them becomes really easy.
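
Two inputs that do seem to trip it up, as a sketch: an abbreviation with full stops inside it, and trailing text with no closing punctuation. The wrapper function name capitalizeWithRegex and the sample strings are made up for illustration; the regular expression and replacement are the ones from this walkthrough.

```javascript
// Hypothetical wrapper around the regular expression from this walkthrough.
function capitalizeWithRegex(str) {
  const sentenceRx = /(.)(.*?[.!?]\s?)/g;
  return str.replace(sentenceRx, function (match, firstChar, restOfSentence) {
    return firstChar.toUpperCase() + restOfSentence.toLowerCase();
  });
}

// Each full stop inside "e.g." ends a "sentence", so the g gets capitalised.
const abbreviation = capitalizeWithRegex("see e.g. this one.");
// Text after the last punctuation mark never matches, so it is left untouched.
const unfinished = capitalizeWithRegex("hello. BYE");

console.log(abbreviation); // See e.G. This one.
console.log(unfinished);   // Hello. BYE
```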

The full code is:

var str = "the little Girl. snow. tomorrow! there will be. a meeting? ok! it was the end!";
var sentenceRx = /(.)(.*?[.!?]\s?)/g;
var initialCapSentence = str.replace(sentenceRx, function (match, firstChar, restOfSentence) {
    return firstChar.toUpperCase() + restOfSentence.toLowerCase();
});
console.log(initialCapSentence);
// The little girl. Snow. Tomorrow! There will be. A meeting? Ok! It was the end!

my thoughts exactly, see post #7

I’m hoping that the step-by-step process helps to show how the development occurs from an initial idea to the end result.

And it’s a very clear explanation, but my head still explodes by the time I’m a short way down it. But I will come back to it when I can run the code on something and actually see what happens, rather than just trying to picture it.

I suspect regex, for me, will remain one of those things that I can (hopefully) get my head around while I’m using it, but then have to start again from scratch next time I want to try. Much like, it unfortunately turns out, HTML and CSS.

Mostly it’s learned by doing, I agree. There are some good reference places that can help, such as https://regexone.com/ or https://www.regular-expressions.info/tutorial.html
