Title Case in PHP


The PHP functions strtoupper and ucwords capitalise all characters in a string, and the first letter of every word in a string, respectively. However, there exists in the standard PHP library no way of achieving Title Case, which involves capitalising all words except for small words (such as conjunctions) when they are not the first word.

The following mini-tutorial will provide a solution similar to the one I came up with for displaying thread titles in the SitePoint Forums’ Highlighted Forum Discussions.

First, we’ll need a list of all the words which we don’t want to capitalise when they are not the first word. The words we should be concerned about are conjunctions (such as and, or), prepositions (in, on) and internal articles (the, a). However, I’m no expert in English grammar so I’ll just call them ‘small words’. Here’s a selection of the ones I use.

$smallwordsarray = array( 'of','a','the','and','an','or','nor','but','is','if','then','else','when', 'at','from','by','on','off','for','in','out','over','to','into','with' );

The explode function in PHP splits a string into an array of strings at every occurrence of a delimiter string. So if we use a single space (' ') as the delimiter, explode will split our string into words.

$words = explode(' ', 'this is a title');

$words becomes an array of strings, each string representing one word from the original string. Here it would be equal to array('this', 'is', 'a', 'title'); explode does not change the case of anything.
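
As a quick check of what explode returns, here is a minimal sketch. The second call is an illustration that is not part of the original script: it shows that two consecutive spaces produce an empty string in the result, which is worth knowing if titles may contain stray whitespace.

```php
<?php
// Split on a single space; explode() leaves the case of each piece alone.
$words = explode(' ', 'this is a title');
print_r($words); // four elements: 'this', 'is', 'a', 'title'

// Illustration only: two consecutive spaces yield an empty string element.
$gappy = explode(' ', 'this  is');
var_dump(count($gappy)); // int(3)
```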

We can now operate on each word separately. We need to test each word to determine if it is one of our ‘small words’. If it is not a small word, or it’s the first word, it should be capitalised. To operate on each member of an array in turn, we can use the PHP language construct foreach. To check if a word is one of our small words, we can use in_array.

foreach ($words as $key => $word) { if (!$key or !in_array($word, $smallwordsarray)) $words[$key] = ucwords($word); }

Notice here that I assigned the value to $words[$key] and not to $word. The reason for this is that $word was created by the foreach statement. Modifying $word will not cause any change to the original array $words. So I need to modify the entry in the original array $words that corresponds to the current array key.
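
The difference between the two kinds of assignment can be seen in a small sketch. The array contents and the $unchanged variable are made up for the illustration.

```php
<?php
$words = array('one', 'two');

// $word is a copy of each element, so this leaves $words untouched.
foreach ($words as $word) {
    $word = ucwords($word);
}
$unchanged = $words; // still array('one', 'two')

// Assigning through the key writes back to the original array.
foreach ($words as $key => $word) {
    $words[$key] = ucwords($word);
}
// $words is now array('One', 'Two')
```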

By this point, we have an array of all the words in our title. Those words which should be capitalised have been, and all that’s left is to join the words together again into a single string. This can be done with implode, which works in the opposite direction to explode.

$newtitle = implode(' ', $words);

Here’s the entire script as a function you’re welcome to use in your application.

function strtotitle($title)
// Converts $title to Title Case, and returns the result.
{
    // Our array of 'small words' which shouldn't be capitalised if
    // they aren't the first word. Add your own words to taste.
    $smallwordsarray = array(
        'of','a','the','and','an','or','nor','but','is','if','then','else','when',
        'at','from','by','on','off','for','in','out','over','to','into','with'
    );

    // Split the string into separate words
    $words = explode(' ', $title);

    foreach ($words as $key => $word)
    {
        // If this word is the first, or it's not one of our small words, capitalise it
        // with ucwords().
        if ($key == 0 or !in_array($word, $smallwordsarray))
            $words[$key] = ucwords($word);
    }

    // Join the words back into a string
    $newtitle = implode(' ', $words);

    return $newtitle;
}

Notice that if the input already contains capital letters, those letters will remain. This ensures that letters that must always be capital, such as acronyms, will remain so.

Try it out with a title, such as

echo strtotitle("this is a title");

The result? “This is a Title”.

As an aside, a quick search on Google found a solution to the same problem in JavaScript. However, it is a nightmare of code – unnecessarily complex. It checks every letter separately! You’d be better off porting my script to JavaScript.

By the way, here’s a similar solution in ColdFusion.

{ 39 comments }

amber.long83 December 13, 2008 at 7:34 pm

nice function. thanks

JimS December 27, 2007 at 7:29 am

My god… what a bunch of whiners (for the most part). And a lot of non-programmers who want to complain about improper grammar. It’s a tool. Use it. Don’t use it. Modify as needed.

And ddzoom…
If you modify ALL-CAPS words, then a title like the earlier example “A Title About HTML Classes” would obviously have its HTML damaged, even though it was input properly the first time around.

Jim

Edan November 8, 2007 at 9:08 pm

Thanks for the code.
Even if it doesn’t provide me a complete grammar resource -
I found it very useful.

Anyway, for the last word problem, I added this line -

$words[array_pop(array_keys($words))] = ucwords($word);
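
One caveat with the line above: array_pop() takes its argument by reference, so passing it the result of array_keys() directly can make PHP complain that only variables should be passed by reference. A minimal sketch of the same idea that avoids this is shown below; the helper name capitalise_last_word is hypothetical, and it relies on explode() returning a 0-based list.

```php
<?php
// Hypothetical helper: capitalise the last element of a word list.
function capitalise_last_word(array $words) {
    if (count($words) > 0) {
        $last = count($words) - 1; // explode() returns a 0-based list
        $words[$last] = ucwords($words[$last]);
    }
    return $words;
}

$words = explode(' ', 'something to believe in');
$words = capitalise_last_word($words);
echo implode(' ', $words); // something to believe In
```

In the article's function, the same two lines inside the if block could go just before the implode call.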

ddzoom April 19, 2007 at 3:01 am

function titlecase($s) {
 $GLOBALS['lc'] = array('a', 'an', 'and', 'as', 'at', 'but', 'by', 'de', 'down', 'for', 'from', 'has', 'in', 'into', 'near', 'nor', 'of', 'off', 'onto', 'or', 'out', 'over', 'per', 'so', 'than', 'the', 'to', 'unto', 'upon', 'van', 'via', 'von', 'with', 'yet'); // , 'up'
 // $GLOBALS['lc'] = array('a', 'an', 'and', 'as', 'but', 'de', 'nor', 'of', 'or', 'per', 'the', 'van', 'via', 'von'); // minimal
 $s = flatten_whitespace($s);
 if (strtoupper($s) == $s) $s = ucwords($s); // disallow ALL CAPS TITLES (shouting)
 $s = preg_replace("/\b(['\w]+?)(?!['\w])/e", 'in_array(strtolower("\1"), $GLOBALS["lc"]) ? strtolower("\1") : (strtoupper("\1") == "\1" ? "\1" : ucfirst(strtolower("\1")));',
 $s); // words in list to lower, else Capitalize unless ALL-CAPS
 $s = preg_replace("/^(\W+)?(['\w]+?)(?!['\w])/e", '"\1" . (strtoupper("\2") == "\2" ? "\2" : ucfirst("\2"))',
 $s); // Capitalize first word
 $s = preg_replace("/(?

mmmmmm….

ddzoom April 19, 2007 at 2:58 am

part 2…

 // Capitalize last word
 $s = preg_replace("/(?

ddzoom April 19, 2007 at 2:55 am

try again…

function titlecase($s) {
 $GLOBALS['lc'] = array('a', 'an', 'and', 'as', 'but', 'de', 'nor', 'of', 'or', 'per', 'the', 'van', 'via', 'von'); // minimal
 $s = flatten_whitespace($s);
 // disallow ALL CAPS TITLES (shouting)
 if (strtoupper($s) == $s) $s = ucwords($s);
 // words in list to lower, else Capitalize unless ALL-CAPS
 $s = preg_replace("/\b(['\w]+?)(?!['\w])/e", 'in_array(strtolower("\1"), $GLOBALS["lc"]) ? strtolower("\1") : (strtoupper("\1") == "\1" ? "\1" : ucfirst(strtolower("\1")));',
 $s);
 // Capitalize first word
 $s = preg_replace("/^(\W+)?(['\w]+?)(?!['\w])/e", '"\1" . (strtoupper("\2") == "\2" ? "\2" : ucfirst("\2"))',
 $s);
 // Capitalize last word
 $s = preg_replace("/(?

ddzoom April 19, 2007 at 2:53 am

I spent an afternoon playing with this, and I’m sorry but your approach seems to be way too simplistic.

The following code gets a little closer, splitting words at (non-apostrophe) boundaries…

function flatten_whitespace($s) { return preg_replace('/\s+/', ' ', $s); } // util
function titlecase($s) {
 $GLOBALS['lc'] = array('a', 'an', 'and', 'as', 'at', 'but', 'by', 'de', 'down', 'for', 'from', 'has', 'in', 'into', 'near', 'nor', 'of', 'off', 'onto', 'or', 'out', 'over', 'per', 'so', 'than', 'the', 'to', 'unto', 'upon', 'van', 'via', 'von', 'with', 'yet'); // , 'up'
 // $GLOBALS['lc'] = array('a', 'an', 'and', 'as', 'but', 'de', 'nor', 'of', 'or', 'per', 'the', 'van', 'via', 'von'); // minimal
 $s = flatten_whitespace($s);
 // disallow ALL CAPS TITLES (shouting)
 if (strtoupper($s) == $s) $s = ucwords($s);
 // words in list to lower, else Capitalize unless ALLCAPS
 $s = preg_replace("/\b(['\w]+?)(?!['\w])/e", 'in_array(strtolower("\1"), $GLOBALS["lc"]) ? strtolower("\1") : (strtoupper("\1") == "\1" ? "\1" : ucfirst(strtolower("\1")));',
 $s);
 // Capitalize first word
 $s = preg_replace("/^(\W+)?(['\w]+?)(?!['\w])/e", '"\1" . (strtoupper("\2") == "\2" ? "\2" : ucfirst("\2"))',
 $s);
 // Capitalize last word
 $s = preg_replace("/(?

Gian Luca March 6, 2007 at 3:06 am

What about hyphenated words like: “A client-centered approach”

Aren’t you supposed to capitalize each portion of the hyphenated word?
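
If your style guide does call for that, one possible approach (a sketch, not something from the article) is the optional second argument of ucwords(), available since PHP 5.4.32 and 5.5.16, which lists the characters to treat as word delimiters.

```php
<?php
// Treat both the space and the hyphen as word boundaries.
echo ucwords('a client-centered approach', ' -'); // A Client-Centered Approach
```

Note that this capitalises every word, so it would still need to be combined with a small-words check.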

codegreen March 2, 2007 at 4:42 pm

Awesome… thanks guys for several good solutions.

Nood February 2, 2007 at 8:10 pm

Why such a long function just to do this!

function titleCase ($str) {
return ucwords(strtolower($str));
}

Anonymous November 29, 2006 at 11:43 pm

The comment I just posted got clipped. Please ignore.

Anonymous November 29, 2006 at 11:35 pm

Because the rules of capitalization vary greatly, I came up with a list of pretty general rules that are most common and programmed to those specs.

Capitalize Everything except:
1) First and last word of a title
2) Articles
3) Prepositions
4) Coordinating Conjunctions
5) Italian Words (I use these a lot)
6) Special Words (“iPod”,”iMovie”,”iTunes”,www,http)

As Shiflett pointed out, it is close to impossible to determine which class a word belongs to, as how it is used in a sentence determines that. But this is the closest I’ve gotten to.

function caseTitle($title){
 $articles = array("the", "a", "an");
 //List of prepositions obtained from Wikipedia.
 $prepositions = array("aboard" , "about" , "above" , "absent" , "across" , "after" , "against" , "along" , "alongside" , "amid" , "amidst" , "among" , "amongst" , "into " , "onto" , "around" , "as" , "astride" , "at" , "atop" , "before" , "behind" , "below" , "beneath" , "beside" , "besides" , "between" , "beyond" , "by" , "despite" , "down" , "during" , "except" , "following" , "for" , "from" , "in" , "inside" , "into" , "like" , "mid" , "minus" , "near" , "nearest" , "notwithstanding" , "of" , "off" , "on" , "onto" , "opposite" , "out" , "outside" , "over" , "past" , "re" , "round" , "since" , "through" , "throughout" , "till" , "to" , "toward" , "towards" , "under" , "underneath" , "unlike" , "until" , "up" , "upon" , "via" , "with" , "within" , "without" , "anti" , "betwixt" , "circa" , "cum" , "per" , "qua" , "sans" , "unto" , "versus" , "vis-a-vis" , "concerning" , "considering" , "regarding");
 $twoWordPrepositions = array("according to" , "ahead of" , "as to" , "aside from" , "because of" , "close to" , "due to" , "far from" , "in to" , "inside of" , "instead of" , "on to" , "out of" , "outside of" , "owing to" , "near to" , "next to" , "prior to" , "subsequent to");
 $threeWordPrepositions = array("as far as" , "as well as" , "by means of" , "in accordance with" , "in addition to" , "in front of" , "in place of" , "in spite of" , "on account of" , "on behalf of" , "on top of" , "with regard to" , "in lieu of");
 $coordinatingConjuctions=array("for","and","nor","but","or","yet","so");
 $italian = array("il", "lo", "la", "gli", "le", "uno", "una", "e",  "d", "l", "s", "un", "di", "da", "su", "tra", "fra", "del", "dello", "della", "dei", "degli", "delle", "al", "allo", "alla", "ai", "agli", "alle","da", "dal", "dallo", "dalla", "dai", "dagli", "dalle", "nel", "nello", "nella", "nei", "negli", "nelle", "col", "collo", "colla", "coi", "cogli", "colle","su", "sul", "sullo", "sulla", "sui", "sugli", "sulle", "pel", "pei");
 //converts words from their array element into their key
 $specialWords = array('iPod'=>'ipod','iMovie'=>'imovie','iTunes'=>'itunes','www'=>'www','http'=>'http');
 //merge the 'small words' list into one array. Allows you to add new libraries of words
 $lowercaseArray = array_merge($articles,$prepositions,$coordinatingConjuctions,$italian);
 $title = ucwords($title);
 $wordArray=explode(' ',$title);
 $numWords = count($wordArray);
 $i=0;
 foreach($wordArray as $word){
  //lower case entire word as in_array function is case sensitive
  $word=strtolower($word);
  //Sets the word to lower case if the word is found in the list of lowercase words
  if(in_array($word,$lowercaseArray)){
   $wordArray[$i]=$word;
  }
  //We use this approach for multiword prepositions and not preg_replace because:
  //e.g "aside from" could be a part of "It is 15 minutes to the seASIDE FROM Venice"
  //Thought of adding a space in front or at the back of the word in the array but a sentence may start or end with a preposition "ASIDE FROM me, Alice has...", "We're not sure what it is NEAR TO"
  if($i>0){
   $twoWord=strtolower($wordArray[$i-1]." ".$wordArray[$i]);
   if(in_array($twoWord,$twoWordPrepositions)){
    $wordArray[$i]=strtolower($wordArray[$i]);
    $wordArray[$i-1]=strtolower($wordArray[$i-1]);
   }
  }
  if($i>1){
   $threeWord=strtolower($wordArray[$i-2]." ".$twoWord);
   if(in_array($threeWord,$threeWordPrepositions)){
    $wordArray[$i]=strtolower($wordArray[$i]);
    $wordArray[$i-1]=strtolower($wordArray[$i-1]);
    $wordArray[$i-2]=strtolower($wordArray[$i-2]);
   }
  }
  $i++;
 }
 //This section include universal rules that override any changes that may have been made above
 for($i=0;$i

Fabiano Shark September 25, 2006 at 4:28 am

r0x a l0t… keep writing xD

Rob Cluett July 10, 2006 at 7:36 am

This is great. I’m using it now.

TWO POINTS:

1) Initialize $smallwordsarray prior to populating it with keywords like this: $smallwordsarray = array();

2) The full code listed above in this procedure does not match the code written and commented on. For example:

if ($key or !in_array($word, $smallwordsarray))

should be

if (!$key or !in_array($word, $smallwordsarray))

Good stuff and thanks for this post.

robuk83 April 12, 2006 at 7:57 pm

Hey,

Something went wrong with the last post so just visit the link in the previous post:
http://uk.php.net/manual/en/function.ucwords.php#60064

Regards,

Rob

robuk83 April 12, 2006 at 7:53 pm

Hi all,

This is a great topic. I am having troubles with this problem myself.

This is the best solution I have come up with so far:
http://uk.php.net/manual/en/function.ucwords.php#60064


function my_ucwords($str, $is_name=false) {
// exceptions to standard case conversion
if ($is_name) {
$all_uppercase = '';
$all_lowercase = 'De La|De Las|Der|Van De|Van Der|Vit De|Von|Or|And';
} else {
// addresses, essay titles ... and anything else
$all_uppercase = 'Po|Rr|Se|Sw|Ne|Nw';
$all_lowercase = 'A|And|As|By|In|Of|Or|To';
}
$prefixes = 'Mc';
$suffixes = "'S";

// capitalize all first letters
$str = preg_replace('/\\b(\\w)/e', 'strtoupper("$1")', strtolower(trim($str)));

if ($all_uppercase) {
// capitalize acronyms and initialisms e.g. PHP
$str = preg_replace("/\\b($all_uppercase)\\b/e", 'strtoupper("$1")', $str);
}
if ($all_lowercase) {
// decapitalize short words e.g. and
if ($is_name) {
// all occurrences will be changed to lowercase
$str = preg_replace("/\\b($all_lowercase)\\b/e", 'strtolower("$1")', $str);
} else {
// first and last word will not be changed to lower case (i.e. titles)
$str = preg_replace("/(?

Hope this helps you guys!

Rob

guest April 10, 2006 at 3:14 am

What about converting MCTAVISH to McTavish! mb-convert-case gives you Mctavish. Is there a simple way around this?
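
One possible workaround, offered only as a sketch: normalise the case first, then re-capitalise the letter that follows a leading "Mc". The helper name fix_mc_prefix is hypothetical, and the rule is deliberately naive, so it will also rewrite any other word that happens to start with "Mc".

```php
<?php
// Hypothetical helper: turn "Mctavish" into "McTavish".
function fix_mc_prefix($name) {
    return preg_replace_callback(
        '/\bMc([a-z])/',
        function ($m) { return 'Mc' . strtoupper($m[1]); },
        $name
    );
}

echo fix_mc_prefix(ucwords(strtolower('MCTAVISH'))); // McTavish
```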

chefchops February 17, 2006 at 12:50 pm

simple css

h1 {
text-transform:capitalize;
}
lol

Aprotim October 12, 2005 at 11:27 am

“a view to a kill”
“come on, eileen”
“come in”

Renee June 7, 2005 at 4:06 pm

Hmm, here’s what the Gregg says on the rules of capitalization. I’d agree that there’s no simple subroutine that can handle this. Aside from the simple differentiation between a preposition and adjective – there’s too many exceptions like ‘Up-to-Date’ and ‘dBase’ that break the following rules.

(360) Capitalize all words with four or more letters. Also, capitalize all words with fewer than four letters, except articles, short conjunctions and short prepositions.

(361) (a) Capitalize the first and last word of a title; (b) capitalize the first word following a dash or colon; (c) capitalize short words in titles that serve as adverbs rather than prepositions; (d) capitalize short prepositions when used with prepositions of four-or-more letters (such as “Sailing Up and Down the St. Lawrence”); (e) do not capitalize word-wrapped words.

(362) Do not capitalize a title when it is incorporated into a sentence as a descriptive phrase (such as “In his book on /economics/, Samuelson points out that…”).

Sarvesh May 3, 2005 at 7:17 pm

Thanks for this handy function. A little bug that I encountered – if I passed text in all upper case, function didn’t do anything. So to get over this I added

$title = strtolower($title);
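
Lower-casing unconditionally also flattens acronyms in mixed-case titles ("HTML" would end up as "Html"). A possible compromise, sketched here with a hypothetical helper named lower_if_all_caps, is to lower-case only when the whole title is in upper case.

```php
<?php
// Hypothetical helper: lower-case a title only when it is entirely upper case,
// so mixed-case input keeps acronyms such as "HTML".
function lower_if_all_caps($title) {
    $hasLetters = strtolower($title) !== strtoupper($title);
    if ($hasLetters && strtoupper($title) === $title) {
        return strtolower($title);
    }
    return $title;
}

echo lower_if_all_caps('THIS IS A TITLE'), "\n";         // this is a title
echo lower_if_all_caps('an introduction to HTML'), "\n"; // an introduction to HTML
```

The result can then be passed to strtotitle() as before.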

shiflett March 16, 2005 at 1:02 am

> The result? “This is a Title”.

Is this really proper grammar somewhere? It certainly isn’t in America, so perhaps this highlights the need to consider locale. The proper capitalization is:

This Is a Title

This particular error can be corrected by removing “is” from your array, because it’s always a verb. Where did you read that verbs should not always be capitalized?

Another error is that you don’t capitalize the last word. Surely this is a pretty universal rule.

These two errors are pretty easy to correct, but there are others that aren’t so easy:

One If by Land
(capitalize subordinating conjunctions)

One Flew Over the Cuckoo’s Nest
(capitalize adverbs)

In fact, it’s no simple task to determine when something is an adverb. The following title shows a situation when “over” is not an adverb:

Victory over the Darkness

I’ve not used mb_convert_case(), but I bet it comes closer to being accurate than strtotitle(), especially since it considers the encoding.

If you want to implement something like this in PHP, I think it’s best to start with something like the MLA Handbook. Focus on the rules that you’re trying to enforce rather than the way you implement them in PHP.

I’m no grammar expert, but several flaws in this implementation are pretty obvious. Again, maybe it’s just because we’re in different countries, but I certainly couldn’t use this function for anything. :-)

mmj March 15, 2005 at 7:12 pm

> Yet looking at the length of the code, I’d like to just point out that for display purposes only, the same thing can be achieved using the CSS text-transform property. It’s a lot easier to implement and works in most browsers (even IE).

elementality, as far as I know there is no way to achieve title case in CSS. You can capitalise all letters (like strtoupper) or all words (like ucwords) but you can’t do title case.

http://www.w3.org/TR/REC-CSS2/text.html#propdef-text-transform

CubitGuy March 15, 2005 at 6:23 pm

Uhh..one liner? Is that one line? ;-)

Lachlan March 15, 2005 at 5:14 pm

I use this one-liner:


function strtotitle($title) {
return trim(ucfirst(str_replace(array("Of ","A ","The ","And ",
"An ", "Or ", "Nor ","But ","If ","Then ","Else ","When ","Up ",
"At ","From ","By ","On ","Off ","For ","In ","Out ","Over ",
"To "),
array("of ","a ","the ","and ","an ","or ","nor ","but ","if ","then ",
"else ","when ","up ","at ","from ","by ","on ","off ","for ","in ",
"out ","over ","to "),ucwords(strtolower($title)))));
}

mrsmiley March 15, 2005 at 4:55 pm

Ok, it got the better of me, had to see if it indeed was better or not. Make your own mind up on this one. The downside to the concept of this function is that you have to maintain the list of small words, which requires you to know the possible languages used on your site.


function strtotitle($title)
// Converts $title to Title Case, and returns the result.
{
// Our array of 'small words' which shouldn't be capitalised if
// they aren't the first word. Add your own words to taste.
$smallwordsarray = array(
'of'=>'Of', 'a'=>'A', 'the'=>'The', 'and'=>'And', 'an'=>'An', 'or'=>'Or', 'nor'=>'Nor',
'but'=>'But', 'is'=>'Is', 'if'=>'If', 'then'=>'Then', 'else'=>'Else', 'when'=>'When',
'at'=>'At', 'from'=>'From', 'by'=>'By', 'on'=>'On', 'off'=>'Off', 'for'=>'For',
'in'=>'In', 'out'=>'Out', 'over'=>'Over', 'to'=>'To', 'into'=>'Into', 'with'=>'With'
);

// Split the string into separate words
$title = ucwords($title);
$words = explode(' ', $title);

if (!is_array($words))
return false;

$len = count($words);

// Ignore the first word, so start at index=1
for ($i = 1; $i < $len; $i++)
{
// If this word is a small word, fix the case
if (($key = array_search($words[$i], $smallwordsarray)) !== (false || null))
$words[$i] = $key;
}

// Join the words back into a string
$newtitle = implode(' ', $words);

return $newtitle;
}

mrsmiley March 15, 2005 at 4:28 pm

Just thinking about performance of this. I haven’t thought too much on it, but if your titles typically contain more words you want to capitalise than “small words”, then you might want to consider running ucwords over the entire string BEFORE you explode it into bits. Then just make the “small words” all lower case. That way you only call ucwords once, and not for every word you want to capitalise. You could minimise that further by just replacing the “small word” with the version in the array which is already in lower case, hence avoiding the need to call strtolower on every “small word”.

Just a thought. May or may not be an issue if you are converting a lot of strings.

elementality March 15, 2005 at 3:26 pm

good thing to know and I’m sure for specific uses this will be very handy.

Yet looking at the length of the code, I’d like to just point out that for display purposes only, the same thing can be achieved using the CSS text-transform property. It’s a lot easier to implement and works in most browsers (even IE).

Obviously, if you need the text to physically be a specific case the code you have provided is invaluable, but if you are only worried about how it displays, CSS I think works just fine.

CubitGuy March 15, 2005 at 3:02 pm

LOL @ “Personally, I prefer to employ experienced editors who know how to use a shift key.” You and me both, unfortunately we don’t always get someone that smart, ya know? I’d rather have this there to at least double check everything.

Thanks for another great tip. Looking forward to more mini-tutorials!

Itshim March 15, 2005 at 1:36 pm

Thank you for the function and explanation. I have not needed to perform an operation like this in a project yet… but I

Greg March 15, 2005 at 1:05 pm

This, of course, won’t work with certain proper nouns/product names like ‘iPod,eMachines,iPaq rx3715,etc’ which need to have the first letter lower-case.

Wez Furlong March 15, 2005 at 10:24 am

Also check out: http://www.php.net/manual/en/function.mb-convert-case.php, which is unicode aware.
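
For reference, a minimal sketch of what mb_convert_case() does in title mode (it needs the mbstring extension). It capitalises every word, so on its own it does not leave small words in lower case.

```php
<?php
// Requires the mbstring extension.
echo mb_convert_case('this is a title', MB_CASE_TITLE, 'UTF-8'), "\n"; // This Is A Title
echo mb_convert_case('élan vital', MB_CASE_TITLE, 'UTF-8'), "\n";      // Élan Vital
```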

Anonymous March 15, 2005 at 10:01 am

The title of this should be, ‘sorta create a title in English, but not if the title contains domain names, uppercased characters and which doesn’t handle certain names, or words containing apostrophes as far as I can tell’. But hey, could be useful to someone. Personally, I prefer to employ experienced editors who know how to use a shift key.

Jon March 15, 2005 at 9:33 am

You have the word “is” in your array; since “is” is a verb, it should be capitalized. At least, that’s how I was taught.

David March 15, 2005 at 7:33 am

You might want to do “return ucfirst($newtitle);” instead, to always get the first letter of the first word of the text in UC. (or refine it more to get every first word in a sentence)

Why don’t you use str_ireplace()? Like:

ucfirst(str_ireplace($smallwordsarray, $smallwordsarray,ucwords($text)))

mmj March 15, 2005 at 7:22 am

> I’d further refine it to ignore any words already in uppercase.
> You wouldn’t want it to mangle something like:
> An Introduction To HTML

Hi Richard,

The code above will not change any letters that are already capitalised such as ‘HTML’, because that is the behaviour of ucwords.
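
A quick sketch of that behaviour: ucwords() only upper-cases the first character of each word and leaves the rest alone. The second line is an added illustration of the flip side, which is the product-name problem raised elsewhere in the comments.

```php
<?php
echo ucwords('an introduction to HTML'), "\n"; // An Introduction To HTML
echo ucwords('an iPod review'), "\n";          // An IPod Review
```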

jez March 15, 2005 at 5:32 am

Very helpful – thank you!

Richard@Home March 15, 2005 at 4:44 am

I’d further refine it to ignore any words already in uppercase.

You wouldn’t want it to mangle something like:

An Introduction To HTML

aaron wormus March 15, 2005 at 3:01 am

or for poor people:

$string = "the title of my blog";

$title = preg_replace("/(^(\w*?)|(\w{4,}?))/e", "ucfirst('$1')", $string);
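
Note that the /e modifier used here (and in several other snippets in these comments) was deprecated in PHP 5.5 and removed in PHP 7.0. A rough equivalent with preg_replace_callback() might look like the sketch below. The pattern is a simplification rather than a literal port: it capitalises the first word plus any word of four or more letters.

```php
<?php
$string = 'the title of my blog';

// Capitalise the first word, plus any word of four or more letters.
$title = preg_replace_callback(
    '/^\w+|\b\w{4,}\b/',
    function ($m) { return ucfirst($m[0]); },
    $string
);

echo $title; // The Title of my Blog
```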
