Quick one – knocked up a list of “dangerous” functions and functionality in PHP, in relation to the use of UTF-8, available at http://www.phpwact.org/php/i18n/utf-8. These are for a “default” PHP setup without the mbstring overloading or PHP6 (where charset problems “magically vanish” ;)).
This follows on from (unfinished) stuff here on charsets (tending towards UTF-8), which should help explain some of this.
Should point out this is coming from the angle “you can’t rely on the mbstring extension being available”, which is often the case with shared hosts – how do you deal with i18n in such environments? The counter view is here.
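To give a feel for the kind of thing that makes a function “dangerous” here, a rough sketch: strlen() counts bytes, so any multi-byte UTF-8 character throws the count off. The helper name utf8_length_sketch is made up for illustration (it is not from the wiki page), and it assumes well-formed UTF-8 input and a default setup with no mbstring overloading.

```php
<?php
// Hypothetical helper for illustration; not taken from the wiki page.
// Assumes $str is well-formed UTF-8 and that mbstring overloading is off.
function utf8_length_sketch($str)
{
    $count = 0;
    $bytes = strlen($str); // strlen() counts bytes, not characters
    for ($i = 0; $i < $bytes; $i++) {
        // Continuation bytes have the bit pattern 10xxxxxx; skip them so
        // each character is counted once, at its lead byte.
        if ((ord($str[$i]) & 0xC0) !== 0x80) {
            $count++;
        }
    }
    return $count;
}

$word = "na\xC3\xAFve"; // "naïve", with the ï written as its two UTF-8 bytes

echo strlen($word), "\n";             // 6 (bytes)
echo utf8_length_sketch($word), "\n"; // 5 (characters)
```

Counting lead bytes only works if the input really is valid UTF-8, so in practice you would probably want to validate the string first.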
Anyway – hopefully useful as a starting point for analysing PHP code bases when considering UTF-8 (with a little help from phpxref perhaps). If you want to change or add stuff, the wiki requires a login, which you can get here.

Thanks Harry for this immensely useful summary. Had quite a few problems with the PCRE extension myself – now I know how to deal with them.
December 7th, 2005 at 12:35 am
Harry, thanks very much! Useful post, you’ve helped me :)
December 7th, 2005 at 3:54 am