<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: US-ASCII transliterations of Unicode text</title>
	<atom:link href="http://www.sitepoint.com/blogs/2006/03/03/us-ascii-transliterations-of-unicode-text/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.sitepoint.com/blogs/2006/03/03/us-ascii-transliterations-of-unicode-text/</link>
	<description></description>
	<pubDate>Sun, 07 Sep 2008 20:52:42 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5</generator>
	<item>
		<title>By: dusoft</title>
		<link>http://www.sitepoint.com/blogs/2006/03/03/us-ascii-transliterations-of-unicode-text/#comment-15112</link>
		<dc:creator>dusoft</dc:creator>
		<pubDate>Sun, 05 Mar 2006 22:41:27 +0000</pubDate>
		<guid isPermaLink="false">http://www.sitepoint.com/blogs/?p=1450#comment-15112</guid>
		<description>I don't speak Ukrainian, but I can understand some. The transcribed Cyrillic looks OK, although I would like to hear a Ukrainian speaker's opinion on this.</description>
		<content:encoded><![CDATA[<p>I don&#8217;t speak Ukrainian, but I can understand some. The transcribed Cyrillic looks OK, although I would like to hear a Ukrainian speaker&#8217;s opinion on this.</p>]]></content:encoded>
	</item>
	<item>
		<title>By: HarryF</title>
		<link>http://www.sitepoint.com/blogs/2006/03/03/us-ascii-transliterations-of-unicode-text/#comment-15099</link>
		<dc:creator>HarryF</dc:creator>
		<pubDate>Sun, 05 Mar 2006 18:28:48 +0000</pubDate>
		<guid isPermaLink="false">http://www.sitepoint.com/blogs/?p=1450#comment-15099</guid>
		<description>&lt;blockquote&gt;
I think it’s not good, not even a single word :) It’s not correct to map a single letter in Arabic to a letter in English; it’s quite hard to explain. But for example, the sentence you provided could be something like this:
“Ana qader ala akel alzojaj wa hatha la yo’limony”

As you can see, the sound of a basic letter is almost always accompanied by a vowel.
&lt;/blockquote&gt;

That's what I'd feared. For languages with a closer relationship to the Roman alphabet, it seems to do a good job. Sean notes the limitations &lt;a href="http://search.cpan.org/~sburke/Text-Unidecode-0.04/lib/Text/Unidecode.pm#DESIGN_GOALS_AND_CONSTRAINTS" rel="nofollow"&gt;here&lt;/a&gt;:

&lt;blockquote&gt;
Text::Unidecode is meant to be a transliterator-of-last resort, to be used once you've decided that you can't just display the Unicode data as is, and once you've decided you don't have a more clever, language-specific transliterator available. It transliterates context-insensitively -- that is, a given character is replaced with the same US-ASCII (7-bit ASCII) character or characters, no matter what the surrounding characters are.
&lt;/blockquote&gt;</description>
		<content:encoded><![CDATA[<blockquote><p>
I think it’s not good, not even a single word :) It’s not correct to map a single letter in Arabic to a letter in English; it’s quite hard to explain. But for example, the sentence you provided could be something like this:<br />
“Ana qader ala akel alzojaj wa hatha la yo’limony”</p>
<p>As you can see, the sound of a basic letter is almost always accompanied by a vowel.
</p></blockquote>
<p>That&#8217;s what I&#8217;d feared. For languages with a closer relationship to the Roman alphabet, it seems to do a good job. Sean notes the limitations <a href="http://search.cpan.org/~sburke/Text-Unidecode-0.04/lib/Text/Unidecode.pm#DESIGN_GOALS_AND_CONSTRAINTS" rel="nofollow">here</a>:</p>
<blockquote><p>
Text::Unidecode is meant to be a transliterator-of-last resort, to be used once you&#8217;ve decided that you can&#8217;t just display the Unicode data as is, and once you&#8217;ve decided you don&#8217;t have a more clever, language-specific transliterator available. It transliterates context-insensitively &#8212; that is, a given character is replaced with the same US-ASCII (7-bit ASCII) character or characters, no matter what the surrounding characters are.
</p></blockquote>]]></content:encoded>
	</item>
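A minimal sketch of the context-insensitive, table-driven approach described in the quote above, written in Python purely for illustration. The names ASCII_TABLE and to_ascii are hypothetical and the mapping is deliberately tiny; this is not the code or data of Text::Unidecode or of the PHP port discussed in this thread.

```python
# Hypothetical, deliberately tiny lookup table (an assumption for illustration);
# real transliteration tables cover far more of Unicode.
ASCII_TABLE = {
    "ä": "a",
    "ö": "o",
    "ü": "u",
    "ß": "ss",
}


def to_ascii(text, table=ASCII_TABLE, unknown="?"):
    """Transliterate character by character, ignoring neighbouring characters."""
    out = []
    for ch in text:
        if ch.isascii():
            out.append(ch)  # already US-ASCII, keep unchanged
        else:
            # Same replacement every time, whatever surrounds the character.
            out.append(table.get(ch, unknown))
    return "".join(out)


print(to_ascii("Grüße aus München"))  # prints: Grusse aus Munchen
```

Because every character is looked up on its own, "ü" always becomes "u" in this sketch. A German-specific transliterator would more likely produce "ue", which is the kind of language-specific knowledge a last-resort table lacks.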
	<item>
		<title>By: HarryF</title>
		<link>http://www.sitepoint.com/blogs/2006/03/03/us-ascii-transliterations-of-unicode-text/#comment-15085</link>
		<dc:creator>HarryF</dc:creator>
		<pubDate>Sun, 05 Mar 2006 14:29:22 +0000</pubDate>
		<guid isPermaLink="false">http://www.sitepoint.com/blogs/?p=1450#comment-15085</guid>
		<description>&lt;blockquote&gt;
And oh, do you know what the sentence means? It’s weird: it means “I can eat glass, and that doesn’t hurt me”. Wondering where you got that from.
&lt;/blockquote&gt;

Hmmm - not such a nice thing to teach beginners. It comes from here: http://www.columbia.edu/kermit/utf8.html</description>
		<content:encoded><![CDATA[<blockquote><p>
And oh, do you know what the sentence means? It’s weird: it means “I can eat glass, and that doesn’t hurt me”. Wondering where you got that from.
</p></blockquote>
<p>Hmmm - not such a nice thing to teach beginners. It comes from here: <a href="http://www.columbia.edu/kermit/utf8.html" rel="nofollow">http://www.columbia.edu/kermit/utf8.html</a></p>]]></content:encoded>
	</item>
	<item>
		<title>By: Anonymous</title>
		<link>http://www.sitepoint.com/blogs/2006/03/03/us-ascii-transliterations-of-unicode-text/#comment-15065</link>
		<dc:creator>Anonymous</dc:creator>
		<pubDate>Sun, 05 Mar 2006 04:33:05 +0000</pubDate>
		<guid isPermaLink="false">http://www.sitepoint.com/blogs/?p=1450#comment-15065</guid>
		<description>Great stuff Harry. I can use this on a little project of mine :)</description>
		<content:encoded><![CDATA[<p>Great stuff Harry. I can use this on a little project of mine :)</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Ammar Ibrahim</title>
		<link>http://www.sitepoint.com/blogs/2006/03/03/us-ascii-transliterations-of-unicode-text/#comment-15062</link>
		<dc:creator>Ammar Ibrahim</dc:creator>
		<pubDate>Sun, 05 Mar 2006 03:08:57 +0000</pubDate>
		<guid isPermaLink="false">http://www.sitepoint.com/blogs/?p=1450#comment-15062</guid>
		<description>And oh, do you know what the sentence means? It's weird: it means "I can eat glass, and that doesn't hurt me". Wondering where you got that from.</description>
		<content:encoded><![CDATA[<p>And oh, do you know what the sentence means? It&#8217;s weird: it means &#8220;I can eat glass, and that doesn&#8217;t hurt me&#8221;. Wondering where you got that from.</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Ammar Ibrahim</title>
		<link>http://www.sitepoint.com/blogs/2006/03/03/us-ascii-transliterations-of-unicode-text/#comment-15055</link>
		<dc:creator>Ammar Ibrahim</dc:creator>
		<pubDate>Sat, 04 Mar 2006 23:45:58 +0000</pubDate>
		<guid isPermaLink="false">http://www.sitepoint.com/blogs/?p=1450#comment-15055</guid>
		<description>&lt;blockquote&gt;
Before: *Arabic*: أنا قادر على أكل الزجاج و هذا لا يؤلمني.
After: *Arabic*: 'n qdr `l~ 'kl lzjj w hdh l yw'lmny.
&lt;/blockquote&gt;

I think it's not good, not even a single word :) It's not correct to map a single letter in Arabic to a letter in English; it's quite hard to explain. But for example, the sentence you provided could be something like this:
"Ana qader ala akel alzojaj wa hatha la yo'limony"

As you can see, the sound of a basic letter is almost always accompanied by a vowel.</description>
		<content:encoded><![CDATA[<blockquote><p>
Before: *Arabic*: أنا قادر على أكل الزجاج و هذا لا يؤلمني.<br />
After: *Arabic*: 'n qdr `l~ 'kl lzjj w hdh l yw'lmny.
</p></blockquote>
<p>I think it&#8217;s not good, not even a single word :) It&#8217;s not correct to map a single letter in Arabic to a letter in English; it&#8217;s quite hard to explain. But for example, the sentence you provided could be something like this:<br />
&#8220;Ana qader ala akel alzojaj wa hatha la yo&#8217;limony&#8221;</p>
<p>As you can see, the sound of a basic letter is almost always accompanied by a vowel.</p>]]></content:encoded>
	</item>
	<item>
		<title>By: HarryF</title>
		<link>http://www.sitepoint.com/blogs/2006/03/03/us-ascii-transliterations-of-unicode-text/#comment-15048</link>
		<dc:creator>HarryF</dc:creator>
		<pubDate>Sat, 04 Mar 2006 22:07:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.sitepoint.com/blogs/?p=1450#comment-15048</guid>
		<description>OK - an Arabic example (you can see these for yourself BTW - extract the download somewhere to your webserver and point your browser at the "test" subdirectory).


Before: *Arabic*: أنا قادر على أكل الزجاج و هذا لا يؤلمني.
After: *Arabic*: 'n qdr `l~ 'kl lzjj w hdh l yw'lmny.

I'd be interested in your opinion of how good that is.

&lt;blockquote&gt;
For people who are not aware of this, there is also a (much faster) PHP extension for this already:
http://derickrethans.nl/translit.php
&lt;/blockquote&gt;

Derick - have you thought of doing a pure PHP interface to your database files? I think a problem for many is hosts where they can't install new extensions (plus, for those writing PHP apps for mass deployment, extension dependencies tend to be a problem).</description>
		<content:encoded><![CDATA[<p>OK - an Arabic example (you can see these for yourself BTW - extract the download somewhere to your webserver and point your browser at the &#8220;test&#8221; subdirectory).</p>
<p>Before: *Arabic*: أنا قادر على أكل الزجاج و هذا لا يؤلمني.<br />
After: *Arabic*: 'n qdr `l~ 'kl lzjj w hdh l yw'lmny.</p>
<p>I&#8217;d be interested in your opinion of how good that is.</p>
<blockquote><p>
For people who are not aware of this, there is also a (much faster) PHP extension for this already:<br />
<a href="http://derickrethans.nl/translit.php" rel="nofollow">http://derickrethans.nl/translit.php</a>
</p></blockquote>
<p>Derick - have you thought of doing a pure PHP interface to your database files? I think a problem for many is hosts where they can&#8217;t install new extensions (plus, for those writing PHP apps for mass deployment, extension dependencies tend to be a problem).</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Markus Wolff</title>
		<link>http://www.sitepoint.com/blogs/2006/03/03/us-ascii-transliterations-of-unicode-text/#comment-15040</link>
		<dc:creator>Markus Wolff</dc:creator>
		<pubDate>Sat, 04 Mar 2006 18:41:59 +0000</pubDate>
		<guid isPermaLink="false">http://www.sitepoint.com/blogs/?p=1450#comment-15040</guid>
		<description>Cool, should've had that a few weeks ago (or simply remembered the translit extension...grrr :-)), when I wrote a really ugly function for Postgres to translate German umlauts to plain ASCII for use with Levenshtein and the like... ah well, time to redo things again ;-)</description>
		<content:encoded><![CDATA[<p>Cool, should&#8217;ve had that a few weeks ago (or simply remembered the translit extension&#8230;grrr :-)), when I wrote a really ugly function for Postgres to translate German umlauts to plain ASCII for use with Levenshtein and the like&#8230; ah well, time to redo things again ;-)</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Ammar Ibrahim</title>
		<link>http://www.sitepoint.com/blogs/2006/03/03/us-ascii-transliterations-of-unicode-text/#comment-15033</link>
		<dc:creator>Ammar Ibrahim</dc:creator>
		<pubDate>Sat, 04 Mar 2006 16:54:09 +0000</pubDate>
		<guid isPermaLink="false">http://www.sitepoint.com/blogs/?p=1450#comment-15033</guid>
		<description>I'm an Arabic speaker, Harry. I'd help you if that's possible.</description>
		<content:encoded><![CDATA[<p>I&#8217;m an Arabic speaker, Harry. I&#8217;d help you if that&#8217;s possible.</p>]]></content:encoded>
	</item>
</channel>
</rss>
