Search when iconv fails

Why was iconv so troublesome?
Depending on the encoding, some characters fit in 7 bits, some in 8, and others need two bytes or more.
iconv converted some characters correctly, but not others.

Using mb_convert_encoding gave better results than my attempts with iconv.
It converted “Forlì” to “Forli” and “Málaga” to “Malaga”,
but it failed with “Đà Nẵng” and “Muḩāfaz̧at al Ḩudaydah”.

When “hack” kludges began to creep into the code, it became clear that I was pushing iconv and mb_convert_encoding to do more than they were intended to do.

One common solution is a “replacement” array of characters. This may work for a small number of characters, but I, for one, would rather not do the tedious work of manually mapping out replacements for a large number of possible characters.
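
For comparison, here is a minimal sketch of what the replacement-array approach might look like, using strtr() with a small hand-written map. The map contents and the function name replace_accented_characters are assumptions made for this example, and only the characters listed in the map are handled.

```php
<?php
declare(strict_types=1);

// Hypothetical hand-maintained map: every character must be listed explicitly.
const REPLACEMENTS = [
  'ì' => 'i',
  'á' => 'a',
  'Đ' => 'D',
  'à' => 'a',
];

// Replace each mapped character; anything not in the map passes through as-is.
function replace_accented_characters(string $text): string {
  return strtr($text, REPLACEMENTS);
}

echo replace_accented_characters('Forlì'), "\n";   // Forli
echo replace_accented_characters('Málaga'), "\n";  // Malaga
// "ẵ" is not in the map, so it is left unchanged here.
echo replace_accented_characters('Đà Nẵng'), "\n";
```

Every character that is missing from the map slips through untouched, which is exactly the maintenance burden described above.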

The Transliterator class to the rescue.

It took a while to determine which “from” and “to” to use out of the 294 possible IDs (for my somewhat limited 575 rows it was “Latin-ASCII”), but once found, success!
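
A minimal sketch of that call, assuming the intl extension is loaded. The helper name to_ascii is hypothetical; the place names are the ones mentioned above.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: transliterate UTF-8 text with ICU's "Latin-ASCII" rules.
// Requires the intl extension.
function to_ascii(string $text): string {
  $transliterator = Transliterator::create('Latin-ASCII');
  if ($transliterator === null) {
    throw new RuntimeException('Could not create the Latin-ASCII transliterator.');
  }
  $result = $transliterator->transliterate($text);
  if ($result === false) {
    throw new RuntimeException('Transliteration failed: ' . $transliterator->getErrorMessage());
  }
  return $result;
}

// Print each name next to its transliterated form.
foreach (['Forlì', 'Málaga', 'Đà Nẵng', 'Muḩāfaz̧at al Ḩudaydah'] as $name) {
  echo $name, ' => ', to_ascii($name), "\n";
}
```

The object-oriented form is used here so that a failure to create the transliterator can be reported separately from a failed conversion; the procedural transliterator_transliterate() would work as well.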

Then came a couple of ALTER TABLE queries to add “transliteration” fields and an INSERT INTO query, and it was on to the next problem on the list: dealing with misspelled and partial input entries.
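
One way this step could look, sketched against an in-memory SQLite database through PDO (so it assumes the pdo_sqlite and intl extensions). The table name places, the column names name and name_ascii, and the use of an UPDATE per row rather than INSERT INTO are all assumptions made for the example.

```php
<?php
declare(strict_types=1);

// Assumed schema for illustration: a "places" table with a UTF-8 "name" column.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE places (id INTEGER PRIMARY KEY, name TEXT NOT NULL)');
$insert = $pdo->prepare('INSERT INTO places (name) VALUES (?)');
foreach (['Forlì', 'Málaga', 'Đà Nẵng'] as $name) {
  $insert->execute([$name]);
}

// Add the transliteration field.
$pdo->exec('ALTER TABLE places ADD COLUMN name_ascii TEXT');

// Fill it row by row with the Latin-ASCII form of each name.
$rows = $pdo->query('SELECT id, name FROM places')->fetchAll(PDO::FETCH_ASSOC);
$update = $pdo->prepare('UPDATE places SET name_ascii = ? WHERE id = ?');
foreach ($rows as $row) {
  $ascii = transliterator_transliterate('Latin-ASCII', $row['name']);
  if ($ascii !== false) {
    $update->execute([$ascii, $row['id']]);
  }
}

// A search can now match plain ASCII input against name_ascii.
$search = $pdo->prepare('SELECT name FROM places WHERE name_ascii = ?');
$search->execute(['Malaga']);
$found = $search->fetchColumn();
echo $found, "\n";
```

Keeping the original name alongside the transliterated one means the search can match ASCII input while the page still displays the properly accented spelling.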

Here is a small test page for trying out the different transliterator IDs:

<?php
declare(strict_types=1);
error_reporting(E_ALL);
ini_set('display_errors', 'true');

$transliterator_id_select = "";
$input_text = "";
$output_text = "";

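// Transliterate $text_string with the given ICU transliterator ID.
// transliterator_transliterate() returns false on failure, which would
// violate the string return type, so fall back to an empty string.
function transliterate_text(string $transliterator_id, string $text_string): string {
  $transliterated_text = transliterator_transliterate($transliterator_id, $text_string);
  return ($transliterated_text === false) ? "" : $transliterated_text;
}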

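$transliterator_list_ids = transliterator_list_ids() ?: []; // false on failure
sort($transliterator_list_ids);
if ($_POST) {
  // Only accept an ID that is actually in the list of known transliterators.
  $posted_id = $_POST['transliterator_id_select'] ?? "";
  if (is_string($posted_id) && in_array($posted_id, $transliterator_list_ids, true)) {
    $transliterator_id_select = $posted_id;
  }
  $posted_text = $_POST['input_text'] ?? "";
  if (is_string($posted_text)) {
    $input_text = $posted_text;
  }
  // Skip the call when no valid ID was chosen.
  if ($transliterator_id_select !== "") {
    $output_text = transliterate_text($transliterator_id_select, $input_text);
  }
}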
?>
<!DOCTYPE HTML>
<html lang="en">
<head>
<title>Intl Transliteration</title>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width,initial-scale=1" />
<style type="text/css">
#display_div {
  border: 3px double #444;
  padding: 0.5em 1em;
  overflow: auto;
}
</style>
<script type="text/javascript">
// script needed before the DOM is loaded here
</script>
</head>
<body>
<h1>Intl Transliteration</h1>
<form action="#" method="post">
  <fieldset>
    <legend>Intl Transliteration</legend>
	
	<label for="transliterator_id_select">Start-End</label>
	<select name="transliterator_id_select" id="transliterator_id_select">
<?php 
foreach ($transliterator_list_ids as $transliterator_list_id) {
  // Preselect the submitted ID, or "Latin-ASCII" on the first page load.
  $is_selected = ($transliterator_id_select === $transliterator_list_id)
    || ((!$_POST) && ($transliterator_list_id === "Latin-ASCII"));
  // Escape the ID before writing it into the markup.
  $escaped_id = htmlspecialchars($transliterator_list_id, ENT_QUOTES, 'UTF-8');
  echo '<option value="' . $escaped_id . '"' . ($is_selected ? ' selected="selected"' : '') . '>' . $escaped_id . '</option>' . "\n";
}
?>
    </select>

	<textarea name="input_text" rows="10" cols="50"><?php  echo htmlspecialchars($input_text, ENT_QUOTES, 'UTF-8'); ?></textarea>

	<input type="submit" value="Transliterate" />
	
  </fieldset>
</form>
  <div><?php  echo htmlspecialchars($transliterator_id_select, ENT_QUOTES, 'UTF-8'); ?></div>
  <div id="display_div"><pre><?php  echo htmlspecialchars($output_text, ENT_QUOTES, 'UTF-8'); ?></pre></div>
<script type="text/javascript">
// script that needs the DOM to be loaded here
</script>
</body>
</html>