PHP CLI and Google Translation

I've just been doing some heavy translation from German to English, which led to writing a simple command-line tool for translating via Google: wasting time to save time, in other words. Getting a prototype up from scratch took about 15 minutes (which sheds some light on blog spam, I guess). I've cleaned it up a bit and am dropping it here in case anyone can use it.

Two major things still need doing:

- validation of the language pair CLI option, to keep users sane

- character set conversion: right now the script asks Google for UTF-8 so XML_HTMLSax can parse correctly (it relies on the str_ functions), but it doesn't do anything clever after that.
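
As a rough idea of what the first item could look like, here is a minimal sketch of a language pair check. The function name gtrans_normalize_langpair and the list of known codes are assumptions for illustration only; they are not part of the script below, and the codes Google actually accepts would need to be checked separately.

```php
<?php
// Hypothetical helper, not part of the script below: checks a "de-en" style
// option and returns the "de|en" form the script posts as langpair,
// or null if the option is malformed.
// The $known list is illustrative only, not Google's real list.
function gtrans_normalize_langpair($option, array $known = array('de', 'en', 'fr', 'es', 'it', 'pt'))
{
    // Expect exactly two lowercase two-letter codes joined by a hyphen.
    if (!preg_match('/^([a-z]{2})-([a-z]{2})$/', $option, $m)) {
        return null;
    }

    list(, $from, $to) = $m;

    // Reject unknown codes and pairs that translate a language into itself.
    if ($from === $to || !in_array($from, $known, true) || !in_array($to, $known, true)) {
        return null;
    }

    return $from . '|' . $to;
}
```

In the option loop, the 'l' case could call this instead of the bare str_replace() and show the usage text when it returns null.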

…and there’s probably a bug or three.


#!/usr/local/bin/php
<?php
/**
 * License: http://www.php.net/license/3_0.txt
 */
//------------------------------------------------------------------------------
require_once 'Console/Getopt.php';
require_once 'HTTP/Request.php';
require_once 'XML/HTMLSax3.php';

//------------------------------------------------------------------------------
class Google_Translate {

    var $langpair;
    var $proxy = null;

    function Google_Translate($langpair, $proxy = null) {

        $this->langpair = $langpair;
        $this->proxy = $proxy;

    }

    // POSTs the text to Google and returns the raw HTML of the response
    function query($text) {
        $R = & new HTTP_Request('http://translate.google.com/translate_t');
        $R->setMethod('POST');

        if ( !is_null($this->proxy) ) {
            $pxy = parse_url($this->proxy);
            $R->setProxy($pxy['host'],$pxy['port'],$pxy['user'],$pxy['pass']);
        }

        // utf8_encode() assumes the input is ISO-8859-1
        $R->addPostData('text', utf8_encode($text));
        $R->addPostData('langpair', $this->langpair);
        $R->addPostData('ie','UTF8');
        $R->addPostData('oe','UTF8');

        $res = $R->sendRequest();

        if ( PEAR::isError($res) ) {
            fwrite(STDERR, "Connection problem: ".$res->toString()."\n");
            exit(1);
        }

        if ($R->getResponseCode() != '200') {
            // Errors go to STDERR so they don't mix with translated output
            fwrite(STDERR, "Invalid HTTP Status: ".$R->getResponseCode()."\n");
            exit(1);
        }

        return $R->getResponseBody();
    }

    function translate($text) {
        $P = & new Google_Translate_Parser();
        return $P->parse($this->query($text));
    }
}

//------------------------------------------------------------------------------
class Google_Translate_Parser {

    var $inResult = FALSE;
    var $result = '';

    // The translation comes back inside <textarea name="q">
    function open($p, $tag, $attrs) {
        if ( $tag == 'textarea' && isset($attrs['name']) && $attrs['name'] == 'q' ) {
            $this->inResult = TRUE;
        }
    }

    function close($p, $tag) {
        if ( $this->inResult && $tag == 'textarea' ) {
            $this->inResult = FALSE;
        }
    }

    // Collect character data only while inside the result textarea
    function data($p, $data) {
        if ( $this->inResult ) {
            $this->result .= $data;
        }
    }

    function parse($html) {

        $P = & new XML_HTMLSax3();
        $P->set_object($this);
        $P->set_element_handler('open','close');
        $P->set_data_handler('data');
        $P->parse($html);

        return utf8_decode($this->result);
    }
}

//------------------------------------------------------------------------------
function usage() {
    $usage = <<<EOD
Usage: ./gtrans.php [ OPTIONS ] [ Text to translate ]

Translates input string via Google's language tools.

  -l=LANGPAIR  Language pair, from-to, e.g. de-en
  -p=PROXY     Proxy server URL (optional)
  -h           Display usage

EOD;
    fwrite(STDOUT,$usage);
    exit(0);
}

//------------------------------------------------------------------------------
$args = Console_Getopt::readPHPArgv();

if ( PEAR::isError($args) ) {
    fwrite(STDERR,$args->getMessage()."\n");
    exit(1);
}

if ( realpath($_SERVER['argv'][0]) == __FILE__ ) {
    $options = Console_Getopt::getOpt($args,'hl:p:');
} else {
    $options = Console_Getopt::getOpt2($args,'hl:p:');
}

if ( PEAR::isError($options) ) {
    fwrite(STDERR,$options->getMessage()."\n");
    usage();
}

$proxy = null;

foreach ( $options[0] as $option ) {
    switch ( $option[0] ) {
        case 'h':
            usage();
        break;

        case 'l':
            // Google expects the pair as "de|en" rather than "de-en"
            $lang = str_replace('-','|',$option[1]);
        break;

        case 'p':
            $proxy = $option[1];
        break;
    }
}

$G = & new Google_Translate($lang, $proxy);
echo $G->translate($options[1][0])."\n";
exit(0);
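
For the character set item on the to-do list, a minimal sketch might look like the following. As posted, the parser always runs utf8_decode() on the result, which means ISO-8859-1 output whatever the terminal expects. The helper name gtrans_to_charset and its $target parameter are assumptions for illustration; the script has no option for choosing a target character set yet.

```php
<?php
// Hypothetical helper, not part of the script above: converts the UTF-8
// text collected by the parser to a chosen target character set.
function gtrans_to_charset($utf8, $target = 'ISO-8859-1')
{
    // Nothing to do if the caller wants UTF-8 anyway.
    if (strcasecmp($target, 'UTF-8') === 0) {
        return $utf8;
    }

    if (function_exists('iconv')) {
        // //TRANSLIT asks iconv to approximate characters the target lacks;
        // support for it varies between iconv implementations.
        $out = @iconv('UTF-8', $target . '//TRANSLIT', $utf8);
        if ($out !== false) {
            return $out;
        }
    }

    if (function_exists('mb_convert_encoding')) {
        return mb_convert_encoding($utf8, $target, 'UTF-8');
    }

    // No conversion extension available: return the text unchanged.
    return $utf8;
}
```

Google_Translate_Parser::parse() could then return gtrans_to_charset($this->result, $charset) in place of the utf8_decode() call, with $charset coming from a new command line option.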

  • ausurt

    Nice work.

    Not to sound picky or anything, your link is ww.php.net, just missing a double you.

    But yes nice work. Could come in handy.

  • MiiJaySung

    Nice work. I had to do something similar a few years ago. I had all my language stuff in define()'s in a set PHP file, so I found the easiest way was to just read all the constants from that file and echo them in a block with the constant name in comments, so …

    define(LANG_HELLO, 'Hello there');

    became …

    Hello there

    I could then use cURL to send the stuff to Google and read it back. Because of the very simple HTML format, I could get away with hacking some regular expressions to put the PHP file back :-)

    Those were the days when my programming knowledge was a bit naff.

  • Ryan

    I did a similar thing with Java. It is embeddable in applications (e.g. jEdit via macros). It works with all of the Google translations as well as Excite.

    http://mills.zapto.org/projects/translate

  • Ivan

    Nice work.