PHP CLI and Google Translation


I've just been doing some heavy translation from German to English, which led to writing a simple command-line tool for translating via Google: wasting time to save time, in other words. Getting a prototype up from scratch took about 15 minutes (which sheds some light on blog spam, I guess). I've cleaned it up a bit and am dropping it here in case anyone can use it.

Two major things still need doing:

- validation of the language pair CLI option, to keep users sane

- character set conversion: right now the script asks Google for UTF-8 so that XML_HTMLSax can parse the response correctly (it relies on the str_ functions), but it doesn't do anything clever after that.

…and there’s probably a bug or three.
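For the first to-do item, here is a minimal sketch of what a language pair check might look like. The function name normalize_langpair() and the list of allowed pairs are made up for illustration, and the two-letter pattern is an assumption that would reject longer codes; none of this is in the script yet.

```php
<?php
// Hypothetical helper, not part of the script below.
// Turns a CLI value such as "de-en" into the "de|en" form the script posts,
// or returns null when the value is malformed or not in the allowed list.
function normalize_langpair($arg, $allowed)
{
    // Assumption: two lowercase two-letter codes separated by a hyphen.
    if (!preg_match('/^([a-z]{2})-([a-z]{2})$/', $arg, $m)) {
        return null;
    }
    $pair = $m[1] . '|' . $m[2];
    return in_array($pair, $allowed, true) ? $pair : null;
}

// Illustrative list only; which pairs Google really accepts is not checked here.
$allowed = array('de|en', 'en|de', 'fr|en', 'en|fr');

$pair = normalize_langpair('de-en', $allowed);
if (is_null($pair)) {
    fwrite(STDERR, "Unsupported language pair\n");
} else {
    echo $pair, "\n";
}
```

In the option loop, the 'l' case could call something like this in place of the bare str_replace() and bail out with the usage text on null. The script as it currently stands follows.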


#!/usr/local/bin/php
<?php
/**
 * License: http://www.php.net/license/3_0.txt
 */
//------------------------------------------------------------------------------
require_once 'Console/Getopt.php';
require_once 'HTTP/Request.php';
require_once 'XML/HTMLSax3.php';

//------------------------------------------------------------------------------
class Google_Translate {

    var $langpair;
    var $proxy = null;

    function Google_Translate($langpair, $proxy = null) {

        $this->langpair = $langpair;
        $this->proxy = $proxy;

    }

    // Posts the text to Google and returns the raw HTML response
    function query($text) {
        $R = & new HTTP_Request('http://translate.google.com/translate_t');
        $R->setMethod('POST');

        if ( !is_null($this->proxy) ) {
            $pxy = parse_url($this->proxy);
            $R->setProxy($pxy['host'],$pxy['port'],$pxy['user'],$pxy['pass']);
        }

        // Ask for UTF-8 in both directions
        $R->addPostData('text', utf8_encode($text));
        $R->addPostData('langpair', $this->langpair);
        $R->addPostData('ie','UTF8');
        $R->addPostData('oe','UTF8');

        $res = $R->sendRequest();

        if ( PEAR::isError($res) ) {
            fwrite(STDERR, "Connection problem: ".$res->toString()."\n");
            exit(1);
        }

        if ($R->getResponseCode() != '200') {
            fwrite(STDERR, "Invalid HTTP Status: ".$R->getResponseCode()."\n");
            exit(1);
        }

        return $R->getResponseBody();
    }

    function translate($text) {
        $P = & new Google_Translate_Parser();
        return $P->parse($this->query($text));
    }
}

//------------------------------------------------------------------------------
class Google_Translate_Parser {

    var $inResult = FALSE;
    var $result = '';

    // The translation comes back inside <textarea name="q">
    function open($p, $tag, $attrs) {
        if ( $tag == 'textarea' && isset($attrs['name']) && $attrs['name'] == 'q' ) {
            $this->inResult = TRUE;
        }
    }

    function close($p, $tag) {
        if ( $this->inResult && $tag == 'textarea' ) {
            $this->inResult = FALSE;
        }
    }

    // Collect character data only while inside the result textarea
    function data($p, $data) {
        if ( $this->inResult ) {
            $this->result .= $data;
        }
    }

    function parse($html) {

        $P = & new XML_HTMLSax3();
        $P->set_object($this);
        $P->set_element_handler('open','close');
        $P->set_data_handler('data');
        $P->parse($html);

        return utf8_decode($this->result);
    }
}

//------------------------------------------------------------------------------
function usage() {
    $usage = <<<EOD
Usage: ./gtrans.php [ OPTIONS ] [ Text to translate ]

Translates input string via Google's language tools.

    -l LANGPAIR   Language pair, from-to, e.g. de-en
    -p PROXY      Proxy server URL (optional)
    -h            Display usage

EOD;
    fwrite(STDOUT,$usage);
    exit(0);
}

//------------------------------------------------------------------------------
$args = Console_Getopt::readPHPArgv();

if ( PEAR::isError($args) ) {
    fwrite(STDERR,$args->getMessage()."\n");
    exit(1);
}

// getOpt() expects the script name as the first argument and drops it;
// getOpt2() expects the arguments only
if ( realpath($_SERVER['argv'][0]) == __FILE__ ) {
    $options = Console_Getopt::getOpt($args,'hl:p:');
} else {
    $options = Console_Getopt::getOpt2($args,'hl:p:');
}

if ( PEAR::isError($options) ) {
    fwrite(STDERR,$options->getMessage()."\n");
    usage();
}

$lang = null;
$proxy = null;

foreach ( $options[0] as $option ) {
    switch ( $option[0] ) {
        case 'h':
            usage();
        break;

        case 'l':
            // Google expects the pair as from|to
            $lang = str_replace('-','|',$option[1]);
        break;

        case 'p':
            $proxy = $option[1];
        break;
    }
}

// Both a language pair and some text are required
if ( is_null($lang) || !isset($options[1][0]) ) {
    usage();
}

$G = & new Google_Translate($lang, $proxy);
echo $G->translate($options[1][0])."\n";
exit(0);
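On the character set item from the to-do list: utf8_encode() and utf8_decode() only cover ISO-8859-1, so the script silently assumes a Latin-1 terminal. A rough sketch of a more general approach, using PHP's iconv() with the terminal charset passed in, could look like this. The helper names to_utf8() and from_utf8() and the $charset variable are hypothetical, and where the charset value would come from (a new CLI option, the locale) is left open.

```php
<?php
// Hypothetical helpers, not part of the script above.

// Convert text from the terminal's character set to UTF-8 before posting it.
// iconv() returns false if the input cannot be converted.
function to_utf8($text, $charset)
{
    return iconv($charset, 'UTF-8', $text);
}

// Convert the UTF-8 result back to the terminal's character set for display.
function from_utf8($text, $charset)
{
    return iconv('UTF-8', $charset, $text);
}

// Assumption: in the real script this would come from an option or the locale.
$charset = 'ISO-8859-1';

$latin1 = "Gr\xFC\xDFe";                 // "Grüße" as ISO-8859-1 bytes
$utf8   = to_utf8($latin1, $charset);    // what would be sent to Google
$back   = from_utf8($utf8, $charset);    // what would be printed locally

echo bin2hex($utf8), "\n";
```

The idea would be to call to_utf8() where query() currently calls utf8_encode(), and from_utf8() where parse() calls utf8_decode(), checking for a false return in both places.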


{ 5 comments }

Ivan January 23, 2007 at 3:52 am

Nice work.

eee May 9, 2006 at 3:34 am

There are a number of important considerations when developing a shrink-wrapped software package. Perhaps you are planning to deliver your application on installation media (CD/DVD) or via your website as a download. If so, you need to consider:

Ryan December 9, 2005 at 3:47 am

I did a similar thing with Java. It is embeddable in applications (i.e. jEdit via macros). It works with all of the Google translations as well as Excite.

http://mills.zapto.org/projects/translate

MiiJaySung September 19, 2004 at 7:13 am

Nice work. I had to do something similar a few years ago. I had all my language stuff in define()’s in a set PHP file so I found the easiest way was to just read all the constants from that file and echo them in a

block with the constant name in comments so …

define(LANG_HELLO, ‘Hello there’);

became …

Hello there

I could then use cURL to send the stuff to Google and read it back. Because of the very simple HTML format I could get away with hacking some regular expressions to put the PHP file back :-)

Those were the days when my programming knowledge was a bit naff.

ausurt September 18, 2004 at 4:04 am

Nice work.

Not to sound picky or anything, your link is ww.php.net, just missing a double you.

But yes nice work. Could come in handy.
