PHP CLI and Google Translation

by Harry Fuecks

I've just been doing some heavy translation from German to English, which led to writing a simple command line tool for translating via Google: wasting time to save time, in other words. Getting a prototype up from scratch took about 15 minutes (which sheds some light on blog spam, I guess). I have cleaned it up a bit and am dropping it here in case anyone can use it.

Two major things still need doing:

- validation of the language pair CLI option, to keep users sane

- character set conversion: right now it gets UTF-8 from Google so that XML_HTMLSax can parse it correctly (it relies on the str_ functions), but it doesn't do anything clever after that.

…and there’s probably a bug or three.

    #!/usr/local/bin/php
    <?php
    /* […] http://www.php.net/license/3_0.txt */
    //------------------------------------------------------------------------------
    require_once 'Console/Getopt.php';
    require_once 'HTTP/Request.php';
    require_once 'XML/HTMLSax3.php';
    //------------------------------------------------------------------------------
    // Posts text to Google's translation form and returns the translated string
    class Google_Translate {

        var $langpair;
        var $proxy = null;

        function Google_Translate($langpair, $proxy = null) {
            $this->langpair = $langpair;
            $this->proxy = $proxy;
        }

        // Sends the request and returns the raw HTML response body
        function query($text) {
            $R = & new HTTP_Request('http://translate.google.com/translate_t');
            $R->setMethod('POST');
            if ( !is_null($this->proxy) ) {
                $pxy = parse_url($this->proxy);
                $R->setProxy($pxy['host'],$pxy['port'],$pxy['user'],$pxy['pass']);
            }
            $R->addPostData('text', utf8_encode($text));
            $R->addPostData('langpair', $this->langpair);
            $R->addPostData('ie','UTF8');
            $R->addPostData('oe','UTF8');
            $res = $R->sendRequest();
            if ( PEAR::isError($res) ) {
                fwrite(STDERR, "Connection problem: ".$res->toString()."\n");
                exit(1);
            }
            if ($R->getResponseCode() != '200') {
                // Error messages belong on STDERR, not STDOUT
                fwrite(STDERR, "Invalid HTTP Status: ".$R->getResponseCode()."\n");
                exit(1);
            }
            return $R->getResponseBody();
        }

        function translate($text) {
            $P = & new Google_Translate_Parser();
            return $P->parse($this->query($text));
        }
    }
    //------------------------------------------------------------------------------
    // SAX handler that collects the contents of <textarea name="q">
    class Google_Translate_Parser {

        var $inResult = FALSE;
        var $result = '';

        function open($p, $tag, $attrs) {
            if ( $tag == 'textarea' && isset($attrs['name']) && $attrs['name'] == 'q' ) {
                $this->inResult = TRUE;
            }
        }

        function close($p, $tag) {
            if ( $this->inResult && $tag == 'textarea' ) {
                $this->inResult = FALSE;
            }
        }

        function data($p, $data) {
            if ( $this->inResult ) {
                $this->result .= $data;
            }
        }

        function parse($html) {
            $P = & new XML_HTMLSax3();
            $P->set_object($this);
            $P->set_element_handler('open','close');
            $P->set_data_handler('data');
            $P->parse($html);
            return utf8_decode($this->result);
        }
    }
    //------------------------------------------------------------------------------
    function usage() {
        // The original usage text and the end of this function did not survive;
        // the wording, output stream and exit code here are assumptions inferred
        // from the options parsed below.
        $usage = <<<EOD
    Usage: [options] "text to translate"
      -h           show this help
      -l FROM-TO   language pair, e.g. de-en
      -p URL       HTTP proxy, e.g. http://user:pass@host:port

    EOD;
        fwrite(STDOUT, $usage);
        exit(0);
    }

    // Assumed reconstruction: $args is read with Console_Getopt::readPHPArgv()
    $args = Console_Getopt::readPHPArgv();
    if ( PEAR::isError($args) ) {
        fwrite(STDERR,$args->getMessage()."\n");
        exit(1);
    }

    if ( realpath($_SERVER['argv'][0]) == __FILE__ ) {
        $options = Console_Getopt::getOpt($args,'hl:p:');
    } else {
        $options = Console_Getopt::getOpt2($args,'hl:p:');
    }

    if ( PEAR::isError($options) ) {
        fwrite(STDERR,$options->getMessage()."\n");
        usage();
    }

    $proxy = null;
    foreach ( $options[0] as $option ) {
        switch ( $option[0] ) {
            case 'h':
                usage();
                break;
            case 'l':
                // "de-en" on the command line becomes "de|en" for Google
                $lang = str_replace('-','|',$option[1]);
                break;
            case 'p':
                $proxy = $option[1];
                break;
        }
    }

    $G = & new Google_Translate($lang, $proxy);
    echo $G->translate($options[1][0])."\n";
    exit(0);
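The two to-dos listed above the script (validating the language pair and handling character sets) might be approached roughly as follows. This is only a sketch and not part of the original tool: the function names `normalize_langpair()`, `to_utf8()` and `from_utf8()` are hypothetical, the allow-list of pairs is purely illustrative rather than Google's actual list, and it assumes two-letter language codes and that PHP's iconv extension is available.

```php
<?php
// Hypothetical helper: checks a "-l" value such as "de-en" and returns it
// in the "de|en" form the script posts to Google, or false if it is invalid.
// The default allow-list is an illustrative assumption only.
function normalize_langpair($arg, $allowed = array('de|en', 'en|de'))
{
    // Expect exactly two lowercase two-letter codes separated by "-" or "|".
    if (!preg_match('/^([a-z]{2})[-|]([a-z]{2})\z/', $arg, $m)) {
        return false;
    }
    $pair = $m[1] . '|' . $m[2];
    return in_array($pair, $allowed, true) ? $pair : false;
}

// Hypothetical helper: converts text from the local charset to UTF-8 before
// it is posted. Falls back to the unchanged input if the conversion fails.
function to_utf8($text, $localCharset)
{
    $out = @iconv($localCharset, 'UTF-8', $text);
    return $out === false ? $text : $out;
}

// Hypothetical helper: converts the UTF-8 result back to the local charset.
// Falls back to the unchanged UTF-8 text if a character cannot be converted.
function from_utf8($text, $localCharset)
{
    $out = @iconv('UTF-8', $localCharset, $text);
    return $out === false ? $text : $out;
}
```

In the script, the `case 'l':` branch could then call `normalize_langpair($option[1])` and fall through to `usage()` when it returns false, and the `utf8_encode()` / `utf8_decode()` calls (which only handle ISO-8859-1) could be swapped for `to_utf8()` / `from_utf8()` with whatever charset the terminal uses.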

This post has 5 responses so far

  1. Nice work.

    Not to sound picky or anything, your link is ww.php.net, just missing a double you.

    But yes nice work. Could come in handy.

     
  2. Nice work. I had to do something similar a few years ago. I had all my language stuff in define()’s in a set PHP file, so I found the easiest way was to just read all the constants from that file and echo them in a block of simple HTML with the constant name in comments so …

    define(LANG_HELLO, 'Hello there');

    became …

    Hello there

    I could then use cURL to send the stuff to Google and read it back. Because of the very simple HTML format I could get away with hacking together some regular expressions to put the PHP file back :-)

    Those were the days when my programming knowledge was a bit naff.

     
  3. I did a similar thing with Java. It is embeddable in applications (i.e. jEdit via macros). It works with all of the Google translations as well as Excite.

    http://mills.zapto.org/projects/translate

     
  4. There are a number of important considerations when developing a shrink-wrapped software package. Perhaps you are planning to deliver your application on installation media (CD/DVD) or via your website as a download. If so, you need to consider:

     
  5. Nice work.

     
