
PHP MapReduce

By Harry Fuecks




So after my initial concern over the impact of this, I've figured out at last what Google is trying to tell us: we’ve got a huge cluster right there at our disposal!

So I spent the night hacking together PHP MapReduce – the master node, which you run on your server, uses this search to locate victims… errr… workers to participate in the cluster. You then write some code like:

require_once 'mapreduce.php';

$veryLargeFile = '/tmp/bigfile';
$map = '';
$reduce = '';

# Massively distributed computing, here we come...
$result = MapReduce($veryLargeFile, $map, $reduce);
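
For anyone wondering what the map and reduce halves would actually look like, here is a minimal single-process sketch. It is not the mapreduce.php from above: localMapReduce() is a hypothetical stand-in that runs everything on one machine, and it assumes the map and reduce steps are passed as PHP callables and the input as an array of lines instead of a file path. It counts words, the usual MapReduce demo.

```php
<?php
// Hypothetical local stand-in for illustration only; no cluster involved.
function localMapReduce(iterable $lines, callable $map, callable $reduce): array
{
    $groups = [];
    foreach ($lines as $line) {
        // Map phase: each input line yields a list of [key, value] pairs,
        // which are grouped by key as they arrive.
        foreach ($map($line) as [$key, $value]) {
            $groups[$key][] = $value;
        }
    }

    $result = [];
    foreach ($groups as $key => $values) {
        // Reduce phase: collapse all values collected for one key.
        $result[$key] = $reduce((string) $key, $values);
    }
    return $result;
}

// Map: emit [word, 1] for every word on the line.
$map = function (string $line): array {
    $pairs = [];
    foreach (str_word_count(strtolower($line), 1) as $word) {
        $pairs[] = [$word, 1];
    }
    return $pairs;
};

// Reduce: sum the counts emitted for one word.
$reduce = function (string $key, array $values): int {
    return array_sum($values);
};

$counts = localMapReduce(['the cat', 'the hat'], $map, $reduce);
// $counts is ['the' => 2, 'cat' => 1, 'hat' => 1]
```

In a real distributed setup the map calls would be farmed out to workers and the grouped values shipped to reducers; here both phases are plain loops.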

At the moment it’s limited to PHP-only execution on the workers, so that’s a fairly small cluster. But I'm working on extending it so that your map and reduce functions are automatically translated into MySQL stored procedures, allowing this search to significantly expand the cluster (thanks Ilia). And with help from adodb I think it should be possible to make this DB independent.

But where this gets really interesting is when you consider this search. Now this is a lot harder to implement, but it should be possible to invite browsers to join the cluster as well, dramatically increasing your processing power. The workflow would be something like master => worker server => worker browser => (via AJAX back to) => worker server => master.

We’ve entered the real age of distributed computing folks. Think of the wonderful things we could do with this, such as the biggest blog spam filter ever!

This is a JOKE btw!

…and probably a bad one. It’s not April but anyway. And I’m not working on this. And I never will be.

I think it might be a good idea for Google to allow people to restrict the search to a single domain, so people can at least see what’s on their own site and clean up as needed.

Harry Fuecks is the Engineering Project Lead at Tamedia and formerly the Head of Engineering at Squirro. He is a data-driven facilitator, leader, coach and specializes in line management, hiring software engineers, analytics, mobile, and marketing. Harry also enjoys writing and you can read his articles on SitePoint and Medium.
