<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Web bugs for job scheduling: hack or solution?</title>
	<atom:link href="http://www.sitepoint.com/blogs/2005/11/03/web-bugs-for-job-scheduling-hack-or-solution/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.sitepoint.com/blogs/2005/11/03/web-bugs-for-job-scheduling-hack-or-solution/</link>
	<description></description>
	<pubDate>Sat, 11 Oct 2008 00:51:28 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5</generator>
		<item>
		<title>By: PHPit - Totally PHP &#187; Creating a “Who’s Online” script with PHP</title>
		<link>http://www.sitepoint.com/blogs/2005/11/03/web-bugs-for-job-scheduling-hack-or-solution/#comment-11800</link>
		<dc:creator>PHPit - Totally PHP &#187; Creating a “Who’s Online” script with PHP</dc:creator>
		<pubDate>Sun, 11 Dec 2005 22:48:16 +0000</pubDate>
		<guid isPermaLink="false">http://www.sitepoint.com/blogs/?p=1283#comment-11800</guid>
		<description>[...] Our script will store a visitor's IP address and last active timestamp in a database, and update it every time the visitor requests one of our pages. The script will actually be a web bug, and run completely in the background. If you want to know more about web bugs, have a look at this SitePoint Blog entry by Harry Fuecks, but in a nutshell it's a small 1x1 image that actually executes PHP, and returns an image. So all our pages will point to a PHP file, like so: &#60;img src="whosonline.php" alt="" /&#62; [...]</description>
		<content:encoded><![CDATA[<p>[&#8230;] Our script will store a visitor&#8217;s IP address and last active timestamp in a database, and update it every time the visitor requests one of our pages. The script will actually be a web bug, and run completely in the background. If you want to know more about web bugs, have a look at this SitePoint Blog entry by Harry Fuecks, but in a nutshell it&#8217;s a small 1&#215;1 image that actually executes PHP, and returns an image. So all our pages will point to a PHP file, like so: &lt;img src="whosonline.php" alt="" /&gt; [&#8230;]</p>]]></content:encoded>
	</item>
	<item>
		<title>By: fryk</title>
		<link>http://www.sitepoint.com/blogs/2005/11/03/web-bugs-for-job-scheduling-hack-or-solution/#comment-11299</link>
		<dc:creator>fryk</dc:creator>
		<pubDate>Wed, 30 Nov 2005 11:30:19 +0000</pubDate>
		<guid isPermaLink="false">http://www.sitepoint.com/blogs/?p=1283#comment-11299</guid>
		<description>Great article. But it doesn't work for redirects. Does anyone know how to send a redirect and continue script execution?

The code below does not work. The redirect only happens after 3 seconds:

&lt;code&gt;ob_implicit_flush(TRUE); 
@ignore_user_abort(true); 

header("HTTP/1.1 301 Moved Permanently");
header('Content-Length: 0'); 
header('Location: http://example.com'); 
header('Connection: Close'); 

ob_implicit_flush(FALSE); 
ob_start();
		
sleep(3);

// some code

&lt;/code&gt;</description>
		<content:encoded><![CDATA[<p>Great article. But it doesn&#8217;t work for redirects. Does anyone know how to send a redirect and continue script execution?</p>
<p>The code below does not work. The redirect only happens after 3 seconds:</p>
<code>ob_implicit_flush(TRUE); 
@ignore_user_abort(true); 

header("HTTP/1.1 301 Moved Permanently");
header('Content-Length: 0'); 
header('Location: http://example.com'); 
header('Connection: Close'); 

ob_implicit_flush(FALSE); 
ob_start();
		
sleep(3);

// some code

</code>]]></content:encoded>
	</item>
	<item>
		<title>By: dumky</title>
		<link>http://www.sitepoint.com/blogs/2005/11/03/web-bugs-for-job-scheduling-hack-or-solution/#comment-10571</link>
		<dc:creator>dumky</dc:creator>
		<pubDate>Mon, 07 Nov 2005 23:01:49 +0000</pubDate>
		<guid isPermaLink="false">http://www.sitepoint.com/blogs/?p=1283#comment-10571</guid>
		<description>Maarten, using PHP as a CLI scripting language for cron jobs does sound like it would help unify the development environment. 

But cron jobs run in a separate process. That means that any result from the job won't be directly available in memory for new requests being handled; you have to come up with some kind of inter-process communication solution, be it the filesystem or something else...
Also, scheduled jobs create an additional deployment requirement.

One specific scenario where I would have needed a good solution for background threads was generating and refreshing a cache of CAPTCHA images. The cache is in-memory to allow more throughput. Using a background thread to generate new images, without saving them to file, allows for a more integrated solution (fewer boundaries, no need for ACLing directories).</description>
		<content:encoded><![CDATA[<p>Maarten, using PHP as a CLI scripting language for cron jobs does sound like it would help unify the development environment. </p>
<p>But cron jobs run in a separate process. That means that any result from the job won&#8217;t be directly available in memory for new requests being handled; you have to come up with some kind of inter-process communication solution, be it the filesystem or something else&#8230;<br />
Also, scheduled jobs create an additional deployment requirement.</p>
<p>One specific scenario where I would have needed a good solution for background threads was generating and refreshing a cache of CAPTCHA images. The cache is in-memory to allow more throughput. Using a background thread to generate new images, without saving them to file, allows for a more integrated solution (fewer boundaries, no need for ACLing directories).</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Maarten Manders</title>
		<link>http://www.sitepoint.com/blogs/2005/11/03/web-bugs-for-job-scheduling-hack-or-solution/#comment-10564</link>
		<dc:creator>Maarten Manders</dc:creator>
		<pubDate>Mon, 07 Nov 2005 18:01:21 +0000</pubDate>
		<guid isPermaLink="false">http://www.sitepoint.com/blogs/?p=1283#comment-10564</guid>
		<description>Dumky, you can use cron jobs that execute PHP command line interface (CLI) scripts. It's a common way to solve those problems.</description>
		<content:encoded><![CDATA[<p>Dumky, you can use cron jobs that execute PHP command line interface (CLI) scripts. It&#8217;s a common way to solve those problems.</p>]]></content:encoded>
	</item>
	<item>
		<title>By: dumky</title>
		<link>http://www.sitepoint.com/blogs/2005/11/03/web-bugs-for-job-scheduling-hack-or-solution/#comment-10563</link>
		<dc:creator>dumky</dc:creator>
		<pubDate>Mon, 07 Nov 2005 17:38:22 +0000</pubDate>
		<guid isPermaLink="false">http://www.sitepoint.com/blogs/?p=1283#comment-10563</guid>
		<description>I've needed to have a background thread running in my web apps a number of times. Why not support this functionality in the web server?</description>
		<content:encoded><![CDATA[<p>I&#8217;ve needed to have a background thread running in my web apps a number of times. Why not support this functionality in the web server?</p>]]></content:encoded>
	</item>
	<item>
		<title>By: Maarten Manders</title>
		<link>http://www.sitepoint.com/blogs/2005/11/03/web-bugs-for-job-scheduling-hack-or-solution/#comment-10546</link>
		<dc:creator>Maarten Manders</dc:creator>
		<pubDate>Mon, 07 Nov 2005 08:33:30 +0000</pubDate>
		<guid isPermaLink="false">http://www.sitepoint.com/blogs/?p=1283#comment-10546</guid>
		<description>Great article, welcome back Harry!</description>
		<content:encoded><![CDATA[<p>Great article, welcome back Harry!</p>]]></content:encoded>
	</item>
	<item>
		<title>By: HarryF</title>
		<link>http://www.sitepoint.com/blogs/2005/11/03/web-bugs-for-job-scheduling-hack-or-solution/#comment-10491</link>
		<dc:creator>HarryF</dc:creator>
		<pubDate>Fri, 04 Nov 2005 14:01:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.sitepoint.com/blogs/?p=1283#comment-10491</guid>
		<description>&lt;blockquote&gt;
So we are comparing apples to oranges if what you are after is a system that responds to events rather than set times.
&lt;/blockquote&gt;

Agreed.

&lt;blockquote&gt;
Why wouldn’t the indexer be triggered when the process of saving the updates to the wiki occurs?
&lt;/blockquote&gt;

Technically what you're suggesting is doable - in the script that accepts the update, you could hang up the browser in the same way as above then start indexing.

But I think the main thing here is whether updates then become the &lt;em&gt;only&lt;/em&gt; way the indexes are refreshed. Andi has employed the simplest solution to avoid race conditions, with the rule that only one indexer may run at a time. But what if someone updates a page while the indexer is already running? Then you need some other mechanism to refresh the index later. And you also want to be able to reindex in case of corruption / data loss. I think the web bug approach makes the solution a lot simpler.</description>
		<content:encoded><![CDATA[<blockquote><p>
So we are comparing apples to oranges if what you are after is a system that responds to events rather than set times.
</p></blockquote>
<p>Agreed.</p>
<blockquote><p>
Why wouldn’t the indexer be triggered when the process of saving the updates to the wiki occurs?
</p></blockquote>
<p>Technically what you&#8217;re suggesting is doable - in the script that accepts the update, you could hang up the browser in the same way as above then start indexing.</p>
<p>But I think the main thing here is whether updates then become the <em>only</em> way the indexes are refreshed. Andi has employed the simplest solution to avoid race conditions, with the rule that only one indexer may run at a time. But what if someone updates a page while the indexer is already running? Then you need some other mechanism to refresh the index later. And you also want to be able to reindex in case of corruption / data loss. I think the web bug approach makes the solution a lot simpler.</p>]]></content:encoded>
	</item>
	<item>
		<title>By: shea</title>
		<link>http://www.sitepoint.com/blogs/2005/11/03/web-bugs-for-job-scheduling-hack-or-solution/#comment-10488</link>
		<dc:creator>shea</dc:creator>
		<pubDate>Fri, 04 Nov 2005 12:34:37 +0000</pubDate>
		<guid isPermaLink="false">http://www.sitepoint.com/blogs/?p=1283#comment-10488</guid>
		<description>Ahh well the purpose of a cron job, which I'm sure you're aware of, is to run a specified job at set intervals.  So we are comparing apples to oranges if what you are after is a system that responds to events rather than set times.  This web bug trick would indeed be best for the latter.

However, I'm not sure I quite get why the indexer needs to be triggered by the web bug trick.  Why wouldn't the indexer be triggered when the process of saving the updates to the wiki occurs?</description>
		<content:encoded><![CDATA[<p>Ahh well the purpose of a cron job, which I&#8217;m sure you&#8217;re aware of, is to run a specified job at set intervals.  So we are comparing apples to oranges if what you are after is a system that responds to events rather than set times.  This web bug trick would indeed be best for the latter.</p>
<p>However, I&#8217;m not sure I quite get why the indexer needs to be triggered by the web bug trick.  Why wouldn&#8217;t the indexer be triggered when the process of saving the updates to the wiki occurs?</p>]]></content:encoded>
	</item>
	<item>
		<title>By: HarryF</title>
		<link>http://www.sitepoint.com/blogs/2005/11/03/web-bugs-for-job-scheduling-hack-or-solution/#comment-10485</link>
		<dc:creator>HarryF</dc:creator>
		<pubDate>Fri, 04 Nov 2005 11:16:10 +0000</pubDate>
		<guid isPermaLink="false">http://www.sitepoint.com/blogs/?p=1283#comment-10485</guid>
		<description>&lt;blockquote&gt;
No, why would you not use cron when you have the option to? Right tool for the job yeah? However, if the question was—would you use this “dodgy hack” if cron wasn’t available? My answer would be a (obviously) resounding yes.
&lt;/blockquote&gt;

Agreed for general task scheduling, but what about the argument that the web bug approach is better integrated with the application than cron can be? If you want something that's triggered by events happening within your application (such as content being updated via a form), a web bug has a better chance of being able to respond directly to the event (although it can't be relied on! - lynx users or those still surfing with images disabled, for example, would be a problem).

In other words, I think there is a class of problems that can be better solved this way than with cron, such as the Dokuwiki indexer. It's not a clear distinction, but the way Dokuwiki's indexer works, popular wiki pages have a better chance of getting re-indexed if they change (in fact they'd often be re-indexed immediately after editing, if no other indexer is running). By indexing only one page at a time the overhead is kept reasonably spread. The alternative with cron would likely be something that has to make complete sweeps of all pages and index those that have changed. Each time that job runs it could result in a serious resource hit. Implementing a smarter solution with cron, which spreads the load, would probably turn out more complex than using a web bug.

Should also have mentioned that PHP's session garbage collector works on a similar basis - it's incoming requests that fire the garbage collector. If you have no visitors, the session GC won't run (so expired sessions will still be hanging around).

Also, along the lines of George's tip regarding images, given an environment you control, if you ran the web bug under a separate server, like &lt;a href="http://halplant.com:88/server/thttpd_FAQ.html#PHP" rel="nofollow"&gt;thttpd&lt;/a&gt; or even &lt;a href="http://nanoweb.si.kz/" rel="nofollow"&gt;nanoweb&lt;/a&gt;, on a subdomain, you're no longer blocking Apache children.

One other thought (dare I say it?) - this could also work well with AJAX, especially if you need to pass values to the web bug. That also sounds like a legitimate use of AJAX...</description>
		<content:encoded><![CDATA[<blockquote><p>
No, why would you not use cron when you have the option to? Right tool for the job yeah? However, if the question was—would you use this “dodgy hack” if cron wasn’t available? My answer would be a (obviously) resounding yes.
</p></blockquote>
<p>Agreed for general task scheduling, but what about the argument that the web bug approach is better integrated with the application than cron can be? If you want something that&#8217;s triggered by events happening within your application (such as content being updated via a form), a web bug has a better chance of being able to respond directly to the event (although it can&#8217;t be relied on! - lynx users or those still surfing with images disabled, for example, would be a problem).</p>
<p>In other words, I think there is a class of problems that can be better solved this way than with cron, such as the Dokuwiki indexer. It&#8217;s not a clear distinction, but the way Dokuwiki&#8217;s indexer works, popular wiki pages have a better chance of getting re-indexed if they change (in fact they&#8217;d often be re-indexed immediately after editing, if no other indexer is running). By indexing only one page at a time the overhead is kept reasonably spread. The alternative with cron would likely be something that has to make complete sweeps of all pages and index those that have changed. Each time that job runs it could result in a serious resource hit. Implementing a smarter solution with cron, which spreads the load, would probably turn out more complex than using a web bug.</p>
<p>Should also have mentioned that PHP&#8217;s session garbage collector works on a similar basis - it&#8217;s incoming requests that fire the garbage collector. If you have no visitors, the session GC won&#8217;t run (so expired sessions will still be hanging around).</p>
<p>Also, along the lines of George&#8217;s tip regarding images, given an environment you control, if you ran the web bug under a separate server, like <a href="http://halplant.com:88/server/thttpd_FAQ.html#PHP" rel="nofollow">thttpd</a> or even <a href="http://nanoweb.si.kz/" rel="nofollow">nanoweb</a>, on a subdomain, you&#8217;re no longer blocking Apache children.</p>
<p>One other thought (dare I say it?) - this could also work well with AJAX, especially if you need to pass values to the web bug. That also sounds like a legitimate use of AJAX&#8230;</p>]]></content:encoded>
	</item>
</channel>
</rss>
