Catching click on a link

Hi folks.

I’m looking for reminders on how to …

  1. Get all links in a page,

  2. apply an onclick event handler

  3. that cancels the usual link click event,

  4. and puts the link address into a form element.

1 =

	var links = document.getElementsByTagName('A');

2 =

	for (var i=0; i<links.length; i++) {
		links[i].onclick = setLink (links[i]);
	}

3 = ??? cancelEvent? (Should be added to the function in 4 I guess)

4 =

function setLink (lnk) {
	document.getElementById('scrapetarget').value = lnk;
}

And I have initialisation code of

function addLoadEvent(func) {
    var oldonload = window.onload;
    if (typeof window.onload != 'function') {
        window.onload = func;
    } else {
        window.onload = function() {
            oldonload();
            func();
        }
    }
}

addLoadEvent (doLinks);

Wrapping it all together …

<script type="text/javascript">
function addLoadEvent(func) {
    var oldonload = window.onload;
    if (typeof window.onload != 'function') {
        window.onload = func;
    } else {
        window.onload = function() {
            oldonload();
            func();
        }
    }
}

function setLink (lnk) {
	document.getElementById('scrapetarget').value = lnk;
}

function doLinks() {
	var links = document.getElementsByTagName('A');
	for (var i=0; i<links.length; i++) {
		links[i].onclick = setLink (links[i]);
	}
}

addLoadEvent (doLinks);
</script>

What’s behind the question is a screen-scraper test I’m doing: I call a website address via a form, and clicking any link on the returned page should put the target address into the form and then submit it. I think I’m on the right track - it’s just cancelling the original click event that I’m not sure about.

TIA. :slight_smile:

To cancel the usual behaviour, you’re looking for

return false;

in the onclick handler.

Adding to what Stomme said, I’d use this as it’s neater, with a small improvement to the for loop: links.length is cached in j rather than looked up on every iteration. Note that setLink is assigned without parentheses, so it’s stored as the handler rather than being called immediately. Finally, your setLink function will work, but for consistency’s sake I would actually access the href attribute directly:

function setLink () {
	document.getElementById('scrapetarget').value = this.href; // this is the element clicked
	return false; // cancel default action
}

function doLinks() {
	var links = document.getElementsByTagName('A');
	for (var i=0, j = links.length; i < j; i++) {
		links[i].onclick = setLink;
	}
}

Thanks guys.

This test requires the target link to be entered into a form, which is then automatically submitted. The form uses POST - if GET is used (such as passing the target link via a query string), the firewall catches the keywords in the link and blocks the target; using POST circumvents this. :wink:
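To make that difference concrete, here’s a small sketch (scrape.php and the keyword are made up). With GET the target is urlencoded into the URL itself, which is exactly what the firewall inspects; with POST the same name/value pair travels in the request body instead:

```php
<?php
// Sketch: the same target sent via GET vs POST.
$target = 'http://example.com/blocked-keyword/page.html';

// GET: the target ends up in the query string, visible in the URL
$getUrl = 'scrape.php?' . http_build_query(array('target' => $target));
echo $getUrl . "\n";
// scrape.php?target=http%3A%2F%2Fexample.com%2Fblocked-keyword%2Fpage.html

// POST: the identical pair becomes the request body; the URL stays clean
$postBody = http_build_query(array('target' => $target));
echo 'scrape.php with body: ' . $postBody . "\n";
```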

That’s why I’m not sure about return false … I have to submit the form before I can return, so I’m not sure whether I’ll get a race condition - will the form submit before the link’s default click action tries to fire?

I’ll have a play. :slight_smile:

Tried to edit my post but I’d exceeded the time limit. :frowning:

Update: it works fine - I just need to tweak my PHP code now to handle various link structures.

Thanks again. :slight_smile:

All complete and it appears to be working fine (within reason - it doesn’t work for https). Thanks once again. :slight_smile:

Here’s the complete code if anyone’s interested.

<?php
//Screen scrape test
$target = isset($_POST['target']) ? $_POST['target'] : ''; //avoid a notice on first load
if (strlen ($target) > 5) {
	//Prepend a scheme if none was given ('ttp://' matches 'http://' but
	//not 'https://', which is why https targets don't work)
	if (strpos ($target, 'ttp://') === false) $target = 'http://' . $target;

	//$domain = the directory of the current page: everything up to and
	//including the last slash, used below to resolve relative paths
	$slash = strrpos ($target, '/') + 1;
	if ($slash < 9) $slash = strlen ($target); //no slash beyond the scheme
	$domain = substr ($target, 0, $slash);
	if (strrpos ($domain, '/') != strlen ($domain) - 1) $domain .= '/';

	//$root = the site root: everything up to the first slash after the scheme
	$slash = strpos ($target, '/', 8) + 1;
	if ($slash == 1) $slash = strlen ($target);
	$root = substr ($target, 0, $slash);
	if (strrpos ($root, '/') != strlen ($root) - 1) $root .= '/';

	$out = file_get_contents ($target, false);
	if ($out !== false) {
		//Find local links and insert the current domain: tag every href
		//with a ~~~ sentinel, untag those that are already absolute
		//(http:// or mailto:), then expand the rest with $domain
		$out = str_ireplace ('href="', 'href="~~~', $out);
		$out = str_ireplace ('href="~~~http://', 'href="http://', $out);
		$out = str_ireplace ('href="~~~mailto:', 'href="mailto:', $out);
		$out = str_ireplace ('href="~~~', 'href="'.$domain, $out);

		//Fix local image paths the same way
		$out = str_ireplace ('src="', 'src="~~~', $out);
		$out = str_ireplace ('src="~~~http://', 'src="http://', $out);
		$out = str_ireplace ('src="~~~', 'src="'.$domain, $out);

		echo $out;
	}
}
?>
<script type="text/javascript">
function addLoadEvent(func) {
    var oldonload = window.onload;
    if (typeof window.onload != 'function') {
        window.onload = func;
    } else {
        window.onload = function() {
            oldonload();
            func();
        }
    }
}

function setLink () {
    document.getElementById('scrapetarget').value = this.href; // this is the element clicked
	document.scrapeform.submit();
    return false; // cancel default action
}

function doLinks() {
    var links = document.getElementsByTagName('A');
    for (var i=0, j = links.length; i < j; i++) {
        links[i].onclick = setLink;
    }
}

addLoadEvent (doLinks);
</script>
<?php
echo '<hr>';
echo '<form name="scrapeform" method="post" action=""><input id="scrapetarget" type="text" name="target" size="60" value="'.htmlspecialchars($target).'"></form>';
?>
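As an aside for anyone puzzling over the substr/strrpos arithmetic: the same $domain (directory of the current page) and $root (site root) values can be derived with PHP’s built-in parse_url. This is just a sketch of an equivalent, not the code the test actually ran - scrapeBase is a made-up name, and like the original it assumes http targets:

```php
<?php
// Sketch: derive the page directory and site root from a target URL
// using parse_url instead of manual offset arithmetic.
function scrapeBase($target) {
	// Prepend a scheme if none was given, as the scraper does
	if (strpos($target, 'ttp://') === false) $target = 'http://' . $target;
	$parts = parse_url($target);
	$path = isset($parts['path']) ? $parts['path'] : '/';
	// Directory of the current page: everything up to and including the
	// last slash, used to resolve relative hrefs like "page2.html"
	$dir = substr($path, 0, strrpos($path, '/') + 1);
	$domain = $parts['scheme'] . '://' . $parts['host'] . $dir;
	// Site root: scheme + host with a trailing slash
	$root = $parts['scheme'] . '://' . $parts['host'] . '/';
	return array($domain, $root);
}

list($domain, $root) = scrapeBase('example.com/sub/page.html');
echo $domain . "\n"; // http://example.com/sub/
echo $root . "\n";   // http://example.com/
```

Edge cases like a bare host with no path fall out naturally, since parse_url simply omits the path component.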
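And for anyone skimming past the str_ireplace block: the ~~~ is just a throwaway sentinel. Every href gets tagged, the ones that are already absolute (http:// or mailto:) get untagged, and whatever still carries the sentinel must be relative, so it’s expanded with the page directory. Here’s the same idea pulled out into a stand-alone sketch (rewriteLinks is a made-up name):

```php
<?php
// Sketch of the sentinel trick used above to absolutise relative links.
function rewriteLinks($html, $domain) {
	$html = str_ireplace('href="', 'href="~~~', $html);               // tag every href
	$html = str_ireplace('href="~~~http://', 'href="http://', $html); // untag absolute links
	$html = str_ireplace('href="~~~mailto:', 'href="mailto:', $html); // untag mailto links
	return str_ireplace('href="~~~', 'href="' . $domain, $html);      // expand the rest
}

echo rewriteLinks('<a href="page2.html">next</a>', 'http://example.com/sub/');
// <a href="http://example.com/sub/page2.html">next</a>
```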
