  1. #1
    SitePoint Member
    Join Date
    Jun 2009
    Posts
    10
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    PHP And Redirect Question

    What I am trying to do:

    URL-X redirects to URL-Y.
    These URLs are on external sites that I do not own.

    How can PHP be used to output something like "URL-X redirects to URL-Y"?

    Example for clarification:

    1. redirect.com/redirect.php redirects to example.com/somepage.php.
    2. When PHP is given the URL redirect.com/redirect.php as input,
    how can it output "redirect.com/redirect.php redirects to example.com/somepage.php"?

    Appreciate your answers

  2. #2
    SitePoint Zealot adam.jimenez's Avatar
    Join Date
    May 2009
    Location
    Ware, UK
    Posts
    136
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

  3. #3
    SitePoint Member
    Join Date
    Jun 2009
    Posts
    10
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by adam.jimenez View Post
    why not use htaccess instead?
    That is not an option in my case. The redirects themselves are on external sites which I don't own or control. I just want to use PHP to find the target URL.

    Say, you have an affiliate link on one site that will redirect to a product on another site. Neither of these sites belongs to me. I want php to output the target url for the affiliate link.

    The php code will then be something like this, I guess:

    $redirect_link = "[affiliate url here]"

    $target_url = [ php code to find the url which $redirect_link redirects to ]

    So, the thing I want to do with php is to find the $target_url. Any suggestions?

  4. #4
    SitePoint Zealot adam.jimenez's Avatar
    Join Date
    May 2009
    Location
    Ware, UK
    Posts
    136
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by terrwacos View Post
    That is not an option in my case. The redirects themselves are on external sites which I don't own or control. I just want to use PHP to find the target URL.

    Say, you have an affiliate link on one site that will redirect to a product on another site. Neither of these sites belongs to me. I want php to output the target url for the affiliate link.

    The php code will then be something like this, I guess:

    $redirect_link = "[affiliate url here]"

    $target_url = [ php code to find the url which $redirect_link redirects to ]

    So, the thing I want to do with php is to find the $target_url. Any suggestions?
    I think I know what you mean now.
    Maybe if you do an fopen on the redirect link you could read the headers.
    http://uk2.php.net/manual/en/function.fopen.php

    That is, of course, only if they are redirecting using HTTP headers and not meta tags or some other method.

  5. #5
    SitePoint Member
    Join Date
    Jun 2009
    Posts
    10
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by adam.jimenez View Post
    Maybe if you do an fopen on the redirect link you could read the headers.
    Thanks for the idea. I tried:

    Code:
    print fopen($affiliate_link, "r");
    and the output I got was:

    Code:
    Resource id #5
    Any suggestions on what I could do next? (and yes I am a php n00b)

  6. #6
    SitePoint Zealot adam.jimenez's Avatar
    Join Date
    May 2009
    Location
    Ware, UK
    Posts
    136
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    ok try this - replace the host and path as appropriate:

    PHP Code:
    <?php
    $redirect = ''; // stays empty if no Location header is found
    $fp = fsockopen("www.example.com", 80, $errno, $errstr, 30);
    if (!$fp) {
        echo "$errstr ($errno)<br />\n";
    } else {
        $out = "GET /filename.php HTTP/1.1\r\n";
        $out .= "Host: www.example.com\r\n";
        $out .= "Connection: Close\r\n\r\n";
        fwrite($fp, $out);
        while (!feof($fp)) {
            $line = fgets($fp, 1024);

            // The redirect target is sent in the "Location:" response header
            if (stristr($line, "location:")) {
                $redirect = preg_replace("/location:/i", "", $line);
                break;
            }
        }
    }

    print $redirect;
    ?>

  7. #7
    SitePoint Member
    Join Date
    Jun 2009
    Posts
    10
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks a lot! That worked

    One small step left though: your code produces a long string. For simplicity, let's say it is some random text, then the target URL, and then more random text.

    CdsafVBsadgYafDgUagIKdsagJsaSdgAasdNBDHS5342534ADVJ%2Fwww example c0m%3Fflkjahsjf897324982374183hljksadfhkhfsadkl

    The target URL needs to be extracted. For reference purposes, maybe someone more PHP-savvy than me could provide an answer on how to extract "www example c0m" from the string above?

    (PS: had to edit the dot com, "unfortunately you don't yet have enough posts under your belt to post links")

  8. #8
    SitePoint Zealot adam.jimenez's Avatar
    Join Date
    May 2009
    Location
    Ware, UK
    Posts
    136
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by terrwacos View Post
    Thanks a lot! That worked

    One small step left though: your code produces a long string. For simplicity, let's say it is some random text, then the target URL, and then more random text.

    CdsafVBsadgYafDgUagIKdsagJsaSdgAasdNBDHS5342534ADVJ%2Fwww example c0m%3Fflkjahsjf897324982374183hljksadfhkhfsadkl

    The target URL needs to be extracted. For reference purposes, maybe someone more PHP-savvy than me could provide an answer on how to extract "www example c0m" from the string above?

    (PS: had to edit the dot com, "unfortunately you don't yet have enough posts under your belt to post links")

    If it's really random, you'd need to use a regex.

    See preg_match:
    http://uk2.php.net/preg_match
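A minimal sketch of the preg_match idea, assuming the target host sits URL-encoded inside the string, as in the sample above (`%2F` before the host, `%3F` after it). The function name `extract_host` and the sample input are made up for illustration, and the pattern only covers plain ASCII hostnames.

```php
<?php
// Hypothetical helper: pull the first hostname out of a string that embeds
// a URL-encoded address, e.g. "...%2Fwww.example.com%3F...".
function extract_host(string $raw): ?string
{
    // Turn %2F, %3F, etc. back into "/", "?" so the URL parts are visible.
    $decoded = urldecode($raw);

    // A slash followed by dot-separated labels ending in a TLD of 2+ letters.
    if (preg_match('~/((?:[a-z0-9-]+\.)+[a-z]{2,})~i', $decoded, $m)) {
        return $m[1];
    }
    return null; // nothing that looks like a hostname was found
}

// Sample input is invented, modelled on the string quoted in the thread.
echo extract_host('CdsafVB5342534ADVJ%2Fwww.example.com%3Fflkjahsjf8973'), "\n";
```

If the surrounding text can itself contain slashes and dots, the pattern would need to be tightened to whatever delimiters the real strings use.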

  9. #9
    SitePoint Member
    Join Date
    Jun 2009
    Posts
    10
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I created a function using adam.jimenez's code, but there seems to be a problem when I use this many times over:

    Code:
    function geturl($redirect) {
    	$fp = fsockopen("$redirect", 80, $errno, $errstr, 30); 
    	if (!$fp) { 
    		// echo "$errstr ($errno)<br />\n"; 
    	} else { 
    		$out = "GET /index.php HTTP/1.1\r\n"; 
    		$out .= "Host: ".$redirect."\r\n"; 
    		$out .= "Connection: Close\r\n\r\n"; 
    		fwrite($fp, $out); 
    		while (!feof($fp)) { 
    			$line=fgets($fp, 1024); 
    			 
    			if (stristr($line,"location:")) { 
    				$url=preg_replace("/location:/i","",$line); 
    				break; 
    			} 
    		} 
    	}
    	
    	// Then some code to extract the www.url.tld part from header info, this works fine.
    
    }
    For the first 1000 URLs this worked great, but then I started getting error messages:

    Code:
    Warning: fsockopen() [function.fsockopen]: unable to connect to [_____redirect_link_here_____]:80 (Connection timed out) in [____location_to_php_file_where_function_is_stored_____] on line 6
    Connection timed out (110)
    The links themselves are not the problem, as they all have the same format, and they redirect if I simply paste the URL into my browser's address bar.

    I have tried portioning the requests by doing only 1 or 10 at a time, but it didn't help. The function does not seem to work as well for big operations, and I need to find the target URLs of about 40,000 redirect URLs.

    Any suggestions on why this happens? Are there restrictions on my host, or is it the external site (the host of the redirect URLs) that is the problem? Or might it be the code itself?

    Appreciate your answers as always

  10. #10
    SitePoint Member
    Join Date
    Jun 2009
    Posts
    10
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    thanks a lot adam.jimenez, you've been a great help

  11. #11
    SitePoint Zealot adam.jimenez's Avatar
    Join Date
    May 2009
    Location
    Ware, UK
    Posts
    136
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by terrwacos View Post
    thanks a lot adam.jimenez, you've been a great help
    you're welcome

  12. #12
    SitePoint Wizard silver trophybronze trophy Stormrider's Avatar
    Join Date
    Sep 2006
    Location
    Nottingham, UK
    Posts
    3,133
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Why are you using preg_replace for a search? That will remove the 'location:' text but nothing else, and it would likely be header('Location: url'); at least.

  13. #13
    SitePoint Zealot adam.jimenez's Avatar
    Join Date
    May 2009
    Location
    Ware, UK
    Posts
    136
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Stormrider View Post
    Why are you using preg_replace for a search? That will remove the 'location:' text but nothing else, and it would likely be header('Location: url'); at least.
    Uh, nope. We are reading HTTP headers, not PHP code, so there is no header().

  14. #14
    SitePoint Wizard silver trophybronze trophy Stormrider's Avatar
    Join Date
    Sep 2006
    Location
    Nottingham, UK
    Posts
    3,133
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Ah OK, I thought you were opening the PHP file itself and looking for the header(location) call. It would probably still be better to use preg_match though: you aren't just removing location:, you want to match the URL after it.
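A minimal sketch of that suggestion, assuming the response head has already been read into one string (for example by collecting the fgets() lines from the socket). The name `find_location` and the sample response are hypothetical.

```php
<?php
// Hypothetical helper: return the value of the first Location header found
// in a raw HTTP response head, or null if there is none.
function find_location(string $rawHeaders): ?string
{
    // "m" lets ^ match at the start of each header line, "i" ignores case.
    // The capture group takes everything up to the next whitespace (the CR).
    if (preg_match('/^Location:[ \t]*(\S+)/mi', $rawHeaders, $m)) {
        return $m[1];
    }
    return null;
}

// Invented response head for illustration.
$response = "HTTP/1.1 302 Found\r\n"
          . "Location: http://example.com/somepage.php\r\n"
          . "Connection: close\r\n\r\n";

echo find_location($response), "\n";
```

Anchoring the pattern to the start of a line means a header such as Content-Location is not picked up by mistake, which a plain substring search for "location:" would do.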

  15. #15
    SitePoint Wizard
    Join Date
    Mar 2008
    Posts
    1,149
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    You should send HEAD instead of GET and properly parse the headers.

    There are also other ways to do it (besides fsockopen), like with PHP's streams or CURL.

    Unfortunately, I have to dash, so I can't provide code at the moment.

    Edit:

    How to do it with streams (easy): http://php.net/manual/en/wrappers.http.php
    (first code example)
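A rough sketch of the streams approach with a HEAD request, assuming the redirect is sent as a Location header. The helper names `location_from_lines` and `redirect_target` are hypothetical, the 10-second timeout is an arbitrary choice, and some servers may answer HEAD differently from GET.

```php
<?php
// Hypothetical helper: scan header lines (as found in the stream's
// "wrapper_data") for a Location header.
function location_from_lines(array $lines): ?string
{
    foreach ($lines as $line) {
        if (stripos($line, 'Location:') === 0) {
            return trim(substr($line, strlen('Location:')));
        }
    }
    return null;
}

// Hypothetical helper: ask for headers only and do not follow the redirect,
// so the Location header of the first response is what comes back.
function redirect_target(string $url): ?string
{
    $context = stream_context_create([
        'http' => [
            'method'          => 'HEAD',
            'follow_location' => 0,    // stop at the first response
            'timeout'         => 10,   // seconds; an arbitrary choice
            'ignore_errors'   => true, // still return headers on 4xx/5xx
        ],
    ]);

    $fp = @fopen($url, 'r', false, $context);
    if ($fp === false) {
        return null; // connection failed or timed out
    }
    $meta = stream_get_meta_data($fp);
    fclose($fp);

    return location_from_lines($meta['wrapper_data'] ?? []);
}

// Usage (needs network access, so it is left commented out here):
// echo redirect_target('http://redirect.com/redirect.php');
```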
    Last edited by sk89q; Jun 11, 2009 at 18:08.

  16. #16
    SitePoint Zealot adam.jimenez's Avatar
    Join Date
    May 2009
    Location
    Ware, UK
    Posts
    136
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by sk89q View Post
    You should send HEAD instead of GET and properly parse the headers.

    There are also other ways to do it (besides fsockopen), like with PHP's streams or CURL.

    Unfortunately, I have to dash, so I can't provide code at the moment.

    Edit:

    How to do it with streams (easy): http://php.net/manual/en/wrappers.http.php
    (first code example)
    that is easier - thanks for the tip.

  17. #17
    SitePoint Member
    Join Date
    Jun 2009
    Posts
    10
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I also tried a function using the example with streams that sk89q was linking to. Still, for about 1/3 of the links I get the "Connection timed out" error message.

  18. #18
    SitePoint Zealot adam.jimenez's Avatar
    Join Date
    May 2009
    Location
    Ware, UK
    Posts
    136
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by terrwacos View Post
    I also tried a function using the example with streams that sk89q was linking to. Still, for about 1/3 of the links I get the "Connection timed out" error message.
    Sounds like you are hitting a limit somewhere.

    Try using fclose in the loop to free up the connections.

    Or if that doesn't work, try adding a sleep call in the loop, which might also free up the connections.

  19. #19
    SitePoint Member
    Join Date
    Jun 2009
    Posts
    10
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by adam.jimenez View Post
    Sounds like you are hitting a limit somewhere.

    Try using fclose in the loop to free up the connections.

    Or if that doesn't work, try adding a sleep call in the loop, which might also free up the connections.
    Tried it and it didn't make a difference. Thanks for the advice though.

    Maybe contacting my host and asking about limitations will be the next step.

    EDIT: OK, it seems it's not a problem with my host. I tried several other hosting accounts with various companies, and the same thing happens now. So the limit I'm hitting must either be with the host that provides the redirect URLs, or it's with my IP. Maybe waiting a few hours is the only solution...

  20. #20
    SitePoint Zealot adam.jimenez's Avatar
    Join Date
    May 2009
    Location
    Ware, UK
    Posts
    136
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by terrwacos View Post
    Tried it and it didn't make a difference. Thanks for the advice though.

    Maybe contacting my host and asking about limitations will be the next step.
    You could link it up to a database so that every time you have to restart the script it can continue where it left off.

  21. #21
    SitePoint Wizard
    Join Date
    Mar 2008
    Posts
    1,149
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I hope that you're not giving a full URL as a parameter to geturl(). It only accepts hostnames and IP addresses.

    If you need it to be fast, then the ideal way would be to use threads, but that's not possible in PHP.
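One way to honour the hostname-only restriction is to split the full URL first. The sketch below uses parse_url; the name `split_url`, the default port 80 (plain HTTP assumed), and the sample URL are illustrative assumptions.

```php
<?php
// Hypothetical helper: split a full URL into the pieces fsockopen() and the
// request line need. Returns null if no host can be found.
function split_url(string $url): ?array
{
    $parts = parse_url($url);
    if ($parts === false || empty($parts['host'])) {
        return null;
    }

    $path = $parts['path'] ?? '/';
    if (isset($parts['query'])) {
        $path .= '?' . $parts['query']; // keep the query string in the request
    }

    return [
        'host' => $parts['host'],
        'port' => $parts['port'] ?? 80, // plain HTTP assumed
        'path' => $path,
    ];
}

$target = split_url('http://redirect.com/redirect.php?id=42');
// $target['host'] goes to fsockopen() and the Host header,
// $target['path'] goes on the "GET ... HTTP/1.1" line.
print_r($target);
```

Note that parse_url only finds a host when the URL carries a scheme, so an input like redirect.com/redirect.php comes back as null here.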

  22. #22
    SitePoint Member
    Join Date
    Jun 2009
    Posts
    10
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by sk89q View Post
    I hope that you're not giving a full URL as a parameter to geturl(). It only accepts hostnames and IP addresses.

    If you need it to be fast, then the ideal way would be to use threads, but that's not possible in PHP.
    It did work with URLs for me (at least for the first 2000 redirect URLs that I used with this function).

    The error mentioned above is for redirect URLs from one particular site (which redirects to a bunch of different sites).

    I have now tried the script from different IPs, different machines and different locations, so it's not a local error.

  23. #23
    SitePoint Wizard bronze trophy C. Ankerstjerne's Avatar
    Join Date
    Jan 2004
    Location
    The Kingdom of Denmark
    Posts
    2,702
    Mentioned
    7 Post(s)
    Tagged
    0 Thread(s)
    My guess is that your requests are coming in too fast, overloading the external server.

  24. #24
    SitePoint Zealot adam.jimenez's Avatar
    Join Date
    May 2009
    Location
    Ware, UK
    Posts
    136
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by C. Ankerstjerne View Post
    My guess is that your requests are coming in too fast, overloading the external server.
    But apparently adding "sleep" to the loop doesn't help.

  25. #25
    SitePoint Member
    Join Date
    Jun 2009
    Posts
    10
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Eventually I found a way around this problem. What I did was not a perfect solution in any way, but it worked.

    So here's what I did. I said earlier that approximately 1/3 of the requests sent by the script came back with errors.

    I used php.ini to suppress the warning messages, copied the results that did not get errors into the database, and then ran the script again with only the requests that had not been processed properly. I repeated this over and over until there were no more errors. (In other words, the number of errors dropped to about 1/3 of the previous run's every time I ran the script.)

    This way I first get 1/3 errors, then on the second try only 1/3 of the original 1/3, then on the third try 1/3 of the errors from the second try. At the end of the day it did just what I was looking for, and it didn't take that much longer.
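The retry-until-clean workaround can be sketched as a loop, shown here under assumptions: `resolve_with_retries` is a hypothetical name, the resolver is passed in as a callable that returns null on failure, and the database step is only marked by a comment. It is not the poster's actual script.

```php
<?php
// Hypothetical helper: call $resolve on every URL, then retry only the ones
// that failed, until none are left or $maxPasses is reached.
// $resolve must return the target URL as a string, or null on failure.
function resolve_with_retries(array $urls, callable $resolve, int $maxPasses = 10): array
{
    $resolved = [];
    $pending  = $urls;

    for ($pass = 1; $pass <= $maxPasses && $pending !== []; $pass++) {
        $failed = [];
        foreach ($pending as $url) {
            $target = $resolve($url);
            if ($target === null) {
                $failed[] = $url;          // try again on the next pass
            } else {
                $resolved[$url] = $target; // a database insert would go here
            }
        }
        $pending = $failed;
    }

    return ['resolved' => $resolved, 'failed' => $pending];
}

// Stand-in resolver for illustration: fails the first time it sees a URL.
$seen = [];
$fake = function (string $url) use (&$seen): ?string {
    if (!isset($seen[$url])) {
        $seen[$url] = true;
        return null;
    }
    return 'http://example.com/target-for-' . md5($url);
};

$result = resolve_with_retries(['http://a.test/1', 'http://a.test/2'], $fake);
echo count($result['resolved']), " resolved, ", count($result['failed']), " failed\n";
```

Capping the number of passes keeps the loop from running forever if some URLs never resolve; those end up in the `failed` list instead.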

