What Is This PHP Method Called For Tracking Links On Foreign Domains?

uniqueideaman · May 12, 2017, 12:17pm

PHP experts,

I am a complete beginner in PHP and am trying to learn PHP 7 as well as I can. I need your help to point me in the right direction.
What kind of method in PHP 7 is used to track URLs belonging to a foreign domain?
I mean, you all know that search engines track clicks on the links presented in their SERPs. So, if sitepoint.com is listed under the keywords “php forum” on a Google SERP, then the link you click there is bound to be Google’s tracking link. In short, you click the Google tracking link and then get redirected to sitepoint.com. But once you are on sitepoint.com, the links you find on the site do not carry Google’s tracking links, as Google has no control over the linking method inside sitepoint.com.
However, with anonymous web proxies the story is different. Let me show you how.
First go to:
http://anonymouse.org
and type ‘sitepoint.com’ to anonymously browse the forum.
Once the web proxy loads the forum, hover your mouse over the forum’s links and you’ll see that all the links contain the web proxy’s tracking links, even though the forum itself does not house the web proxy’s tracking links.
Go to the forum page and see for yourself and check it out:
http://www.sitepoint.com
Now, view the same page via the web proxy:
http://anonymouse.org/cgi-bin/anon-www.sitepoint.com
and hover your mouse over the links to see whether the web proxy’s tracking links exist or not. I know the proxy is somehow tracking via a frame, an iframe or a cloaking technique, but …

Q1. What PHP 7 technique is it using to create these tracking links so that it looks as if the tracking links are hosted by the site itself?
I need to learn how to create these tracking links.
Q2. Which part of PHP 7 do I need to learn, and can you recommend some tutorial links?
Q3. Are you aware of any video tutorials on YouTube? What keywords should I search for?
affiliate linking with php 7 ?
referral linking with php 7 ?
self replicating linking with php 7 ?
tracking links with php 7 ?
tracking via GET method ?
tracking via POST method ?
Anything else ?
Thank you for your help. Your answers would be helpful to all newbies who read your reply!

spaceshiptrooper · May 12, 2017, 12:26pm

You’re talking about shortened URLs, I think. It also isn’t specific to PHP 7, as the technique existed long before PHP 7 came out.

chorn · May 12, 2017, 12:33pm

A proxy which features ‘URL rewriting’ is already what you are searching for. Just have a look at “php proxy with url rewrite”. There’s a package called miniproxy.php on GitHub, but you have to implement the tracking part yourself.
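To make ‘URL rewriting’ concrete, here is a minimal sketch of the idea, not how miniProxy or anonymouse.org actually implement it: parse the fetched HTML and put the proxy’s own address in front of every absolute link. The prefix value and the function name `rewriteLinks` are made up for illustration, and the sketch assumes PHP 7 with the DOM extension enabled.

```php
<?php
// Hypothetical proxy address; a real proxy would build this from its own URL.
const EXAMPLE_PREFIX = 'http://mydomain.com/proxy/index.php?';

// Put $prefix in front of every absolute http(s) link found in $html.
// Relative links are left untouched here; a real proxy would first
// resolve them against the address of the page being proxied.
function rewriteLinks(string $html, string $prefix): string
{
    $doc = new DOMDocument();
    // Real-world markup is rarely valid, so parser warnings are suppressed.
    @$doc->loadHTML($html);

    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');
        if (preg_match('#^https?://#i', $href)) {
            $anchor->setAttribute('href', $prefix . $href);
        }
    }

    // saveHTML() returns a full document (doctype, <html>, <body>).
    return $doc->saveHTML();
}

echo rewriteLinks(
    '<a href="http://probloggerbook.com/about/">About the Book</a>',
    EXAMPLE_PREFIX
);
```

Every click on a rewritten link lands on the proxy script first, which is the point where a visit could be logged before the target page is fetched.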

uniqueideaman · May 12, 2017, 12:52pm

No. No. No. Come-on man! You’re smarter than that! Don’t tell me you have never tried building a tracker that tracks foreign links inside your frame/iframe.

Let me explain again.
First, head over to this page:
http://probloggerbook.com/
Then, hover your mouse over “About the Book” but don’t click it. You’ll see the url is shown by your browser as:
http://probloggerbook.com/about/

Now, head over to:
http://anonymouse.org/anonwww.html
And, anonymously browse:
http://probloggerbook.com/

Finally, hover your mouse over “About the Book” but don’t click it. You’ll see the url is shown by your browser as:
http://anonymouse.org/cgi-bin/anon-www.cgi/http://probloggerbook.com/about/

You see what is going on here? Even though you’re on probloggerbook.com, which has no affiliation with anonymouse.org, when you hover your mouse over any link on probloggerbook.com you’ll see that it is preceded by the proxy’s address “http://anonymouse.org/cgi-bin/anon-www.cgi/” so that the proxy can track your clicks.
Now, I want to build something similar. So, what is this method called, where you add your own domain in front of the target website’s domain? What method do these proxies use to track your clicks on foreign domains? If I can learn the name of the method then I can google it for a tutorial or search YouTube for it. Understand?
I think chorn knows what I’m talking about.

uniqueideaman · May 12, 2017, 1:28pm

Checking out your suggestions now. I think if I learn how to build a proxy then I will come across the part that deals with tracking foreign domains. So, do you know of any good text or video tutorial on the subject? I’m googling and searching YouTube now, but don’t count on me to find the right stuff, so I would be grateful if you could find some yourself and recommend a few.
Using others’ scripts usually comes with restrictions. Therefore, it is best that I learn how to build one myself and then build it according to my specs. What do you think?
For the time being, I’ve copied the PHP code of miniProxy (GPL) onto my XAMPP and it’s working and doing what I want. However, I noticed it doesn’t work on HTTPS sites, but not to worry, as I didn’t want my proxy working on secure pages anyway.
But still, I will wait for your recommended tutorial that will show me how to build my own PHP proxy. I checked about 10 YouTube tutorials and they’re crap.

uniqueideaman · May 12, 2017, 1:47pm

Guys,

Does anyone have any PHP tutorials (either text or video) to recommend so I can learn how to build my own PHP proxy? You see, I’m actually trying to give you a frame on my website that you can use to navigate to websites. I want to be able to track which websites and webpages you visit, either by typing the URL directly or by clicking links. Don’t ask me why any visitor would use such a frame only to be tracked, because that’s a long story. I just need your suggestions on the technical parts.
I was suggested to look into url_rewriting which I am looking into now and was suggested MiniProxy which seems to be doing the job I wanted.

However, having said all this, I need to learn how to build such a proxy, so I am on the lookout for up-to-date tutorials. YouTube is yielding a lot of outdated tutorials on cURL, and hardly any videos show you how to write your own PHP proxy. I tried to understand the code that makes up miniProxy, but it seems too complicated for a beginner. I have to do a lot of guesswork about which part of the code does what, and that is not a proper way to learn. Therefore, I am hunting for the perfect beginner-level tutorial.

Any suggestions welcome!

Thanks!

spaceshiptrooper · May 12, 2017, 2:54pm

You can still track the number of views, or whatever you are attempting, using shortened URLs. It really makes no sense to go out of your way just to have /www.mydomain.com at the end of a redirect URL, because the same effect can be achieved using shortened URLs for redirection.

uniqueideaman · May 12, 2017, 9:24pm

You mean shortened URLs, as with services like TinyURL or Bit.ly? In that case, I’d have to manually shorten each and every link.
Imagine I’m trying to run my own proxy server, one I have written myself. In order to do that, I must learn the name of the method for this form of URL tracking on foreign domains. Anyway, I am investigating url_rewriting now, as I am positive chorn figured out what I was looking for.

Just curious, since you keep talking about URL shortening: is there a PHP method to do that? Any function? I might as well check it out to see whether it has URL tracking capability or not. Now, where’s your recommended URL-shortening PHP tutorial? I’m not just googling, as there are too many outdated tutorials.
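For reference, PHP has no built-in URL-shortening function. Shorteners are usually small applications that store each long URL under a numeric ID and expose a short code for that ID. The sketch below shows only the code-generation part, assuming a base-62 alphabet; the function names `encodeId` and `decodeCode` are invented for this example, and the storage and redirect steps are left out.

```php
<?php
// Alphabet for the short codes: digits, lowercase, uppercase (62 symbols).
const SHORT_ALPHABET = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

// Turn a non-negative row ID (e.g. a database auto-increment) into a short code.
function encodeId(int $id): string
{
    if ($id < 0) {
        throw new InvalidArgumentException('ID must not be negative');
    }
    $alphabet = SHORT_ALPHABET;
    $base = strlen($alphabet);
    $code = '';
    do {
        $code = $alphabet[$id % $base] . $code;
        $id = intdiv($id, $base);
    } while ($id > 0);
    return $code;
}

// Turn a short code back into the numeric ID it was generated from.
function decodeCode(string $code): int
{
    if ($code === '') {
        throw new InvalidArgumentException('Code must not be empty');
    }
    $alphabet = SHORT_ALPHABET;
    $base = strlen($alphabet);
    $id = 0;
    foreach (str_split($code) as $char) {
        $position = strpos($alphabet, $char);
        if ($position === false) {
            throw new InvalidArgumentException('Unexpected character in code');
        }
        $id = $id * $base + $position;
    }
    return $id;
}

echo encodeId(123456), PHP_EOL;
echo decodeCode(encodeId(123456)), PHP_EOL;
```

A shortener built this way can count clicks on its own short links, but it only ever sees the first click, not what the visitor does on the destination site afterwards.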

spaceshiptrooper · May 12, 2017, 10:30pm

Well, it really depends on what you are talking about. If all you really want is to track what people are viewing, then there really is no point in having a shortened URL or a “proxy” server. Those are generally for other uses. What you should be doing is creating a single file that tracks which pages the user views, and including that single file in all of your pages. Then just store those URLs in a table and output them where only you can see them.

You’re making this a lot harder than it needs to be. You don’t need iframes, shortened URLs, or even “proxy” servers for that matter, if all you’re doing is tracking which pages a user visits.
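A rough sketch of that single-file approach could look like the following. To keep it short it appends to a plain log file instead of a database table, which is a substitution made for this example; the file name `track.php` and the function `logPageView` are likewise hypothetical.

```php
<?php
// track.php (hypothetical name): meant to be included at the top of every page.

// Append one tab-separated line (time, visitor IP, requested URI) to $logFile
// and return the line that was written.
function logPageView(string $logFile, array $server): string
{
    $line = implode("\t", [
        date('c'),                      // ISO 8601 timestamp
        $server['REMOTE_ADDR'] ?? '-',  // visitor address, if known
        $server['REQUEST_URI'] ?? '-',  // path and query string of the page
    ]);
    // LOCK_EX avoids interleaved lines when two requests arrive together.
    file_put_contents($logFile, $line . PHP_EOL, FILE_APPEND | LOCK_EX);
    return $line;
}
```

Each page would then start with something like `require __DIR__ . '/track.php'; logPageView(__DIR__ . '/views.log', $_SERVER);`. This only works on pages you host yourself, since the include has to run on your own server.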

uniqueideaman · May 12, 2017, 11:33pm

If I was just going to track which URLs from my own domain get clicked, then I know how to do it.
On every one of my links, I’ll just add my tracking URL, like so:

But like I said, I want to track what my users do on other sites.
For example, let’s say I run a search engine like Google. A user does a keyword search and my search engine presents results on its SERP (Search Engine Results Page).
Now, when a user clicks a foreign domain link, I would be able to track it, because all the result links would carry my tracker URL before their own URL.
So, if my SERP was listing fart.com, then it would be foolish for my SERP to directly list:
http://fart.com

Instead, it would list like this:
http://mydomain.com/tracker.php?http://fart.com
(Google already does that sort of tracking).
And of course, my tracker.php page would use the GET method to capture the destination URL and log it before redirecting the visitor to it. I know how to do all that after learning from a few YouTube tutorials on how to build your own member login/registration site. They taught how to code an account activation link that the user must click from his email to confirm his address. Don’t worry about that. I know how to build a tracker. I’ll just fiddle with the code that I learnt for the activation link, and if I get stuck then you guys are here to help.
Now, once the user clicks a link on my SERP, I’d be able to track the click before he is redirected to the destination. But once he’s inside the destination site (foreign domain) I won’t be able to track his movements from page to page on the third-party site. I want to track his movements for my own ingenious purposes, and don’t object that I shouldn’t do this by saying the user won’t allow it. The user WILL allow it for their benefit, and most likely YOU will too, but that’s a different story I won’t get into right now and spoil the fun. Wait and see what I pull out of my magician’s jacket in the upcoming few days. In the meanwhile, take a deep breath and hold it, and don’t let go until I’ve brought Jacqueline out of her box (lol!).
I have my own ways to persuade users to allow it and I’ll try it on you one day. Lol!
Now, the problem is, how on earth can I track the user’s movements while he is clicking away on a foreign domain? I noticed back in 1998/2000 that proxies show third-party pages under a frame, or by using cloaking methods or whatever, and so a few months back I went on the hunt to learn how they do it. I’m still looking for the right text and/or video tutorial that’ll teach me how to do all this. No matter how many forums and how many programmers I ask about this, not a single one can answer me and provide the name of the method used by the proxies, save one programmer (tonight) who understood my intentions and suggested I look into url_rewriting, which I will do now. He suggested I use the GPL miniProxy. (In the past I downloaded Glype, but that has restrictions in its license.) I have copied and pasted miniProxy’s one-page code onto my XAMPP and voilà, it is working as expected! Meaning, no matter what URL is shown to the user, I am now able to proxify each and every link he clicks and proxify each and every page he views.
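The log-then-redirect step described here might be sketched as follows, assuming the link format shown above, where the whole query string is the destination URL, and PHP 7.1 or later. The helper names `extractTarget` and `logClick` and the log file are invented for the example.

```php
<?php
// Return the destination when the raw query string is a plain http(s) URL,
// or null for anything else (empty input, javascript:, ftp:, garbage).
function extractTarget(string $queryString): ?string
{
    if (filter_var($queryString, FILTER_VALIDATE_URL) === false) {
        return null;
    }
    $scheme = strtolower((string) parse_url($queryString, PHP_URL_SCHEME));
    return in_array($scheme, ['http', 'https'], true) ? $queryString : null;
}

// Append the clicked destination, with a timestamp, to a log file.
function logClick(string $logFile, string $url): void
{
    file_put_contents($logFile, date('c') . "\t" . $url . PHP_EOL, FILE_APPEND | LOCK_EX);
}

// Inside tracker.php the two helpers could be wired together like this:
//
//   $target = extractTarget($_SERVER['QUERY_STRING'] ?? '');
//   if ($target === null) {
//       http_response_code(400);
//       exit('Bad destination');
//   }
//   logClick(__DIR__ . '/clicks.log', $target);
//   header('Location: ' . $target, true, 302);
//   exit;
```

A script that redirects to any URL it is given can be abused by others to disguise links, so a real tracker would probably also check the destination against the list of results it actually served.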
Let me explain again.
First, head over to this page:
http://probloggerbook.com/
Then, hover your mouse over “About the Book” but don’t click it. You’ll see the url is shown by your browser as:
http://probloggerbook.com/about/

Now, head over to:
http://anonymouse.org/anonwww.html
And, anonymously browse:
http://probloggerbook.com/

Finally, hover your mouse over “About the Book” but don’t click it. You’ll see the url is shown by your browser as:
http://anonymouse.org/cgi-bin/anon-www.cgi/http://probloggerbook.com/about/

You see what is going on here? Even though you’re on probloggerbook.com, which has no affiliation with anonymouse.org, when you hover your mouse over any link on probloggerbook.com you’ll see that it is preceded by the proxy’s address “http://anonymouse.org/cgi-bin/anon-www.cgi/” so that the proxy can proxify the new pages the visitor visits. But I don’t think the proxy tracks the clicks, and I am going to change this with your help.
So, in our example above, anonymouse.org was prepending “http://anonymouse.org/cgi-bin/anon-www.cgi/” to every link. In other words, they put their own domain in front of every link the user clicks. miniProxy does this too. So, if my miniProxy is installed in a “proxy” folder inside my public_html, it will prepend the following path to all links:

http://mydomain.com/proxy

Now, I need to add my tracker URL, so that on every proxified page miniProxy prepends this to the links instead:

http://mydomain.com/proxy/tracker.php?url=
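One detail worth sketching for a prefix of this `?url=` form: the destination has to be URL-encoded when it is appended, otherwise any `&` in the destination’s own query string is read as the start of a new parameter of tracker.php. The helper names `trackedLink` and `destinationFromLink` below are hypothetical; the prefix is the one from the post.

```php
<?php
// Tracker address from the post above; the "url" parameter name comes from it.
const TRACKER_PREFIX = 'http://mydomain.com/proxy/tracker.php?url=';

// Build a tracked link by appending the percent-encoded destination.
function trackedLink(string $destination, string $prefix = TRACKER_PREFIX): string
{
    return $prefix . rawurlencode($destination);
}

// Recover the destination from a tracked link, or null if there is none.
// Inside tracker.php itself, $_GET['url'] would already hold the decoded value.
function destinationFromLink(string $trackedLink): ?string
{
    $query = parse_url($trackedLink, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null;
    }
    parse_str($query, $params);
    return isset($params['url']) && is_string($params['url']) ? $params['url'] : null;
}

$link = trackedLink('http://probloggerbook.com/about/?a=1&b=2');
echo $link, PHP_EOL;
echo destinationFromLink($link), PHP_EOL;
```

With the encoding in place, the destination survives the round trip intact, including its own `?a=1&b=2` part.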

Now, which part of the miniProxy code below do I change or replace, and what do I replace it with, to achieve my purpose?


<?php
/*
miniProxy - A simple PHP web proxy. <https://github.com/joshdick/miniProxy>
Written and maintained by Joshua Dick <http://joshdick.net>.
miniProxy is licensed under the GNU GPL v3 <http://www.gnu.org/licenses/gpl.html>.
*/
/****************************** START CONFIGURATION ******************************/
//To allow proxying any URL, set $whitelistPatterns to an empty array (the default).
//To only allow proxying of specific URLs (whitelist), add corresponding regular expressions
//to the $whitelistPatterns array. Enter the most specific patterns possible, to prevent possible abuse.
//You can optionally use the "getHostnamePattern()" helper function to build a regular expression that
//matches all URLs for a given hostname.
$whitelistPatterns = array(
  //Usage example: To support any URL at example.net, including sub-domains, uncomment the
  //line below (which is equivalent to [ @^https?://([a-z0-9-]+\.)*example\.net@i ]):
  //getHostnamePattern("example.net")
);
//To enable CORS (cross-origin resource sharing) for proxied sites, set $forceCORS to true.
$forceCORS = false;
//URL that will be used as an example in the instructional text on the miniProxy landing page,
//and that will be proxied when pressing the 'Proxy It!' button on the landing page
//if the URL form is left blank.
$exampleURL = 'https://example.net';
/****************************** END CONFIGURATION ******************************/
ob_start("ob_gzhandler");
if (version_compare(PHP_VERSION, "5.4.7", "<")) {
    die("miniProxy requires PHP version 5.4.7 or later.");
}
if (!function_exists("curl_init")) die("miniProxy requires PHP's cURL extension. Please install/enable it on your server and try again.");
//Helper function for use inside $whitelistPatterns.
//Returns a regex that matches all HTTP[S] URLs for a given hostname.
function getHostnamePattern($hostname) {
  $escapedHostname = str_replace(".", "\.", $hostname);
  return "@^https?://([a-z0-9-]+\.)*" . $escapedHostname . "@i";
}
//Helper function that removes/unsets keys from an associative array using case-insensitive matching
function removeKeys(&$assoc, $keys2remove) {
  $keys = array_keys($assoc);
  $map = array();
  foreach ($keys as $key) {
     $map[strtolower($key)] = $key;
  }
  foreach ($keys2remove as $key) {
    $key = strtolower($key);
    if (isset($map[$key])) {
       unset($assoc[$map[$key]]);
    }
  }
}
if (!function_exists("getallheaders")) {
  //Adapted from http://www.php.net/manual/en/function.getallheaders.php#99814
  function getallheaders() {
    $result = array();
    foreach($_SERVER as $key => $value) {
      if (substr($key, 0, 5) == "HTTP_") {
        $key = str_replace(" ", "-", ucwords(strtolower(str_replace("_", " ", substr($key, 5)))));
        $result[$key] = $value;
      }
    }
    return $result;
  }
}
//SERVER_PORT normally arrives as a string, so cast it before the strict comparison.
$usingDefaultPort = (!isset($_SERVER["HTTPS"]) && (int) $_SERVER["SERVER_PORT"] === 80) || (isset($_SERVER["HTTPS"]) && (int) $_SERVER["SERVER_PORT"] === 443);
$prefixPort = $usingDefaultPort ? "" : ":" . $_SERVER["SERVER_PORT"];
//Use HTTP_HOST to support client-configured DNS (instead of SERVER_NAME), but remove the port if one is present
$prefixHost = $_SERVER["HTTP_HOST"];
$prefixHost = strpos($prefixHost, ":") ? implode(":", explode(":", $_SERVER["HTTP_HOST"], -1)) : $prefixHost;
define("PROXY_PREFIX", "http" . (isset($_SERVER["HTTPS"]) ? "s" : "") . "://" . $prefixHost . $prefixPort . $_SERVER["SCRIPT_NAME"] . "?");
//Makes an HTTP request via cURL, using request data that was passed directly to this script.
function makeRequest($url) {
  //Tell cURL to make the request using the browser's user-agent if there is one, or a fallback user-agent otherwise.
  $user_agent = $_SERVER["HTTP_USER_AGENT"];
  if (empty($user_agent)) {
    $user_agent = "Mozilla/5.0 (compatible; miniProxy)";
  }
  $ch = curl_init();
  curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
  //Get ready to proxy the browser's request headers...
  $browserRequestHeaders = getallheaders();
  //...but let cURL set some headers on its own.
  removeKeys($browserRequestHeaders, array(
    "Host",
    "Content-Length",
    "Accept-Encoding" //Throw away the browser's Accept-Encoding header if any and let cURL make the request using gzip if possible.
  ));
  curl_setopt($ch, CURLOPT_ENCODING, "");
  //Transform the associative array from getallheaders() into an
  //indexed array of header strings to be passed to cURL.
  $curlRequestHeaders = array();
  foreach ($browserRequestHeaders as $name => $value) {
    $curlRequestHeaders[] = $name . ": " . $value;
  }
  curl_setopt($ch, CURLOPT_HTTPHEADER, $curlRequestHeaders);
  //Proxy any received GET/POST/PUT data.
  switch ($_SERVER["REQUEST_METHOD"]) {
    case "POST":
      curl_setopt($ch, CURLOPT_POST, true);
      //For some reason, $HTTP_RAW_POST_DATA isn't working as documented at
      //http://php.net/manual/en/reserved.variables.httprawpostdata.php
      //but the php://input method works. This is likely to be flaky
      //across different server environments.
      //More info here: http://stackoverflow.com/questions/8899239/http-raw-post-data-not-being-populated-after-upgrade-to-php-5-3
      //If the miniProxyFormAction field appears in the POST data, remove it so the destination server doesn't receive it.
      $postData = Array();
      parse_str(file_get_contents("php://input"), $postData);
      if (isset($postData["miniProxyFormAction"])) {
        unset($postData["miniProxyFormAction"]);
      }
      curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postData));
    break;
    case "PUT":
      curl_setopt($ch, CURLOPT_PUT, true);
      curl_setopt($ch, CURLOPT_INFILE, fopen("php://input", "r"));
    break;
  }
  //Other cURL options.
  curl_setopt($ch, CURLOPT_HEADER, true);
  curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
  //Set the request URL.
  curl_setopt($ch, CURLOPT_URL, $url);
  //Make the request.
  $response = curl_exec($ch);
  $responseInfo = curl_getinfo($ch);
  $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
  curl_close($ch);
  //Setting CURLOPT_HEADER to true above forces the response headers and body
  //to be output together--separate them.
  $responseHeaders = substr($response, 0, $headerSize);
  $responseBody = substr($response, $headerSize);
  return array("headers" => $responseHeaders, "body" => $responseBody, "responseInfo" => $responseInfo);
}
//Converts relative URLs to absolute ones, given a base URL.
//Modified version of code found at http://nashruddin.com/PHP_Script_for_Converting_Relative_to_Absolute_URL
function rel2abs($rel, $base) {
  if (empty($rel)) $rel = ".";
  if (parse_url($rel, PHP_URL_SCHEME) != "" || strpos($rel, "//") === 0) return $rel; //Return if already an absolute URL
  if ($rel[0] == "#" || $rel[0] == "?") return $base.$rel; //Queries and anchors
  extract(parse_url($base)); //Parse base URL and convert to local variables: $scheme, $host, $path
  $path = isset($path) ? preg_replace("#/[^/]*$#", "", $path) : "/"; //Remove non-directory element from path
  if ($rel[0] == "/") $path = ""; //Destroy path if relative url points to root
  $port = isset($port) && $port != 80 ? ":" . $port : "";
  $auth = "";
  if (isset($user)) {
    $auth = $user;
    if (isset($pass)) {
      $auth .= ":" . $pass;
    }
    $auth .= "@";
  }
  $abs = "$auth$host$port$path/$rel"; //Dirty absolute URL
  for ($n = 1; $n > 0; $abs = preg_replace(array("#(/\.?/)#", "#/(?!\.\.)[^/]+/\.\./#"), "/", $abs, -1, $n)) {} //Replace '//' or '/./' or '/foo/../' with '/'
  return $scheme . "://" . $abs; //Absolute URL is ready.
}
//Proxify contents of url() references in blocks of CSS text.
function proxifyCSS($css, $baseURL) {
  // Add a "url()" wrapper to any CSS @import rules that only specify a URL without the wrapper,
  // so that they're proxified when searching for "url()" wrappers below.
  $sourceLines = explode("\n", $css);
  $normalizedLines = [];
  foreach ($sourceLines as $line) {
    if (preg_match("/@import\s+url/i", $line)) {
      $normalizedLines[] = $line;
    } else {
      $normalizedLines[] = preg_replace_callback(
        "/(@import\s+)([^;\s]+)([\s;])/i",
        function($matches) use ($baseURL) {
          return $matches[1] . "url(" . $matches[2] . ")" . $matches[3];
        },
        $line);
    }
  }
  $normalizedCSS = implode("\n", $normalizedLines);
  return preg_replace_callback(
    "/url\((.*?)\)/i",
    function($matches) use ($baseURL) {
        $url = $matches[1];
        //Remove any surrounding single or double quotes from the URL so it can be passed to rel2abs - the quotes are optional in CSS
        //Assume that if there is a leading quote then there should be a trailing quote, so just use trim() to remove them
        if (strpos($url, "'") === 0) {
          $url = trim($url, "'");
        }
        if (strpos($url, "\"") === 0) {
          $url = trim($url, "\"");
        }
        if (stripos($url, "data:") === 0) return "url(" . $url . ")"; //The URL isn't an HTTP URL but is actual binary data. Don't proxify it.
        return "url(" . PROXY_PREFIX . rel2abs($url, $baseURL) . ")";
    },
    $normalizedCSS);
}
//Proxify "srcset" attributes (normally associated with <img> tags.)
function proxifySrcset($srcset, $baseURL) {
  $sources = array_map("trim", explode(",", $srcset)); //Split all contents by comma and trim each value
  $proxifiedSources = array_map(function($source) use ($baseURL) {
    $components = array_map("trim", str_split($source, strrpos($source, " "))); //Split by last space and trim
    $components[0] = PROXY_PREFIX . rel2abs(ltrim($components[0], "/"), $baseURL); //First component of the split source string should be an image URL; proxify it
    return implode(" ", $components); //Recombine the components into a single source
  }, $sources);
  $proxifiedSrcset = implode(", ", $proxifiedSources); //Recombine the sources into a single "srcset"
  return $proxifiedSrcset;
}
//Extract and sanitize the requested URL, handling cases where forms have been rewritten to point to the proxy.
if (isset($_POST["miniProxyFormAction"])) {
  $url = $_POST["miniProxyFormAction"];
  unset($_POST["miniProxyFormAction"]);
} else {
  $queryParams = Array();
  parse_str($_SERVER["QUERY_STRING"], $queryParams);
  //If the miniProxyFormAction field appears in the query string, make $url start with its value, and rebuild the query string without it.
  if (isset($queryParams["miniProxyFormAction"])) {
    $formAction = $queryParams["miniProxyFormAction"];
    unset($queryParams["miniProxyFormAction"]);
    $url = $formAction . "?" . http_build_query($queryParams);
  } else {
    $url = substr($_SERVER["REQUEST_URI"], strlen($_SERVER["SCRIPT_NAME"]) + 1);
  }
}
if (empty($url)) {
    die("<html><head><title>miniProxy</title></head><body><h1>Welcome to miniProxy!</h1>miniProxy can be directly invoked like this: <a href=\"" . PROXY_PREFIX . $exampleURL . "\">" . PROXY_PREFIX . $exampleURL . "</a><br /><br />Or, you can simply enter a URL below:<br /><br /><form onsubmit=\"if (document.getElementById('site').value) { window.location.href='" . PROXY_PREFIX . "' + document.getElementById('site').value; return false; } else { window.location.href='" . PROXY_PREFIX . $exampleURL . "'; return false; } \"><input id=\"site\" type=\"text\" size=\"50\" /><input type=\"submit\" value=\"Proxy It!\" /></form></body></html>");
} else if (strpos($url, ":/") !== strpos($url, "://")) {
    //Work around the fact that some web servers (e.g. IIS 8.5) change double slashes appearing in the URL to a single slash.
    //See https://github.com/joshdick/miniProxy/pull/14
    $pos = strpos($url, ":/");
    $url = substr_replace($url, "://", $pos, strlen(":/"));
}
$scheme = parse_url($url, PHP_URL_SCHEME);
if (empty($scheme)) {
  //Assume that any supplied URLs starting with // are HTTP URLs.
  if (strpos($url, "//") === 0) {
    $url = "http:" . $url;
  }
} else if (!preg_match("/^https?$/i", $scheme)) {
    die('Error: Detected a "' . $scheme . '" URL. miniProxy exclusively supports http[s] URLs.');
}
//Validate the requested URL against the whitelist.
$urlIsValid = count($whitelistPatterns) === 0;
foreach ($whitelistPatterns as $pattern) {
  if (preg_match($pattern, $url)) {
    $urlIsValid = true;
    break;
  }
}
if (!$urlIsValid) {
  die("Error: The requested URL was disallowed by the server administrator.");
}
$response = makeRequest($url);
$rawResponseHeaders = $response["headers"];
$responseBody = $response["body"];
$responseInfo = $response["responseInfo"];
//If CURLOPT_FOLLOWLOCATION landed the proxy at a different URL than
//what was requested, explicitly redirect the proxy there.
$responseURL = $responseInfo["url"];
if ($responseURL !== $url) {
  header("Location: " . PROXY_PREFIX . $responseURL, true);
  exit(0);
}
//A regex that indicates which server response headers should be stripped out of the proxified response.
$header_blacklist_pattern = "/^Content-Length|^Transfer-Encoding|^Content-Encoding.*gzip/i";
//cURL can make multiple requests internally (for example, if CURLOPT_FOLLOWLOCATION is enabled), and reports
//headers for every request it makes. Only proxy the last set of received response headers,
//corresponding to the final request made by cURL for any given call to makeRequest().
$responseHeaderBlocks = array_filter(explode("\r\n\r\n", $rawResponseHeaders));
$lastHeaderBlock = end($responseHeaderBlocks);
$headerLines = explode("\r\n", $lastHeaderBlock);
foreach ($headerLines as $header) {
  $header = trim($header);
  if (!preg_match($header_blacklist_pattern, $header)) {
    header($header, false);
  }
}
//Prevent robots from indexing proxified pages
header("X-Robots-Tag: noindex, nofollow", true);
if ($forceCORS) {
  //This logic is based on code found at: http://stackoverflow.com/a/9866124/278810
  //CORS headers sent below may conflict with CORS headers from the original response,
  //so these headers are sent after the original response headers to ensure their values
  //are the ones that actually end up getting sent to the browser.
  //Explicit [ $replace = true ] is used for these headers even though this is PHP's default behavior.
  //Allow access from any origin.
  header("Access-Control-Allow-Origin: *", true);
  header("Access-Control-Allow-Credentials: true", true);
  //Handle CORS headers received during OPTIONS requests.
  if ($_SERVER["REQUEST_METHOD"] == "OPTIONS") {
    if (isset($_SERVER["HTTP_ACCESS_CONTROL_REQUEST_METHOD"])) {
      header("Access-Control-Allow-Methods: GET, POST, OPTIONS", true);
    }
    if (isset($_SERVER["HTTP_ACCESS_CONTROL_REQUEST_HEADERS"])) {
      header("Access-Control-Allow-Headers: {$_SERVER['HTTP_ACCESS_CONTROL_REQUEST_HEADERS']}", true);
    }
    //No further action is needed for OPTIONS requests.
    exit(0);
  }
}
$contentType = "";
if (isset($responseInfo["content_type"])) $contentType = $responseInfo["content_type"];
//This is presumably a web page, so attempt to proxify the DOM.
if (stripos($contentType, "text/html") !== false) {
  //Attempt to normalize character encoding.
  $detectedEncoding = mb_detect_encoding($responseBody, "UTF-8, ISO-8859-1");
  if ($detectedEncoding) {
    $responseBody = mb_convert_encoding($responseBody, "HTML-ENTITIES", $detectedEncoding);
  }
  //Parse the DOM.
  $doc = new DomDocument();
  @$doc->loadHTML($responseBody);
  $xpath = new DOMXPath($doc);
  //Rewrite forms so that their actions point back to the proxy.
  foreach($xpath->query("//form") as $form) {
    $method = $form->getAttribute("method");
    $action = $form->getAttribute("action");
    //If the form doesn't have an action, the action is the page itself.
    //Otherwise, change an existing action to an absolute version.
    $action = empty($action) ? $url : rel2abs($action, $url);
    //Rewrite the form action to point back at the proxy.
    $form->setAttribute("action", rtrim(PROXY_PREFIX, "?"));
    //Add a hidden form field that the proxy can later use to retrieve the original form action.
    $actionInput = $doc->createDocumentFragment();
    $actionInput->appendXML('<input type="hidden" name="miniProxyFormAction" value="' . htmlspecialchars($action) . '" />');
    $form->appendChild($actionInput);
  }
  //Proxify <meta> tags with an 'http-equiv="refresh"' attribute.
  foreach ($xpath->query("//meta[@http-equiv]") as $element) {
    if (strcasecmp($element->getAttribute("http-equiv"), "refresh") === 0) {
      $content = $element->getAttribute("content");
      if (!empty($content)) {
        $splitContent = preg_split("/=/", $content);
        if (isset($splitContent[1])) {
          $element->setAttribute("content", $splitContent[0] . "=" . PROXY_PREFIX . rel2abs($splitContent[1], $url));
        }
      }
    }
  }
  //Proxify <style> tags.
  foreach($xpath->query("//style") as $style) {
    $style->nodeValue = proxifyCSS($style->nodeValue, $url);
  }
  //Proxify tags with a "style" attribute.
  foreach ($xpath->query("//*[@style]") as $element) {
    $element->setAttribute("style", proxifyCSS($element->getAttribute("style"), $url));
  }
  //Proxify "srcset" attributes in <img> tags.
  foreach ($xpath->query("//img[@srcset]") as $element) {
    $element->setAttribute("srcset", proxifySrcset($element->getAttribute("srcset"), $url));
  }
  //Proxify any of these attributes appearing in any tag.
  $proxifyAttributes = array("href", "src");
  foreach($proxifyAttributes as $attrName) {
    foreach($xpath->query("//*[@" . $attrName . "]") as $element) { //For every element with the given attribute...
      $attrContent = $element->getAttribute($attrName);
      if ($attrName == "href" && preg_match("/^(about|javascript|magnet|mailto):/i", $attrContent)) continue;
      $attrContent = rel2abs($attrContent, $url);
      $attrContent = PROXY_PREFIX . $attrContent;
      $element->setAttribute($attrName, $attrContent);
    }
  }
  //Attempt to force AJAX requests to be made through the proxy by
  //wrapping window.XMLHttpRequest.prototype.open in order to make
  //all request URLs absolute and point back to the proxy.
  //The rel2abs() JavaScript function serves the same purpose as the server-side one in this file,
  //but is used in the browser to ensure all AJAX request URLs are absolute and not relative.
  //Uses code from these sources:
  //http://stackoverflow.com/questions/7775767/javascript-overriding-xmlhttprequest-open
  //https://gist.github.com/1088850
  //TODO: This is obviously only useful for browsers that use XMLHttpRequest but
  //it's better than nothing.
  $head = $xpath->query("//head")->item(0);
  $body = $xpath->query("//body")->item(0);
  $prependElem = $head != NULL ? $head : $body;
  //Only bother trying to apply this hack if the DOM has a <head> or <body> element;
  //insert some JavaScript at the top of whichever is available first.
  //Protects against cases where the server sends a Content-Type of "text/html" when
  //what's coming back is most likely not actually HTML.
  //TODO: Do this check before attempting to do any sort of DOM parsing?
  if ($prependElem != NULL) {
    $scriptElem = $doc->createElement("script",
      '(function() {
        if (window.XMLHttpRequest) {
          function parseURI(url) {
            var m = String(url).replace(/^\s+|\s+$/g, "").match(/^([^:\/?#]+:)?(\/\/(?:[^:@]*(?::[^:@]*)?@)?(([^:\/?#]*)(?::(\d*))?))?([^?#]*)(\?[^#]*)?(#[\s\S]*)?/);
            // authority = "//" + user + ":" + pass "@" + hostname + ":" port
            return (m ? {
              href : m[0] || "",
              protocol : m[1] || "",
              authority: m[2] || "",
              host : m[3] || "",
              hostname : m[4] || "",
              port : m[5] || "",
              pathname : m[6] || "",
              search : m[7] || "",
              hash : m[8] || ""
            } : null);
          }
          function rel2abs(base, href) { // RFC 3986
            function removeDotSegments(input) {
              var output = [];
              input.replace(/^(\.\.?(\/|$))+/, "")
                .replace(/\/(\.(\/|$))+/g, "/")
                .replace(/\/\.\.$/, "/../")
                .replace(/\/?[^\/]*/g, function (p) {
                  if (p === "/..") {
                    output.pop();
                  } else {
                    output.push(p);
                  }
                });
              return output.join("").replace(/^\//, input.charAt(0) === "/" ? "/" : "");
            }
            href = parseURI(href || "");
            base = parseURI(base || "");
            return !href || !base ? null : (href.protocol || base.protocol) +
            (href.protocol || href.authority ? href.authority : base.authority) +
            removeDotSegments(href.protocol || href.authority || href.pathname.charAt(0) === "/" ? href.pathname : (href.pathname ? ((base.authority && !base.pathname ? "/" : "") + base.pathname.slice(0, base.pathname.lastIndexOf("/") + 1) + href.pathname) : base.pathname)) +
            (href.protocol || href.authority || href.pathname ? href.search : (href.search || base.search)) +
            href.hash;
          }
          var proxied = window.XMLHttpRequest.prototype.open;
          window.XMLHttpRequest.prototype.open = function() {
              if (arguments[1] !== null && arguments[1] !== undefined) {
                var url = arguments[1];
                url = rel2abs("' . $url . '", url);
                url = "' . PROXY_PREFIX . '" + url;
                arguments[1] = url;
              }
              return proxied.apply(this, [].slice.call(arguments));
          };
        }
      })();'
    );
    $scriptElem->setAttribute("type", "text/javascript");
    $prependElem->insertBefore($scriptElem, $prependElem->firstChild);
  }
  echo "<!-- Proxified page constructed by miniProxy -->\n" . $doc->saveHTML();
} else if (stripos($contentType, "text/css") !== false) { //This is CSS, so proxify url() references.
  echo proxifyCSS($responseBody, $url);
} else { //This isn't a web page or CSS, so serve unmodified through the proxy with the correct headers (images, JavaScript, etc.)
  header("Content-Length: " . strlen($responseBody), true);
  echo $responseBody;
}

Here is the full script:

https://github.com/joshdick/miniProxy

If you know of any better script then do let me know by pointing out which part of the code to replace with what. Be 100% precise in your descriptions or I will get lost or drown in the code.

PS - I’m dozing off. I might roll out of my chair, so I will look into your replies after I wake up. Take your time wading through the code to pinpoint where I should add my tracker url. And don’t forget this thread, and don’t delay in answering this post.

PPS - Don’t worry, I won’t name the directory “proxy” as I know websites filter that keyword.

Thanks!

spaceshiptrooper · May 13, 2017, 12:37am

Yes, I know what you are talking about. But if all these contents and links are on your website and you control them, there’s really no reason to use any of these methods, as you can track your own views directly. You cannot dictate what other people put on their websites, though. So if you want to see which links users have clicked on, the best approach again would be to use shortened URLs. With a shortened URL, you can keep track of how many times that link is clicked and so on. Take SitePoint’s forum for example. Do you see all those click counts on links people post? Those most likely get tracked in a similar way. Well, not with shortened URLs, but Discourse most likely uses a similar method where every time someone clicks on a link, that number increases.
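The click counting described above can be sketched as a single redirect file. This is only a minimal illustration, not how Discourse or any URL shortener actually does it: the file name go.php, the "url" query parameter, and the clicks.json storage file are all assumptions made up for this example.

```php
<?php
// Hypothetical go.php, linked as: go.php?url=https://example.com/page
// The file name, the "url" parameter and clicks.json are assumptions for illustration.

// Return a copy of the counts with one more click recorded for $url.
function incrementClick(array $counts, string $url): array
{
    $counts[$url] = ($counts[$url] ?? 0) + 1;
    return $counts;
}

// Accept only http/https targets so the redirect cannot be pointed at other schemes.
function isHttpUrl(string $url): bool
{
    $scheme = parse_url($url, PHP_URL_SCHEME);
    return is_string($scheme) && in_array(strtolower($scheme), ["http", "https"], true);
}

// Handle a real request only when running under a web server with a valid ?url=...
if (PHP_SAPI !== "cli" && isset($_GET["url"]) && is_string($_GET["url"]) && isHttpUrl($_GET["url"])) {
    $file = __DIR__ . "/clicks.json";
    $counts = is_file($file) ? (json_decode(file_get_contents($file), true) ?: []) : [];
    $counts = incrementClick($counts, $_GET["url"]);
    file_put_contents($file, json_encode($counts), LOCK_EX);
    header("Location: " . $_GET["url"], true, 302); // send the visitor on to the real page
    exit;
}
```

Every link on your own pages would then point at go.php instead of at the destination directly. A redirect that accepts any URL can be abused by others, so a real version would probably check the target against a list of links you actually published.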

I’d say if you control the contents or links, all you really need to do is have a single file track those clicks. You really can’t modify what people have on their website. You can, but it only affects you, not your user. So say you have Google’s website in your iframe. Whatever the user clicks within that iframe, you have no control over. External links can never be modified on a 3rd party. That anonymous website you linked, they don’t actually “keep” track of who clicks on what and who does what. If they did, they can only keep track of whatever the user types into the text field. All that http://anonymouse.org/cgi-bin/anon-www.cgi/http://probloggerbook.com/about/ stuff wouldn’t really matter because they most likely aren’t keeping track of what the user clicks since they have no control over what the user sees or clicks.

For example, if a user uses anonymouse.org and they type in google.com. Then they type whatever into the Google search bar and they find results and click on those links, anonymouse.org would NOT be able to keep track of whatever links the user clicks on. That would be Google who will then keep track of what the user clicks. So basically, you HAVE NO control over what the user clicks and you can’t really track what the user does outside of your iframe. Whatever page you load up in your iframe, you can keep track if the iframe is linked to your own website.

So for example. I have an iframe linking to SitePoint’s main website and we’ll just use localhost.com as a primary domain example.

<iframe src="https://sitepoint.com/"></iframe>

Whatever is within that link, I have no control over. So the user can click on whatever page or link they want. I can’t track that at all. But say you have

<iframe src="https://localhost.com/iframe/https://sitepoint.com/"></iframe>

Then you can only keep track of that first page. When the user clicks on links within Sitepoint, you can’t track that since you don’t own Sitepoint.

Also, what you are trying to do, I don’t think you’ll get the user’s original IP Address at all. The code for the current user’s IP Address is

$_SERVER['REMOTE_ADDR'];

So if you “proxify” the links with this method of yours, it won’t work because a “proxy” usually hides the user’s IP Address. So what you are essentially doing is replacing the user’s real IP Address with whatever address the “proxy” uses (e.g. via cURL or file_get_contents). You will then get a bunch of data with false IP Addresses, because when you “proxy” something, you’re essentially hiding the user from the original website.

I bet there probably is a way, but what you are essentially doing is just making it harder for yourself since you have no control over what the user clicks on.

uniqueideaman · May 14, 2017, 1:22am

Bro,

Do me a favour.
Follow the following steps and give me your results:

Fire-up Xampp.
Create a folder “PROXY” in the “localhost”. Then, create file: test.php
Add the following code into test.php:


<?php
/*
miniProxy - A simple PHP web proxy. <https://github.com/joshdick/miniProxy>
Written and maintained by Joshua Dick <http://joshdick.net>.
miniProxy is licensed under the GNU GPL v3 <http://www.gnu.org/licenses/gpl.html>.
*/
/****************************** START CONFIGURATION ******************************/
//To allow proxying any URL, set $whitelistPatterns to an empty array (the default).
//To only allow proxying of specific URLs (whitelist), add corresponding regular expressions
//to the $whitelistPatterns array. Enter the most specific patterns possible, to prevent possible abuse.
//You can optionally use the "getHostnamePattern()" helper function to build a regular expression that
//matches all URLs for a given hostname.
$whitelistPatterns = array(
  //Usage example: To support any URL at example.net, including sub-domains, uncomment the
  //line below (which is equivalent to [ @^https?://([a-z0-9-]+\.)*example\.net@i ]):
  //getHostnamePattern("example.net")
);
//To enable CORS (cross-origin resource sharing) for proxied sites, set $forceCORS to true.
$forceCORS = false;
//URL that will be used as an example in the instructional text on the miniProxy landing page,
//and that will be proxied when pressing the 'Proxy It!' button on the landing page
//if the URL form is left blank.
$exampleURL = 'https://example.net';
/****************************** END CONFIGURATION ******************************/
ob_start("ob_gzhandler");
if (version_compare(PHP_VERSION, "5.4.7", "<")) {
    die("miniProxy requires PHP version 5.4.7 or later.");
}
if (!function_exists("curl_init")) die("miniProxy requires PHP's cURL extension. Please install/enable it on your server and try again.");
//Helper function for use inside $whitelistPatterns.
//Returns a regex that matches all HTTP[S] URLs for a given hostname.
function getHostnamePattern($hostname) {
  $escapedHostname = str_replace(".", "\.", $hostname);
  return "@^https?://([a-z0-9-]+\.)*" . $escapedHostname . "@i";
}
//Helper function used to remove/unset keys from an associative array using case insensitive matching
function removeKeys(&$assoc, $keys2remove) {
  $keys = array_keys($assoc);
  $map = array();
  foreach ($keys as $key) {
     $map[strtolower($key)] = $key;
  }
  foreach ($keys2remove as $key) {
    $key = strtolower($key);
    if (isset($map[$key])) {
       unset($assoc[$map[$key]]);
    }
  }
}
if (!function_exists("getallheaders")) {
  //Adapted from http://www.php.net/manual/en/function.getallheaders.php#99814
  function getallheaders() {
    $result = array();
    foreach($_SERVER as $key => $value) {
      if (substr($key, 0, 5) == "HTTP_") {
        $key = str_replace(" ", "-", ucwords(strtolower(str_replace("_", " ", substr($key, 5)))));
        $result[$key] = $value;
      }
    }
    return $result;
  }
}
$usingDefaultPort =  (!isset($_SERVER["HTTPS"]) && $_SERVER["SERVER_PORT"] === 80) || (isset($_SERVER["HTTPS"]) && $_SERVER["SERVER_PORT"] === 443);
$prefixPort = $usingDefaultPort ? "" : ":" . $_SERVER["SERVER_PORT"];
//Use HTTP_HOST to support client-configured DNS (instead of SERVER_NAME), but remove the port if one is present
$prefixHost = $_SERVER["HTTP_HOST"];
$prefixHost = strpos($prefixHost, ":") ? implode(":", explode(":", $_SERVER["HTTP_HOST"], -1)) : $prefixHost;
define("PROXY_PREFIX", "http" . (isset($_SERVER["HTTPS"]) ? "s" : "") . "://" . $prefixHost . $prefixPort . $_SERVER["SCRIPT_NAME"] . "?");
//Makes an HTTP request via cURL, using request data that was passed directly to this script.
function makeRequest($url) {
  //Tell cURL to make the request using the browser's user-agent if there is one, or a fallback user-agent otherwise.
  $user_agent = $_SERVER["HTTP_USER_AGENT"];
  if (empty($user_agent)) {
    $user_agent = "Mozilla/5.0 (compatible; miniProxy)";
  }
  $ch = curl_init();
  curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
  //Get ready to proxy the browser's request headers...
  $browserRequestHeaders = getallheaders();
  //...but let cURL set some headers on its own.
  removeKeys($browserRequestHeaders, array(
    "Host",
    "Content-Length",
    "Accept-Encoding" //Throw away the browser's Accept-Encoding header if any and let cURL make the request using gzip if possible.
  ));
  curl_setopt($ch, CURLOPT_ENCODING, "");
  //Transform the associative array from getallheaders() into an
  //indexed array of header strings to be passed to cURL.
  $curlRequestHeaders = array();
  foreach ($browserRequestHeaders as $name => $value) {
    $curlRequestHeaders[] = $name . ": " . $value;
  }
  curl_setopt($ch, CURLOPT_HTTPHEADER, $curlRequestHeaders);
  //Proxy any received GET/POST/PUT data.
  switch ($_SERVER["REQUEST_METHOD"]) {
    case "POST":
      curl_setopt($ch, CURLOPT_POST, true);
      //For some reason, $HTTP_RAW_POST_DATA isn't working as documented at
      //http://php.net/manual/en/reserved.variables.httprawpostdata.php
      //but the php://input method works. This is likely to be flaky
      //across different server environments.
      //More info here: http://stackoverflow.com/questions/8899239/http-raw-post-data-not-being-populated-after-upgrade-to-php-5-3
      //If the miniProxyFormAction field appears in the POST data, remove it so the destination server doesn't receive it.
      $postData = Array();
      parse_str(file_get_contents("php://input"), $postData);
      if (isset($postData["miniProxyFormAction"])) {
        unset($postData["miniProxyFormAction"]);
      }
      curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postData));
    break;
    case "PUT":
      curl_setopt($ch, CURLOPT_PUT, true);
      curl_setopt($ch, CURLOPT_INFILE, fopen("php://input", "r"));
    break;
  }
  //Other cURL options.
  curl_setopt($ch, CURLOPT_HEADER, true);
  curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
  //Set the request URL.
  curl_setopt($ch, CURLOPT_URL, $url);
  //Make the request.
  $response = curl_exec($ch);
  $responseInfo = curl_getinfo($ch);
  $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
  curl_close($ch);
  //Setting CURLOPT_HEADER to true above forces the response headers and body
  //to be output together--separate them.
  $responseHeaders = substr($response, 0, $headerSize);
  $responseBody = substr($response, $headerSize);
  return array("headers" => $responseHeaders, "body" => $responseBody, "responseInfo" => $responseInfo);
}
//Converts relative URLs to absolute ones, given a base URL.
//Modified version of code found at http://nashruddin.com/PHP_Script_for_Converting_Relative_to_Absolute_URL
function rel2abs($rel, $base) {
  if (empty($rel)) $rel = ".";
  if (parse_url($rel, PHP_URL_SCHEME) != "" || strpos($rel, "//") === 0) return $rel; //Return if already an absolute URL
  if ($rel[0] == "#" || $rel[0] == "?") return $base.$rel; //Queries and anchors
  extract(parse_url($base)); //Parse base URL and convert to local variables: $scheme, $host, $path
  $path = isset($path) ? preg_replace("#/[^/]*$#", "", $path) : "/"; //Remove non-directory element from path
  if ($rel[0] == "/") $path = ""; //Destroy path if relative url points to root
  $port = isset($port) && $port != 80 ? ":" . $port : "";
  $auth = "";
  if (isset($user)) {
    $auth = $user;
    if (isset($pass)) {
      $auth .= ":" . $pass;
    }
    $auth .= "@";
  }
  $abs = "$auth$host$port$path/$rel"; //Dirty absolute URL
  for ($n = 1; $n > 0; $abs = preg_replace(array("#(/\.?/)#", "#/(?!\.\.)[^/]+/\.\./#"), "/", $abs, -1, $n)) {} //Replace '//' or '/./' or '/foo/../' with '/'
  return $scheme . "://" . $abs; //Absolute URL is ready.
}
//Proxify contents of url() references in blocks of CSS text.
function proxifyCSS($css, $baseURL) {
  // Add a "url()" wrapper to any CSS @import rules that only specify a URL without the wrapper,
  // so that they're proxified when searching for "url()" wrappers below.
  $sourceLines = explode("\n", $css);
  $normalizedLines = [];
  foreach ($sourceLines as $line) {
    if (preg_match("/@import\s+url/i", $line)) {
      $normalizedLines[] = $line;
    } else {
      $normalizedLines[] = preg_replace_callback(
        "/(@import\s+)([^;\s]+)([\s;])/i",
        function($matches) use ($baseURL) {
          return $matches[1] . "url(" . $matches[2] . ")" . $matches[3];
        },
        $line);
    }
  }
  $normalizedCSS = implode("\n", $normalizedLines);
  return preg_replace_callback(
    "/url\((.*?)\)/i",
    function($matches) use ($baseURL) {
        $url = $matches[1];
        //Remove any surrounding single or double quotes from the URL so it can be passed to rel2abs - the quotes are optional in CSS
        //Assume that if there is a leading quote then there should be a trailing quote, so just use trim() to remove them
        if (strpos($url, "'") === 0) {
          $url = trim($url, "'");
        }
        if (strpos($url, "\"") === 0) {
          $url = trim($url, "\"");
        }
        if (stripos($url, "data:") === 0) return "url(" . $url . ")"; //The URL isn't an HTTP URL but is actual binary data. Don't proxify it.
        return "url(" . PROXY_PREFIX . rel2abs($url, $baseURL) . ")";
    },
    $normalizedCSS);
}
//Proxify "srcset" attributes (normally associated with <img> tags.)
function proxifySrcset($srcset, $baseURL) {
  $sources = array_map("trim", explode(",", $srcset)); //Split all contents by comma and trim each value
  $proxifiedSources = array_map(function($source) use ($baseURL) {
    $components = array_map("trim", str_split($source, strrpos($source, " "))); //Split by last space and trim
    $components[0] = PROXY_PREFIX . rel2abs(ltrim($components[0], "/"), $baseURL); //First component of the split source string should be an image URL; proxify it
    return implode(" ", $components); //Recombine the components into a single source (separator first; the reversed argument order is deprecated as of PHP 7.4)
  }, $sources);
  $proxifiedSrcset = implode(", ", $proxifiedSources); //Recombine the sources into a single "srcset"
  return $proxifiedSrcset;
}
//Extract and sanitize the requested URL, handling cases where forms have been rewritten to point to the proxy.
if (isset($_POST["miniProxyFormAction"])) {
  $url = $_POST["miniProxyFormAction"];
  unset($_POST["miniProxyFormAction"]);
} else {
  $queryParams = Array();
  parse_str($_SERVER["QUERY_STRING"], $queryParams);
  //If the miniProxyFormAction field appears in the query string, make $url start with its value, and rebuild the query string without it.
  if (isset($queryParams["miniProxyFormAction"])) {
    $formAction = $queryParams["miniProxyFormAction"];
    unset($queryParams["miniProxyFormAction"]);
    $url = $formAction . "?" . http_build_query($queryParams);
  } else {
    $url = substr($_SERVER["REQUEST_URI"], strlen($_SERVER["SCRIPT_NAME"]) + 1);
  }
}
if (empty($url)) {
    die("<html><head><title>miniProxy</title></head><body><h1>Welcome to miniProxy!</h1>miniProxy can be directly invoked like this: <a href=\"" . PROXY_PREFIX . $exampleURL . "\">" . PROXY_PREFIX . $exampleURL . "</a><br /><br />Or, you can simply enter a URL below:<br /><br /><form onsubmit=\"if (document.getElementById('site').value) { window.location.href='" . PROXY_PREFIX . "' + document.getElementById('site').value; return false; } else { window.location.href='" . PROXY_PREFIX . $exampleURL . "'; return false; } \"><input id=\"site\" type=\"text\" size=\"50\" /><input type=\"submit\" value=\"Proxy It!\" /></form></body></html>");
} else if (strpos($url, ":/") !== strpos($url, "://")) {
    //Work around the fact that some web servers (e.g. IIS 8.5) change double slashes appearing in the URL to a single slash.
    //See https://github.com/joshdick/miniProxy/pull/14
    $pos = strpos($url, ":/");
    $url = substr_replace($url, "://", $pos, strlen(":/"));
}
$scheme = parse_url($url, PHP_URL_SCHEME);
if (empty($scheme)) {
  //Assume that any supplied URLs starting with // are HTTP URLs.
  if (strpos($url, "//") === 0) {
    $url = "http:" . $url;
  }
} else if (!preg_match("/^https?$/i", $scheme)) {
    die('Error: Detected a "' . $scheme . '" URL. miniProxy exclusively supports http[s] URLs.');
}
//Validate the requested URL against the whitelist.
$urlIsValid = count($whitelistPatterns) === 0;
foreach ($whitelistPatterns as $pattern) {
  if (preg_match($pattern, $url)) {
    $urlIsValid = true;
    break;
  }
}
if (!$urlIsValid) {
  die("Error: The requested URL was disallowed by the server administrator.");
}
$response = makeRequest($url);
$rawResponseHeaders = $response["headers"];
$responseBody = $response["body"];
$responseInfo = $response["responseInfo"];
//If CURLOPT_FOLLOWLOCATION landed the proxy at a different URL than
//what was requested, explicitly redirect the proxy there.
$responseURL = $responseInfo["url"];
if ($responseURL !== $url) {
  header("Location: " . PROXY_PREFIX . $responseURL, true);
  exit(0);
}
//A regex that indicates which server response headers should be stripped out of the proxified response.
$header_blacklist_pattern = "/^Content-Length|^Transfer-Encoding|^Content-Encoding.*gzip/i";
//cURL can make multiple requests internally (for example, if CURLOPT_FOLLOWLOCATION is enabled), and reports
//headers for every request it makes. Only proxy the last set of received response headers,
//corresponding to the final request made by cURL for any given call to makeRequest().
$responseHeaderBlocks = array_filter(explode("\r\n\r\n", $rawResponseHeaders));
$lastHeaderBlock = end($responseHeaderBlocks);
$headerLines = explode("\r\n", $lastHeaderBlock);
foreach ($headerLines as $header) {
  $header = trim($header);
  if (!preg_match($header_blacklist_pattern, $header)) {
    header($header, false);
  }
}
//Prevent robots from indexing proxified pages
header("X-Robots-Tag: noindex, nofollow", true);
if ($forceCORS) {
  //This logic is based on code found at: http://stackoverflow.com/a/9866124/278810
  //CORS headers sent below may conflict with CORS headers from the original response,
  //so these headers are sent after the original response headers to ensure their values
  //are the ones that actually end up getting sent to the browser.
  //Explicit [ $replace = true ] is used for these headers even though this is PHP's default behavior.
  //Allow access from any origin.
  header("Access-Control-Allow-Origin: *", true);
  header("Access-Control-Allow-Credentials: true", true);
  //Handle CORS headers received during OPTIONS requests.
  if ($_SERVER["REQUEST_METHOD"] == "OPTIONS") {
    if (isset($_SERVER["HTTP_ACCESS_CONTROL_REQUEST_METHOD"])) {
      header("Access-Control-Allow-Methods: GET, POST, OPTIONS", true);
    }
    if (isset($_SERVER["HTTP_ACCESS_CONTROL_REQUEST_HEADERS"])) {
      header("Access-Control-Allow-Headers: {$_SERVER['HTTP_ACCESS_CONTROL_REQUEST_HEADERS']}", true);
    }
    //No further action is needed for OPTIONS requests.
    exit(0);
  }
}
$contentType = "";
if (isset($responseInfo["content_type"])) $contentType = $responseInfo["content_type"];
//This is presumably a web page, so attempt to proxify the DOM.
if (stripos($contentType, "text/html") !== false) {
  //Attempt to normalize character encoding.
  $detectedEncoding = mb_detect_encoding($responseBody, "UTF-8, ISO-8859-1");
  if ($detectedEncoding) {
    $responseBody = mb_convert_encoding($responseBody, "HTML-ENTITIES", $detectedEncoding);
  }
  //Parse the DOM.
  $doc = new DomDocument();
  @$doc->loadHTML($responseBody);
  $xpath = new DOMXPath($doc);
  //Rewrite forms so that their actions point back to the proxy.
  foreach($xpath->query("//form") as $form) {
    $method = $form->getAttribute("method");
    $action = $form->getAttribute("action");
    //If the form doesn't have an action, the action is the page itself.
    //Otherwise, change an existing action to an absolute version.
    $action = empty($action) ? $url : rel2abs($action, $url);
    //Rewrite the form action to point back at the proxy.
    $form->setAttribute("action", rtrim(PROXY_PREFIX, "?"));
    //Add a hidden form field that the proxy can later use to retrieve the original form action.
    $actionInput = $doc->createDocumentFragment();
    $actionInput->appendXML('<input type="hidden" name="miniProxyFormAction" value="' . htmlspecialchars($action) . '" />');
    $form->appendChild($actionInput);
  }
  //Proxify <meta> tags with an 'http-equiv="refresh"' attribute.
  foreach ($xpath->query("//meta[@http-equiv]") as $element) {
    if (strcasecmp($element->getAttribute("http-equiv"), "refresh") === 0) {
      $content = $element->getAttribute("content");
      if (!empty($content)) {
        $splitContent = preg_split("/=/", $content);
        if (isset($splitContent[1])) {
          $element->setAttribute("content", $splitContent[0] . "=" . PROXY_PREFIX . rel2abs($splitContent[1], $url));
        }
      }
    }
  }
  //Proxify <style> tags.
  foreach($xpath->query("//style") as $style) {
    $style->nodeValue = proxifyCSS($style->nodeValue, $url);
  }
  //Proxify tags with a "style" attribute.
  foreach ($xpath->query("//*[@style]") as $element) {
    $element->setAttribute("style", proxifyCSS($element->getAttribute("style"), $url));
  }
  //Proxify "srcset" attributes in <img> tags.
  foreach ($xpath->query("//img[@srcset]") as $element) {
    $element->setAttribute("srcset", proxifySrcset($element->getAttribute("srcset"), $url));
  }
  //Proxify any of these attributes appearing in any tag.
  $proxifyAttributes = array("href", "src");
  foreach($proxifyAttributes as $attrName) {
    foreach($xpath->query("//*[@" . $attrName . "]") as $element) { //For every element with the given attribute...
      $attrContent = $element->getAttribute($attrName);
      if ($attrName == "href" && preg_match("/^(about|javascript|magnet|mailto):/i", $attrContent)) continue;
      $attrContent = rel2abs($attrContent, $url);
      $attrContent = PROXY_PREFIX . $attrContent;
      $element->setAttribute($attrName, $attrContent);
    }
  }
  //Attempt to force AJAX requests to be made through the proxy by
  //wrapping window.XMLHttpRequest.prototype.open in order to make
  //all request URLs absolute and point back to the proxy.
  //The rel2abs() JavaScript function serves the same purpose as the server-side one in this file,
  //but is used in the browser to ensure all AJAX request URLs are absolute and not relative.
  //Uses code from these sources:
  //http://stackoverflow.com/questions/7775767/javascript-overriding-xmlhttprequest-open
  //https://gist.github.com/1088850
  //TODO: This is obviously only useful for browsers that use XMLHttpRequest but
  //it's better than nothing.
  $head = $xpath->query("//head")->item(0);
  $body = $xpath->query("//body")->item(0);
  $prependElem = $head != NULL ? $head : $body;
  //Only bother trying to apply this hack if the DOM has a <head> or <body> element;
  //insert some JavaScript at the top of whichever is available first.
  //Protects against cases where the server sends a Content-Type of "text/html" when
  //what's coming back is most likely not actually HTML.
  //TODO: Do this check before attempting to do any sort of DOM parsing?
  if ($prependElem != NULL) {
    $scriptElem = $doc->createElement("script",
      '(function() {
        if (window.XMLHttpRequest) {
          function parseURI(url) {
            var m = String(url).replace(/^\s+|\s+$/g, "").match(/^([^:\/?#]+:)?(\/\/(?:[^:@]*(?::[^:@]*)?@)?(([^:\/?#]*)(?::(\d*))?))?([^?#]*)(\?[^#]*)?(#[\s\S]*)?/);
            // authority = "//" + user + ":" + pass "@" + hostname + ":" port
            return (m ? {
              href : m[0] || "",
              protocol : m[1] || "",
              authority: m[2] || "",
              host : m[3] || "",
              hostname : m[4] || "",
              port : m[5] || "",
              pathname : m[6] || "",
              search : m[7] || "",
              hash : m[8] || ""
            } : null);
          }
          function rel2abs(base, href) { // RFC 3986
            function removeDotSegments(input) {
              var output = [];
              input.replace(/^(\.\.?(\/|$))+/, "")
                .replace(/\/(\.(\/|$))+/g, "/")
                .replace(/\/\.\.$/, "/../")
                .replace(/\/?[^\/]*/g, function (p) {
                  if (p === "/..") {
                    output.pop();
                  } else {
                    output.push(p);
                  }
                });
              return output.join("").replace(/^\//, input.charAt(0) === "/" ? "/" : "");
            }
            href = parseURI(href || "");
            base = parseURI(base || "");
            return !href || !base ? null : (href.protocol || base.protocol) +
            (href.protocol || href.authority ? href.authority : base.authority) +
            removeDotSegments(href.protocol || href.authority || href.pathname.charAt(0) === "/" ? href.pathname : (href.pathname ? ((base.authority && !base.pathname ? "/" : "") + base.pathname.slice(0, base.pathname.lastIndexOf("/") + 1) + href.pathname) : base.pathname)) +
            (href.protocol || href.authority || href.pathname ? href.search : (href.search || base.search)) +
            href.hash;
          }
          var proxied = window.XMLHttpRequest.prototype.open;
          window.XMLHttpRequest.prototype.open = function() {
              if (arguments[1] !== null && arguments[1] !== undefined) {
                var url = arguments[1];
                url = rel2abs("' . $url . '", url);
                url = "' . PROXY_PREFIX . '" + url;
                arguments[1] = url;
              }
              return proxied.apply(this, [].slice.call(arguments));
          };
        }
      })();'
    );
    $scriptElem->setAttribute("type", "text/javascript");
    $prependElem->insertBefore($scriptElem, $prependElem->firstChild);
  }
  echo "<!-- Proxified page constructed by miniProxy -->\n" . $doc->saveHTML();
} else if (stripos($contentType, "text/css") !== false) { //This is CSS, so proxify url() references.
  echo proxifyCSS($responseBody, $url);
} else { //This isn't a web page or CSS, so serve unmodified through the proxy with the correct headers (images, JavaScript, etc.)
  header("Content-Length: " . strlen($responseBody), true);
  echo $responseBody;
}

Test the proxy script by navigating over to: C:\Xamp\htdocs\PROXY\test.php

What do you see ?
You see a page that looks like this:

Welcome to miniProxy!

miniProxy can be directly invoked like this: http://localhost:80/proxy/test.php?https://example.net

Or, you can simply enter a URL below:

Type the following url to proxify it:
http://localhost:80/proxy/test.php?http://probloggerbook.com/

Now, hover your mouse over the “About the Book” link but don’t click it. What url gets shown ?
It gets shown as:
http://localhost:80/proxy/test.php?http://probloggerbook.com/about/

Note that the original url is:
http://probloggerbook.com/

But the link is shown to you with http://localhost:80/proxy/test.php? prepended.

In other words, no matter what url you view via the proxy, it gets proxified by prepending “http://localhost:80/proxy/test.php?” to the url. Like so:

http://localhost:80/proxy/test.php?http://probloggerbook.com/
http://localhost:80/proxy/test.php?http://ebay.com/

The sites you are viewing neither host your http://localhost:80/proxy/test.php? links nor prepend that prefix; your proxy does it.
Now, as you can see, the proxy prepends “http://localhost:80/proxy/test.php?” in order to proxify your chosen url. My idea is to get it to prepend “http://localhost:80/proxy/tracker.php?” instead. That way, every link points at our tracker page first, so the tracker can log the url the user clicked (on a foreign domain) before forwarding the user to the proxified page http://localhost:80/proxy/test.php?somedomain.com.

Now, can you figure out which part of the code to replace with what to get the script to start logging ?
I mean, which part of the script do we change so that it no longer prepends http://localhost:80/proxy/test.php?
but prepends instead:
http://localhost:80/proxy/tracker.php?
that redirects the user to:
http://localhost:80/proxy/test.php?somedomain.com

In short, the script should not directly forward the user to the proxified page (the page user wants to view) but forward him to the tracker page in the middle as a doorway page before redirecting to the final destination.
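The doorway page described above could look roughly like the sketch below. It is an assumption-laden illustration, not part of miniProxy: the file name tracker.php, the clicks.log file, and the two helper functions are invented here, and it assumes tracker.php sits in the same folder as the proxy script test.php.

```php
<?php
// Hypothetical tracker.php, assumed to live next to the proxy script (test.php).
// Links would look like: tracker.php?http://probloggerbook.com/about/

// Build one log line: timestamp, visitor address as seen by this server, clicked url.
function formatLogLine(string $time, string $ip, string $url): string
{
    // Strip line breaks so a single click cannot forge extra log lines.
    $clean = str_replace(["\r", "\n"], "", $url);
    return $time . "\t" . $ip . "\t" . $clean . "\n";
}

// Build the address of the proxified page the visitor is forwarded to.
function proxyLocation(string $proxyPrefix, string $url): string
{
    return $proxyPrefix . $url;
}

// Handle a real request only when running under a web server.
if (PHP_SAPI !== "cli" && !empty($_SERVER["QUERY_STRING"])) {
    $target = $_SERVER["QUERY_STRING"]; // everything after "?" is the clicked url
    if (preg_match("#^https?://#i", $target)) {
        $line = formatLogLine(date("c"), $_SERVER["REMOTE_ADDR"] ?? "-", $target);
        file_put_contents(__DIR__ . "/clicks.log", $line, FILE_APPEND | LOCK_EX);
        // Relative redirect: the browser resolves it against tracker.php's own folder.
        header("Location: " . proxyLocation("test.php?", $target), true, 302);
        exit;
    }
    http_response_code(400);
    echo "Unsupported url";
}
```

On the proxy side, the prefix is added in the loop over the "href" and "src" attributes, on the line `$attrContent = PROXY_PREFIX . $attrContent;`. One plausible change is to use a tracker prefix there only for `href` attributes on `<a>` elements, and leave `src` and stylesheet links on PROXY_PREFIX, since those are fetched automatically by the browser rather than clicked.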

Anyone is welcome to reply!

mlukac89 · May 14, 2017, 4:07pm

WTF why you asking questions then lol

??

uniqueideaman · May 15, 2017, 12:15pm

That reply was to Space Ship Trooper who was objecting that a proxy cannot track internal pages of third party domains. I asked him to follow the steps I gave him then he can see things clearly for himself that it is possible and with php.

spaceshiptrooper · May 15, 2017, 12:23pm

No. It isn’t possible as I have said in my last post. It IS NOT possible because you spoof the user’s IP Address with cURL’s IP Address so all you’re going to get for 20 different users is 20 of the same IP Address or whatever the IP that cURL is currently using. cURL itself acts like a regular user so it uses its own IP Address. You can try it for yourself.

Create a tracker that stores IP Addresses
Then go to that page without using cURL
Then go to that page using cURL
Try the same method on a different network such as going to the library

You basically get 2 different IP Addresses. And cURL can use different IP Addresses at times. So tracking and using a “proxy” won’t work at all because you’re going to be using cURL’s IP Address and not the original user’s. This goes for file_get_contents as well.
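
For anyone who wants to run the experiment above, a minimal sketch of a tracker that stores IP addresses might look like this. The file name `visitors.log` and the helper `visitor_record` are assumptions for illustration. Note that `$_SERVER['REMOTE_ADDR']` holds the address of whichever machine opened the connection to this script: the visitor's browser when the page is opened directly, or the machine running cURL when the page is fetched that way.

```php
<?php
// Format one log entry from the request's server variables.
function visitor_record(array $server): string
{
    $ip = $server['REMOTE_ADDR'] ?? 'unknown';
    $agent = $server['HTTP_USER_AGENT'] ?? '-';
    return $ip . "\t" . $agent;
}

// When served over HTTP, append the entry to a log file next to this script.
if (PHP_SAPI !== 'cli') {
    $line = date('c') . "\t" . visitor_record($_SERVER) . "\n";
    file_put_contents(__DIR__ . '/visitors.log', $line, FILE_APPEND | LOCK_EX);
    echo 'Logged';
}
```

Comparing the log lines after each step shows which address the script saw for each kind of request.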

Gandalf · May 15, 2017, 12:39pm

[OFF-TOPIC]

I’m surprised this works. As the script says, the way to run it is normally
http://localhost/proxy/test.php
[\OFF-TOPIC]

uniqueideaman · May 15, 2017, 12:48pm

Don’t understand what you mean.
Since it’s a one page script, no matter what path you put it in, as long as you point your browser to that specific page, regardless of the path, it should work.

uniqueideaman · May 15, 2017, 1:03pm

I can try your method once I’ve got a little more experience with cURL. YouTube keeps bringing up outdated tutorials whose code doesn’t work or which are too brief. Otherwise, I would have attempted building my own proxy with cURL by now and not bothered with third party ones, which seem too complicated for me to fiddle with and tailor to my needs. A programmer suggested a text tutorial on cURL, but again it was too brief.
Going hunting now for the right tutorial.

As for your final paragraph. I didn’t understand it.

I’d like to add that I really have no intention of running a proxy to allow others to surf the net anonymously. Criminals would download illegal stuff and probably do other illegal things and get me in trouble.
I looked into the proxy because it had some functions I wanted. It would save me from coding everything from top to bottom. That was the plan.
As you have figured by now, I just want to give my users an iframe/frame so they can browse the net with it. I just want to log what they’re browsing. Pretty much like TeamViewer, where you can see from your end what’s on my screen. The only difference would be that you won’t control my computer, while with a TeamViewer session you can.
I come across lots of good text readings and videos. I’m sick and tired of copying & pasting their links into forums or email messages so others can view what I am viewing or have viewed. Not all pages have a TWEET or LIKE or SHARE button.
With my project, I’ll just browse the net in an iframe/frame/proxy and the tracker will log every page I visit, and others can view on their end what I’m viewing without me having to hit the LIKE/SHARE/TWEET button. Others like family, friends and new buddies (like yourselves). Of course, I have another goal in all this, but that would go off-topic and so I won’t mention it now.

I have one question though. What is your main point? I probably would have understood you better had I managed to understand your final paragraph, but going through it a few times has got me nowhere.
Are you saying that when I track a proxy user, that will defeat the purpose of him using a proxy?
Don’t worry about that. Like I said, the whole proxy issue is not about letting users surf the net anonymously. It’s actually for tracking purposes.

PS - That cURL uses its own ip is indeed news to me. I thought it would use the web server’s ip. The proxy’s, I mean. And since cURL would be part of the proxy, its ip and the server’s ip should be the same. Anyway, good if another ip appears out of the blue. Might make use of it one day.

uniqueideaman · May 15, 2017, 9:25pm

Trooper my man,

I managed to do it, with a little hiccup here and there. Do you want to have a look at the script? I’m glad that you can learn a little something from me and my findings for a change, after all the contributions this forum has made to me so far.

In short, 2 nights ago, I said to myself, instead of redirecting the user from the proxified page to the tracker and then back to the proxified page, why don’t I just add the tracker code to the proxy php file? That way, there is no need for 2 files, the proxy page and the tracker page.
Anyway, I found a programmer who confirmed it can be done. He suggested what I was thinking of doing (adding the tracker code to the main proxy code file) and showed me where to make the changes. Doing it worked. He said that if I try redirecting the user from the proxified page to the tracker page, the execution would be complete and there would be no way to redirect the user back to the proxified page. Therefore, his suggestion was to just add the tracking code to the same file where the proxy code resides. In fact, the original script was one page anyway.
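
The actual changes were not posted, so the following is only a guess at the shape of the single-file approach: a few lines near the top of the proxy file that record the requested url before the existing proxy code runs. The names `requested_url`, `track_visit` and `visits.log` are hypothetical.

```php
<?php
// Hypothetical addition near the top of the proxy file (test.php in this thread).

// Pull the requested URL out of a query string such as "http://example.com/page".
function requested_url(string $queryString): ?string
{
    $url = trim($queryString);
    return preg_match('/^https?:\/\//i', $url) === 1 ? $url : null;
}

// Record the visit before the proxy fetches and prints the page.
function track_visit(string $logFile, string $url): void
{
    file_put_contents($logFile, date('c') . "\t" . $url . "\n", FILE_APPEND | LOCK_EX);
}

$url = requested_url($_SERVER['QUERY_STRING'] ?? '');
if ($url !== null) {
    track_visit(__DIR__ . '/visits.log', $url);
}
// ...the existing proxy code would continue from here and serve the page as before.
```

Since every link on a proxified page already points back at the same file, each click passes through this logging step without any extra redirect.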

Anyone else interested in seeing the script? It’s not perfect, but it does what programmers in 2 forums (including this one) were saying is impossible. With help from a 3rd forum, I achieved what experienced guys were saying is impossible. Now, isn’t that something?
If you learn what I’m up to with the script, you’ll agree I live up to my username “Unique Idea Man”. Now, I’m thinking of changing it to “Mr Possible”. Lol!
Btw, is anyone interested to learn what I’m up to? Just don’t pull your hair out saying “Why didn’t I think of that?”. Lol!

spaceshiptrooper · May 16, 2017, 12:52am

I like your optimism, but here’s the thing. Whoever told you that it’s possible is lying through their teeth and whatever they showed you probably is what I already was talking about. And here’s the big reason why what you are proposing WILL NOT work.

It violates the internet laws

What does that mean? It basically means that what you are proposing is to be able to inject any cookie or session or tracker on ANY website as you see fit. This will NOT work.

And just in case you say that “it’s not what you were talking about”. Nope. THAT IS EXACTLY WHAT YOU ARE TALKING ABOUT. Because imagine you being able to track people’s activities on other sites that you don’t own. What does that tell you? It means that the internet is broken in such a way that there is no need for anti-virus or firewalls anymore. There WOULD BE NO point in having any protection from anyone. That is why it WILL NOT work.

You can try it for yourself if you want. Try doing what you are suggesting on SitePoint. I will bet you all my money in my bank which is $590 USD that it is NOT possible. Here, I’ll give you the last 4 digits of my credit card too if you want to bet because it is NOT possible.

Another reason why is because the internet is built in a way where you simply can’t just set a cookie or session on someone else’s domain or website.
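
As a rough illustration of that cookie rule, here is a simplified sketch of the domain check browsers apply (loosely based on the domain-match rule in RFC 6265). The function name `cookie_domain_allowed` is made up, and the sketch ignores details such as IP-address hosts and the public suffix list that real browsers also consider.

```php
<?php
// Simplified domain-match: may a response from $requestHost set a cookie for $cookieDomain?
function cookie_domain_allowed(string $requestHost, string $cookieDomain): bool
{
    $host = strtolower($requestHost);
    $domain = strtolower(ltrim($cookieDomain, '.'));
    if ($domain === '') {
        return false;
    }
    if ($host === $domain) {
        return true;
    }
    // A host may set a cookie for a parent domain, e.g. www.example.com for example.com.
    $suffix = '.' . $domain;
    return substr($host, -strlen($suffix)) === $suffix;
}

var_dump(cookie_domain_allowed('www.sitepoint.com', 'sitepoint.com')); // bool(true)
var_dump(cookie_domain_allowed('localhost', 'sitepoint.com'));         // bool(false)
```

So a page served from `localhost` can only set cookies for `localhost`, whatever site's content it happens to be displaying.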

Here are A LOT of links that prove why it isn’t possible. Mind you, they’re about setting cookies and sessions, but they relate closely to the topic at hand, which is allowing tracking on sites that you don’t own.

https://www.en.advertisercommunity.com/t5/Google-Analytics-Account-Access/Creating-An-Account-To-Track-Sites-I-Don-t-Own/td-p/511895

https://www.quora.com/How-do-I-check-the-webpage-that-I-dont-own-through-Google-Analytics-Is-there-any-other-alternative

Basically saying, if you DON’T own those sites, it is IMPOSSIBLE to do what you are suggesting without asking permission from those sites.
