SitePoint Sponsor

User Tag List

Results 1 to 3 of 3
  1. #1
    SitePoint Enthusiast
    Join Date
    Jan 2008
    Posts
    27
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    SQL that mimics Google's "Searches related to" SQL

    Since someone helped me on this forum, I thought I would give a little something back. My design might not be the greatest, alternative designs would also be appreciated

    If you do a Google search and notice the links at the bottom "Searches related to", this design try's to do a similar thing:

    Searches related to Design

    "searches related to" - relevant search terms:

    What my design really does, is return the most common queries that other users were searching (1500 seconds before and after) at the time they searched the same term q

    Design:
    $q = php to get the users query (make sure you have made this safe)
    $ip = php to get the users ip (also, make safe)


    Log all queries that users make, log the query term (query), log the users ip address (ip) and log the timestamp (ts). I also chuck away duplicates queried (q) from the same person (same ip) within the same hour (ts > ( NOW() -3600 )

    Code:
    "INSERT INTO searchqueries(query, ts, IP)". 
    		" SELECT '".$q."', NOW(), '".$ip."'". 
    		" FROM DUAL". 
    		" WHERE NOT EXISTS".
    		" (SELECT query, ts, IP FROM searchqueries WHERE".
    		" (query = '".$q."' AND ts > ( NOW() -3600 ) AND IP = '".$ip."')". 
    		" ) LIMIT 1";
    Next, when a user searches for terms q, find all the ip addresses that have searched the same q:

    ::SQL1:: select ip as q_ip, ts as q_ts from searchqueries where query = ".$q.";

    for each of these ips get all of the queries (around the hour at the time the user queried q) for each q_ip (and q_ts)

    ::SQL2:: select query as relatedqueries from searchqueries where ip = q_ip and (ts > (q_ts - 1500)) and (ts < (q_ts + 1500))


    combining the ideas of SQL1and SQL2
    ::SQL1+2:: select rq.query as relatedqueries, q_ip, q_ts from searchqueries rq, (select ip as q_ip, ts as q_ts from searchqueries where query = "Your Query") as t where ip = q_ip and (ts > (q_ts - 1500)) and (ts < (q_ts + 1500))

    now we need to count all of the relatedqueries, order by the most common then return only the top 11

    ::SQL3:: SELECT relatedqueries, count(relatedqueries) as c FROM ::SQL1+2:: group by query order by c desc, Limit 0, 11

    now combine the queries into a another single SQL statement


    Code:
    SELECT relatedqueries, count( relatedqueries ) AS c
    FROM (
    		SELECT rq.query AS relatedqueries, q_ip, q_ts
    		FROM searchqueries rq, (
    			SELECT ip AS q_ip, ts AS q_ts
    			FROM searchqueries
    			WHERE query = '".$q."'
    		) AS t
    		WHERE ip = q_ip
    		AND (
    			ts > ( q_ts -8500 )
    		)
    		AND (
    			ts < ( q_ts +8500 )
    		)
    	) AS x
    GROUP BY relatedqueries
    ORDER BY c DESC
    LIMIT 0 , 11

    You might want to check the returning matches, and chuck away the one that is an exact match of q (unless you want it)
    So we now have 2 SQL statements above, 1 to insert the users query, another to find the relevant matches

    You will now have a list (potentially up to 11 matches) of "searches related to", or "searches that other people were searching when also searching this query". This design should get more accurate the more people use it, the beauty is that you don't have to put in your own data (people are making the related queries every time they query more than once within the same couple of hours).

    Obviously, you can go into the DB and tweak things / add rows to begin with, but I haven't found much of a need to do this so far, it seems to be working well. Something I might add is a spell checker, and then dont add and any wrongly spelt queries / profanities (but this will be done with server side php code)

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,021
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    tl;dr

    sorry, but that's how it is on a sunday afternoon
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Enthusiast
    Join Date
    Jan 2008
    Posts
    27
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    yup, I guess there's a bit much there for a single post

    A quick example is the "Related Search Terms" found on my search page:
    rightsfortenants.co.uk/search.php?q=Rats
    (not a live link)

    - All these related links were added by the people searching for Rats (and anything else they may have been searching at that time) without them knowing
    - The above SQL shows you how to make the "Related Search Terms" section (and the data creates its self from other peoples search queries). Everything you need to know SQL side for this code is in the above post.


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •