SitePoint Sponsor

User Tag List

Results 1 to 9 of 9
  1. #1
    SitePoint Member
    Join Date
    Jan 2006
    Posts
    13
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Cool Help with complex SELECT query on duplicates

    Please need help with following problem. In my table I have entries such as:

    initial name address postcode
    I SMITH Frogs Lane TN52 2RE
    J SMITH Frogs House TN52 2RE
    T SMITH Rose Cottage CT23 5DE
    F JONES Tally Ho Road TN52 2RE


    Firstly, I need to count how many entries there are with the same name AND the same postcode. For the moment I can only count the number of entries with the same name.

    $sql = mysql_query("select name, count(*) as Num from profile group by name having count(*) > 1");

    Secondly, I need to select entries with the same name AND the same postcode resulting in:

    I SMITH Frogs Lane TN52 2RE
    J SMITH Frogs House TN52 2RE

    This selects only duplicates with the same name:

    $sql="SELECT id, title, add_1, add_2, add_3, add_4, tel, tel2, email, mail, emailing, pc, initial, profile.name FROM profile INNER JOIN (SELECT name FROM profile GROUP BY name HAVING COUNT(id) > 1) dup ON profile.name = dup.name ORDER BY name ASC";

    Would very much appreciate someone's help.

  2. #2
    From Italy with love silver trophybronze trophy
    guido2004's Avatar
    Join Date
    Sep 2004
    Posts
    9,495
    Mentioned
    161 Post(s)
    Tagged
    4 Thread(s)
    Let's put these queries in a more readable format:

    Query 1:
    Code:
    SELECT 
        name
      , COUNT(*) AS Num 
    FROM profile 
    GROUP BY name 
    HAVING count(*) > 1
    Query 2:
    Code:
    SELECT 
        id
      , title
      , add_1
      , add_2
      , add_3
      , add_4
      , tel
      , tel2
      , email 
      , mail
      , emailing
      , pc
      , initial
      , profile.name 
    FROM profile 
    INNER JOIN 
      (SELECT 
           name 
       FROM profile 
       GROUP BY name 
       HAVING COUNT(id) > 1
    ) AS dup 
    ON profile.name = dup.name 
    ORDER BY name ASC
    If you look closely at these two queries, you'll see that resolving the first one, you'll also have resolved the second (the subquery in the second query = the first query).

    Do you understand how the GROUP BY works? Why does query 1 give you the duplicate names?

  3. #3
    SitePoint Enthusiast
    Join Date
    Jul 2010
    Posts
    42
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hi rvlach,
    you can do like given code
    SELECT * FROM table_name group by name having postcode like postcode
    hope that it will help you

  4. #4
    SitePoint Member
    Join Date
    Jan 2006
    Posts
    13
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Yes I'm no pro when it comes to more complex queries other than the simple SELECT, UPDATE and DELET and I do not really understand how GROUP BY works.

    This is a sample code I found searching the web and works for counting and selecting duplicates on the name only but this may probably be done some other way.

  5. #5
    From Italy with love silver trophybronze trophy
    guido2004's Avatar
    Join Date
    Sep 2004
    Posts
    9,495
    Mentioned
    161 Post(s)
    Tagged
    4 Thread(s)
    As the name says, it groups rows based on column values, and then you can do certain operations (like SUM, COUNT, MIN, MAX) on other columns (MySQL :: MySQL 5.0 Reference Manual :: 11.15.1 GROUP BY (Aggregate) Functions).

    So in the case of your query 1, grouping by Name, and using the function COUNT(*) what you get is for each name the number of rows found with that name.

    Combining this with the HAVING clause, in the end you'll extract only those names that occur more than once in the table.

    If you want to do the same for duplicates of Name+postcode, then all you have to do is add the postcode column to the SELECT and GROUP BY:
    Code:
    SELECT 
        name
      , postcode
      , COUNT(*) AS Num 
    FROM profile 
    GROUP BY 
        name 
      , postcode
    HAVING count(*) > 1
    This will group all rows with the same name+postcode, count the number of rows for each combination, and return only those name+postcode combinations that are present in more than 1 row.

  6. #6
    SitePoint Member
    Join Date
    Jan 2006
    Posts
    13
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    guido,

    Thanks for your help . It all seems so easy when explained correctly. All records are displaying as they should.

  7. #7
    SitePoint Member
    Join Date
    Jan 2006
    Posts
    13
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Guido,

    I thought it was solved but looking again I see that it's not quite what is required.

    For the moment for each duplicate I have only have one record displayed. But I need all duplicates to be displayed as a manual analysis has to be made to determine whether to link the records together as they may be same family members so purchases made can be grouped under one entry. For example I need to display both

    I SMITH Frogs Lane TN52 2RE
    J SMITH Frogs Lane TN52 2RE

    as these may be 2 families living at the same property. Unfortunately no code can do this for us as only we can determine that.

    How would I go about that.

  8. #8
    From Italy with love silver trophybronze trophy
    guido2004's Avatar
    Join Date
    Sep 2004
    Posts
    9,495
    Mentioned
    161 Post(s)
    Tagged
    4 Thread(s)
    That's where you have to modify query 2:
    Code:
    SELECT 
        id
      , title
      , add_1
      , add_2
      , add_3
      , add_4
      , tel
      , tel2
      , email 
      , mail
      , emailing
      , pc
      , initial
      , profile.name 
      , profile.postcode
    FROM profile 
    INNER JOIN 
      (SELECT 
           name
         , postcode 
       FROM profile 
       GROUP BY 
           name 
         , postcode
       HAVING COUNT(id) > 1
    ) AS dup 
    ON  profile.name = dup.name 
    AND profile.postcode = dup.postcode 
    ORDER BY name ASC

  9. #9
    SitePoint Member
    Join Date
    Jan 2006
    Posts
    13
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Brilliant. Thanks Guido. All is as it should be now.


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •