SitePoint Sponsor

User Tag List

Page 1 of 2 12 LastLast
Results 1 to 25 of 44
  1. #1
    SitePoint Wizard tgavin's Avatar
    Join Date
    Feb 2003
    Location
    FL
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Different Duplicate Issue

    I wasn't thinking when designing the database and forgot to apply UNIQUE to a couple fields (field1, field2). Of course, after a couple thousand records it occurs to me and I want to fix it. However, trying to add UNIQUE at this point gives me the old #1062 duplicate entry error.

    The problem is, that some of the records have no value in these fields, and some do, so this query isn't doing the trick.

    Code MySQL:
    SELECT field1, COUNT(*)
    FROM table
    GROUP BY field1
    HAVING COUNT(*) > 1

    How do I find the records that have the duplicate(s)?

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,016
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by tgavin View Post
    How do I find the records that have the duplicate(s)?
    Code:
    SELECT *
      FROM daTable
     WHERE field1 IN
           ( SELECT field1
               FROM daTable
             GROUP 
                 BY field1
             HAVING COUNT(*) > 1 )
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Wizard tgavin's Avatar
    Join Date
    Feb 2003
    Location
    FL
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thank you. However, that showed all of the records with the empty field. Just the opposite of what I'm looking for.

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,016
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    sorry, i thought it would show all the rows with duplicates

    are the empty fields actually empty, or are they NULL?

    so you're interested in the non-empty and/or non-NULL duplicates only?
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #5
    SitePoint Wizard tgavin's Avatar
    Join Date
    Feb 2003
    Location
    FL
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    are the empty fields actually empty, or are they NULL?
    The row is NOT NULL, and nothing was inserted, so they should be empty.

    Quote Originally Posted by r937 View Post
    so you're interested in the non-empty and/or non-NULL duplicates only?
    Yes.

  6. #6
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,016
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Code:
    SELECT *
      FROM daTable
     WHERE field1 IN
           ( SELECT field1
               FROM daTable
            WHERE field1 <> ''
             GROUP 
                 BY field1
             HAVING COUNT(*) > 1 )
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  7. #7
    SitePoint Wizard tgavin's Avatar
    Join Date
    Feb 2003
    Location
    FL
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Wonderful! So, I ran that on both fields and had zero results. I'm assuming that there are no duplicates. Correct? If that's the case, why can't I convert those fields to UNIQUE without receiving an duplicate entry error?

  8. #8
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,016
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    make a copy of the table

    on the copy, use ALTER IGNORE TABLE ADD UNIQUE

    this will remove the duplicates

    now compare the copy to your original

    this will highlight all the rows that were removed because of duplicates

    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  9. #9
    SitePoint Wizard tgavin's Avatar
    Join Date
    Feb 2003
    Location
    FL
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    ALTER IGNORE TABLE copy ADD UNIQUE KEY (field1)

    removed 2 rows. What's the best way to compare the two tables to find out which ones?

  10. #10
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,016
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Code:
    SELECT live.*
      FROM live
    LEFT OUTER
      JOIN copy
        ON copy.pkey = live.pkey
     WHERE copy.pkey IS NULL
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  11. #11
    SitePoint Wizard tgavin's Avatar
    Join Date
    Feb 2003
    Location
    FL
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    SELECT live.*
    FROM live
    LEFT OUTER
    JOIN copy
    ON copy.uid = live.uid
    WHERE copy.uid IS NULL

    That was kind of the reverse of what I was looking for. It showed me all of the users where field1 is empty. I tried IS NOT NULL and received 304 users. Not the 2 that were omitted.


    Tried this too...
    SELECT a.uid
    FROM live a
    INNER JOIN copy b ON a.uid = b.uid
    WHERE a.field1 <> b.field1

  12. #12
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,016
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    the sql i gave you is bulletproof -- you just have to use the right columns on the join

    what are the real names of your table's primary key, and the columns that you wanted the UNIQUE applied to?
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  13. #13
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,016
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    FYI in mysql you can use the first three of these 5 ways -- http://sqlblog.com/blogs/denis_gobo/...2/20/5169.aspx

    the OUTER JOIN is the one i suggested
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  14. #14
    SitePoint Wizard tgavin's Avatar
    Join Date
    Feb 2003
    Location
    FL
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Here's something 'funny'. I exported the fields to a text file and then opened it in BBEdit. I ran the 'process duplicate lines' filter on the data and came up with nothing. Do I even have duplicates? It doesn't look that way now, but I'm not sure because mysql won't let me change the fields to UNIQUE.

    Here's the table

    CREATE TABLE IF NOT EXISTS `users` (
    `uid` bigint(20) unsigned NOT NULL auto_increment,
    `token` varchar(32) collate utf8_unicode_ci NOT NULL default '',
    `key` varchar(32) collate utf8_unicode_ci NOT NULL default '',
    PRIMARY KEY (`uid`),
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=2008 ;

    The fields 'token' and 'key' are the two that I want to make UNIQUE.

  15. #15
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,016
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    KEY for sure is a reserved word, TOKEN probably isn't

    if you had mentioned the real column names at the beginning, instead of using the rather lame -- no offence -- "field1", then we could have saved ourselves a lot of time

    did you want separate UNIQUE keys on `token` and `key`, or a composite?
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  16. #16
    SitePoint Wizard tgavin's Avatar
    Join Date
    Feb 2003
    Location
    FL
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Sorry! In my haste to get you the table name I did a find/replace in usr_ and it removed those from the field names! I need to slow down...

    The correct field names are
    uid
    usr_token
    usr_key

    usr_token and usr_key both need to be UNIQUE.

  17. #17
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,016
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by tgavin View Post
    usr_token and usr_key both need to be UNIQUE.
    individually or together as a composite key?
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  18. #18
    SitePoint Wizard tgavin's Avatar
    Join Date
    Feb 2003
    Location
    FL
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Individually please. I have no idea what a composite key is.

  19. #19
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,016
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    could you please do an actual SHOW CREATE TABLE for me, thanks
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  20. #20
    SitePoint Wizard tgavin's Avatar
    Join Date
    Feb 2003
    Location
    FL
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Code:
    CREATE TABLE IF NOT EXISTS `users` (
    `uid` bigint(20) unsigned NOT NULL auto_increment,
    `usr_fname` varchar(30) collate utf8_unicode_ci NOT NULL default '',
    `usr_lname` varchar(30) collate utf8_unicode_ci NOT NULL default '',
    `usr_email` varchar(60) collate utf8_unicode_ci NOT NULL default '',
    `usr_uname` varchar(60) collate utf8_unicode_ci NOT NULL default '',
    `usr_passwd` varchar(32) collate utf8_unicode_ci NOT NULL default '',
    `usr_website` varchar(255) collate utf8_unicode_ci NOT NULL default '',
    `usr_ip_address` varchar(60) collate utf8_unicode_ci NOT NULL default '',
    `usr_token` varchar(32) collate utf8_unicode_ci NOT NULL default '',
    `usr_key` varchar(32) collate utf8_unicode_ci NOT NULL default '',
    `usr_verified` char(1) collate utf8_unicode_ci NOT NULL default '0',
    `usr_access_level` char(1) collate utf8_unicode_ci NOT NULL default 'u',
    `usr_date_added` datetime NOT NULL default '0000-00-00 00:00:00',
    PRIMARY KEY  (`uid`),
    UNIQUE KEY `usr_uname` (`usr_uname`),
    UNIQUE KEY `usr_email` (`usr_email`)
    ) ENGINE=MyISAM  DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=2008 ;

  21. #21
    SitePoint Wizard guelphdad's Avatar
    Join Date
    Oct 2003
    Location
    St. Catharines, ON Canada
    Posts
    1,706
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    a composite key is where the key spans more than one column.
    if you had separate unique keys on firstname and lastname columns you could only have one Ken or one Lake but if the unique key was on both columns you could have Ken Brown and Ken Lake for instance, which wouldn't be allowed if the unique key was on firstname alone.

  22. #22
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,016
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    aha, it seems you have no NULLs, but might have a lot of "empty" fields

    if there is more than one, UNIQUE will not work

    however, if they were NULL, then you could have as many NULLs in the UNIQUE column as you wish


    what was it you were trying to do originally again?

    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  23. #23
    SitePoint Wizard tgavin's Avatar
    Join Date
    Feb 2003
    Location
    FL
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    what was it you were trying to do originally again?


    When designing the users table, I failed to make usr_token and usr_key UNIQUE. I now want to do that. However, I have over 2,000 rows in the users table. When trying to convert the fields to UNIQUE I receive the mysql #1062 duplicate entry error.

    I want to make usr_token and usr_key UNIQUE without destroying records.

  24. #24
    SitePoint Wizard tgavin's Avatar
    Join Date
    Feb 2003
    Location
    FL
    Posts
    1,051
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by guelphdad View Post
    a composite key is where the key spans more than one column.
    if you had separate unique keys on firstname and lastname columns you could only have one Ken or one Lake but if the unique key was on both columns you could have Ken Brown and Ken Lake for instance, which wouldn't be allowed if the unique key was on firstname alone.
    Thank you.

  25. #25
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,016
    Mentioned
    53 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by tgavin View Post
    I want to make usr_token and usr_key UNIQUE without destroying records.
    this will be somewhat of a problem -- what if two rows have the same value? one of them will need to change, or, alternatively, to be deleted

    you gots to destroy something

    please tell me what you get for this query --
    Code:
    SELECT usr_token
         , COUNT(*) AS dupes
      FROM users
    GROUP
        BY usr_token
    HAVING COUNT(*) > 1
    ORDER
        BY dupes DESC LIMIT 1
    r937.com | rudy.ca | Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •