SitePoint Sponsor

User Tag List

Results 1 to 4 of 4
  1. #1
    SitePoint Addict mcrumlish's Avatar
    Join Date
    Jan 2002
    Posts
    384
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    problem eliminating duplicates from mySQL table

    Hi,

    I have a project whereby I have been fiven 2 CSV files, each with 1 column. These CSV files are email addresses. I have loaded both of these into a table and I thought to get the uniques I would just need to use "SELECT DISTINCT email FROM theTable"

    There are 91643 rows of data, after I do the SELECT DISTINCT I still get 91643 rows returned, even though there are definitel duplicate entries.

    Does anyone know an easy way I can solve this?

    Thanks,
    Martin

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,260
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    how many rows were in each of the two csv files?

    what was the entire sql statement you used?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Addict mcrumlish's Avatar
    Join Date
    Jan 2002
    Posts
    384
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hi,

    The 1st CSV had approx 26108 rows, the second had 65535

    The aim of this is to create a 3rd CSV where any of the email addresses that already exist in the 1st CSV are removed from the 2nd CSV.

    The query I used was just:

    "SELECT DISTINCT email FROM data1"

    The database is mySQL version 3.23.49

    Thanks,
    Martin

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,260
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    hmmm, weird

    i'm pretty sure SELECT DISTINCT works

    there's gotta be something about your data that's causing it to make you think the emails are all unique

    do this:
    Code:
    select count(*) as rows
         , count(distinct email) as emails
      from yourtable
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •