SitePoint Sponsor

User Tag List

Page 2 of 3 FirstFirst 123 LastLast
Results 26 to 50 of 57
  1. #26
    It's been real... Forbes's Avatar
    Join Date
    Dec 2004
    Location
    Yorkshire, England
    Posts
    676
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    not primary keys (plural), because a table can have only one primary key -- but it can be a composite key, consisting of more than one column
    Ah, OK.

    I tried bookmark_id and tag in the tags table, but got an error:
    #1170 - BLOB/TEXT column 'tag' used in key specification without a key length
    Incidentally, tag is already an index of type FULLTEXT; used in the search function of the website.

    BTW I bought your book "Simply SQL". I thought it was the least I could do, given all of the wonderful help you've provided me with, not just here, but at least two or three times previously.

  2. #27
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,220
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    i'm not sure what kind of tags you anticipate having, but i will guess that the longest one is not going to reach 100 characters, never mind the 65K that a TEXT column reserves

    and i doubt fulltext searching is necessary
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #28
    It's been real... Forbes's Avatar
    Join Date
    Dec 2004
    Location
    Yorkshire, England
    Posts
    676
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    i'm not sure what kind of tags you anticipate having, but i will guess that the longest one is not going to reach 100 characters, never mind the 65K that a TEXT column reserves

    and i doubt fulltext searching is necessary
    I don't want to impose limits on tag length as I'm using comma-delimited tagging, not the silly space-delimited, single word style.

    So in some cases, the tags are much longer that 100 characters.

  4. #29
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,220
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by Forbes View Post
    ...I'm using comma-delimited tagging.
    bad move, sorry to say

    and fulltext indexing of comma-delimited strings is definitely wrong

    you need to redesign it so that each tag on a particular bookmark gets its own row, so that the "tag" column contains only one word (or phrase, if you allow phrases as tags), and a bookmark has as many rows in the tags table as there are tags on that bookmark

    basic one-to-many design
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #30
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,220
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    see where i did this in the query --
    Code:
    LEFT OUTER
      JOIN ( SELECT bookmark_id
                  , GROUP_CONCAT(tag) AS tags
               FROM tags
             GROUP
                 BY bookmark_id ) AS t
        ON t.bookmark_id = bookmarks.id
    that join was predicated on the assumption that each tag is in a separate row, and the grouping in the subquery produces the comma-delimited list as a result column

    if you had like all the tags for a bookmark in a TEXT column, there would've been no need to do any grouping of rows
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  6. #31
    It's been real... Forbes's Avatar
    Join Date
    Dec 2004
    Location
    Yorkshire, England
    Posts
    676
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    bad move, sorry to say

    and fulltext indexing of comma-delimited strings is definitely wrong

    you need to redesign it so that each tag on a particular bookmark gets its own row, so that the "tag" column contains only one word (or phrase, if you allow phrases as tags), and a bookmark has as many rows in the tags table as there are tags on that bookmark

    basic one-to-many design
    You've misunderstood me — each tag is a completely separate entry in the tags table.

    I'm no SQL expert, but I'm no idiot, either!

    So what I'm saying is (and I should have said this first time around), the user enters their tags as a comma-delimited "flat" list, in a text field. Then, the Under Cloud adds each tag as a unique entry, with the ID of the bookmark they belong to.

  7. #32
    It's been real... Forbes's Avatar
    Join Date
    Dec 2004
    Location
    Yorkshire, England
    Posts
    676
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    OK, I've done some rummaging around within the tags table and the longest tag I can find is well under 100 characters long.

    So, with that in mind, I've changed the tag column type to varchar that's 100 in length.

    Then I tried making the bookmark_id and tag Primary Keys, but I get another error:
    #1062 - Duplicate entry '4167-shapes' for key 1

  8. #33
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,220
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by Forbes View Post
    You've misunderstood me each tag is a completely separate entry in the tags table.
    my apologies, i did not know about this step --
    "Then, the Under Cloud adds each tag as a unique entry, with the ID of the bookmark they belong to."

    you understand the error message, right?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  9. #34
    It's been real... Forbes's Avatar
    Join Date
    Dec 2004
    Location
    Yorkshire, England
    Posts
    676
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    You understand the error message, right?
    You mean error 1062? No.

    There's bound to be gazillions of duplicate entries in the tag column, that's something that can't be avoided.

  10. #35
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,220
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    no, you can have the same tag in there as many times as you need it

    however, you cannot put the same tag on the same bookmark more than once

    it's the combinations that are unique

    the error message is telling you that bookmark 4167 already has the 'shapes' tag, so you can't link that bookmark with that tag again
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  11. #36
    It's been real... Forbes's Avatar
    Join Date
    Dec 2004
    Location
    Yorkshire, England
    Posts
    676
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    no, you can have the same tag in there as many times as you need it

    however, you cannot put the same tag on the same bookmark more than once

    it's the combinations that are unique

    the error message is telling you that bookmark 4167 already has the 'shapes' tag, so you can't link that bookmark with that tag again
    No, the error is happening when I try creating a Primary Key from the two columns you mention.

  12. #37
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,220
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    you cannot create the PK if the table already contains dupes -- remove the dupes and then try creating the PK again
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  13. #38
    It's been real... Forbes's Avatar
    Join Date
    Dec 2004
    Location
    Yorkshire, England
    Posts
    676
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    you cannot create the PK if the table already contains dupes -- remove the dupes and then try creating the PK again
    But what do you mean by duplicates; duplicate Primary Key or duplicate tags?

    As I explained, I cannot remove the duplicate tags.

  14. #39
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,220
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    duplicate rows where the combination bookmark_id and tag are present more than once
    Code:
    SELECT bookmark_id 
         , tag
      FROM tags
    GROUP
        BY bookmark_id 
         , tag
    HAVING COUNT(*) > 1
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  15. #40
    It's been real... Forbes's Avatar
    Join Date
    Dec 2004
    Location
    Yorkshire, England
    Posts
    676
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    duplicate rows where the combination bookmark_id and tag are present more than once
    Code:
    SELECT bookmark_id 
         , tag
      FROM tags
    GROUP
        BY bookmark_id 
         , tag
    HAVING COUNT(*) > 1
    Right, OK. That's done; unnecessary duplicates removed!

    I added bookmark_id and tag as a combined Primary Key (showing as type BTREE), but when I run the home page, the browser still hangs.

    I've attached a screen shot, to illustrate the schema.
    Attached Images Attached Images

  16. #41
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,220
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    that screenshot doesn't appear to be from the tags table

    when you "run the home page" what is the exact query you're using?

    also, would you mind doing an EXPLAIN on it
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  17. #42
    It's been real... Forbes's Avatar
    Join Date
    Dec 2004
    Location
    Yorkshire, England
    Posts
    676
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    that screenshot doesn't appear to be from the tags table
    The screen shot is from the tags table.

    Quote Originally Posted by r937 View Post
    when you "run the home page" what is the exact query you're using?
    The query is taking almost eighty seconds to perform:

    SELECT bookmarks.id, source.duplicates, bookmarks.url, bookmarks.title, SUBSTR(bookmarks.snippet, 1, 100) as snippet, bookmarks.datetime, t.tags, l.users FROM (SELECT url, COUNT(*) AS duplicates, MIN(datetime) AS earliest FROM bookmarks GROUP BY url) AS source INNER JOIN bookmarks ON (bookmarks.url = source.url) AND (bookmarks.datetime = source.earliest) LEFT OUTER JOIN ( SELECT bookmark_id, GROUP_CONCAT(tag) AS tags FROM tags GROUP BY bookmark_id ) AS t ON t.bookmark_id = bookmarks.id LEFT OUTER JOIN ( SELECT bookmark_id, GROUP_CONCAT(user_id) AS users FROM links WHERE status = 'public' GROUP BY bookmark_id ) AS l ON l.bookmark_id = bookmarks.id ORDER BY bookmarks.datetime DESC LIMIT 0, 9
    Quote Originally Posted by r937 View Post
    also, would you mind doing an EXPLAIN on it
    Having run the query through phpMyAdmin, the users column is returning [BLOB - 1B], and the tags for one of the rows is returning NULL.

    I've attached a screen shot for the EXPLAIN.
    Attached Images Attached Images

  18. #43
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,220
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    query looks fine (after i took the trouble to re-format it... why did you not preserve the indents and line breaks?)

    links table looks like it might need same sort of de-construction/re-construction as we did with the tags table (compare lines 5 and 6 of the EXPLAIN)

    bookmarks needs to have its indexes looked at too (see line 7)
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  19. #44
    It's been real... Forbes's Avatar
    Join Date
    Dec 2004
    Location
    Yorkshire, England
    Posts
    676
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    query looks fine (after i took the trouble to re-format it... why did you not preserve the indents and line breaks?)
    I build all of the queries with conditional statements in PHP, based on various parameters, so there's no need for indentation.

    I just copied straight from the browser window, having asked PHP to echo the statement.

    Quote Originally Posted by r937 View Post
    links table looks like it might need same sort of de-construction/re-construction as we did with the tags table (compare lines 5 and 6 of the EXPLAIN)
    Not knowing much about what EXPLAIN does, I'm guessing it's the "Using filesort" that's slowing the whole thing down?

    Quote Originally Posted by r937 View Post
    bookmarks needs to have its indexes looked at too (see line 7)
    Both links and tags have identical configurations for their indexes, et cetera (see the previous screen shot for an example).

    I've attached a screen shot for the bookmarks table.
    Attached Images Attached Images

  20. #45
    SitePoint Wizard bronze trophy
    Join Date
    Jul 2006
    Location
    Augusta, Georgia, United States
    Posts
    4,146
    Mentioned
    16 Post(s)
    Tagged
    3 Thread(s)
    Run this directly in MySQL and post the result as an image:

    Code SQL:
    EXPLAIN SELECT bookmarks.id, SOURCE.duplicates, bookmarks.url, bookmarks.title, SUBSTR(bookmarks.snippet, 1, 100) AS snippet, bookmarks.datetime, t.tags, l.users FROM (SELECT url, COUNT(*) AS duplicates, MIN(datetime) AS earliest FROM bookmarks GROUP BY url) AS SOURCE INNER JOIN bookmarks ON (bookmarks.url = SOURCE.url) AND (bookmarks.datetime = SOURCE.earliest) LEFT OUTER JOIN ( SELECT bookmark_id, GROUP_CONCAT(tag) AS tags FROM tags GROUP BY bookmark_id ) AS t ON t.bookmark_id = bookmarks.id LEFT OUTER JOIN ( SELECT bookmark_id, GROUP_CONCAT(user_id) AS users FROM links WHERE STATUS = 'public' GROUP BY bookmark_id ) AS l ON l.bookmark_id = bookmarks.id ORDER BY bookmarks.datetime DESC LIMIT 0, 9
    The only code I hate more than my own is everyone else's.

  21. #46
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,220
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    oddz, is that the same query? he posted the EXPLAIN in post #42

    i'm glad you're here, it helps to have another set of eyes on these indexes...
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  22. #47
    SitePoint Wizard bronze trophy
    Join Date
    Jul 2006
    Location
    Augusta, Georgia, United States
    Posts
    4,146
    Mentioned
    16 Post(s)
    Tagged
    3 Thread(s)
    can't believe I missed that.

    You will probably need to add some indexes but this should put you in the right direction. I'm running a little short on time but the main thing is that there were 2 unnecessary subqueries. Also, you will need a index on links: (bookmark_id,status) - if it doesn't exist. You *may* need others but that is the main one I believe. The query is untested but you should get the idea. Suffice to say – always eliminate unnecessary subqueries to achieve a more optimized query – when possible. In this case it so happens that it is.

    Code SQL:
    SELECT 
          b.id
         ,s.duplicates
         ,b.url
         ,b.title
         ,SUBSTR(b.snippet, 1, 100) snippet
         ,b.datetime
         ,GROUP_CONCAT(t.tag) tags
         ,GROUP_CONCAT(u.user_id) users 
      FROM 
         (SELECT 
                url
               ,COUNT(*) AS duplicates
               ,MIN(datetime) AS earliest 
            FROM 
               bookmarks 
           GROUP 
              BY 
               url) AS s 
      INNER 
       JOIN 
          bookmarks b
        ON 
          b.url = s.url
        AND 
          b.datetime = s.earliest 
       LEFT OUTER
       JOIN
          tags t
         ON
          b.id =  t.bookmark_id    
       LEFT OUTER 
       JOIN 
          links l
         ON
          b.id  = l.bookmark_id
        AND
          l.`status` = 'public'
     GROUP
        BY
          b.id
     ORDER 
        BY 
          b.datetime DESC     
     LIMIT 
          0, 9
    The only code I hate more than my own is everyone else's.

  23. #48
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,220
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by oddz View Post
    ...there were 2 unnecessary subqueries.
    on the contrary, they are very much needed

    bookmark to tags in one-to-many, and bookmark to links is one-to-many, and you will get cross join effects by joining to both at the same time

    e.g. a bookmark with 3 tags and 4 links will generate 12 rows in the FROM clause, and your GROUP_CONCATs will be all shot to hell (unless you throw DISTINCT into them, but what's the point of that, it's way inefficient)

    whereas joining to derived tables ensures that you join a bookmark to only one row for the tags, and again only one row for the links

    so the subqueries are not unnecessary
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  24. #49
    SitePoint Wizard bronze trophy
    Join Date
    Jul 2006
    Location
    Augusta, Georgia, United States
    Posts
    4,146
    Mentioned
    16 Post(s)
    Tagged
    3 Thread(s)
    yeah… you're right.

    What I would recommend is profiling each of those subqueries than to see if better optimization can be achieved on an individual by individual basis. The original explain isn't looking to hot.
    The only code I hate more than my own is everyone else's.

  25. #50
    It's been real... Forbes's Avatar
    Join Date
    Dec 2004
    Location
    Yorkshire, England
    Posts
    676
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by oddz View Post
    can't believe I missed that.

    You will probably need to add some indexes but this should put you in the right direction. I'm running a little short on time but the main thing is that there were 2 unnecessary subqueries. Also, you will need a index on links: (bookmark_id,status) - if it doesn't exist. You *may* need others but that is the main one I believe. The query is untested but you should get the idea. Suffice to say always eliminate unnecessary subqueries to achieve a more optimized query when possible. In this case it so happens that it is.

    Code SQL:
    SELECT 
          b.id
         ,s.duplicates
         ,b.url
         ,b.title
         ,SUBSTR(b.snippet, 1, 100) snippet
         ,b.datetime
         ,GROUP_CONCAT(t.tag) tags
         ,GROUP_CONCAT(u.user_id) users 
      FROM 
         (SELECT 
                url
               ,COUNT(*) AS duplicates
               ,MIN(datetime) AS earliest 
            FROM 
               bookmarks 
           GROUP 
              BY 
               url) AS s 
      INNER 
       JOIN 
          bookmarks b
        ON 
          b.url = s.url
        AND 
          b.datetime = s.earliest 
       LEFT OUTER
       JOIN
          tags t
         ON
          b.id =  t.bookmark_id    
       LEFT OUTER 
       JOIN 
          links l
         ON
          b.id  = l.bookmark_id
        AND
          l.`status` = 'public'
     GROUP
        BY
          b.id
     ORDER 
        BY 
          b.datetime DESC     
     LIMIT 
          0, 9
    Hi and thanks for your time!

    The query returned with an error:

    #1054 - Unknown column 'u.user_id' in 'field list'


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •