SitePoint Sponsor

User Tag List

Page 1 of 2 12 LastLast
Results 1 to 25 of 35
  1. #1
    SitePoint Evangelist dalangalma's Avatar
    Join Date
    Aug 2002
    Posts
    468
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Database Design for Blog Site

    I've created a couple blog-style sites using ASP/ASP.NET and Access databases, and I'm coming up on another one. However, I want to have a lot more features in this one, and I'm rethinking my database design.

    In all my old blog sites, there would be one table for blog entries, and one table for comments. In my new site, I want to support entries, comments, and trackback pings. I also want to have categories on my entries.

    What I'm wondering is, should I have a separate table for blog entries, comments, and trackbacks, or should I have one table for everything (with a field marking which is which) or should I have one table for entries, and one for comments (with trackbacks being simply comments with the word "Trackback: " appended to the comment body?
    Web Design:
    http://www.numbera.com/
    Free Web Design Resources and Software:
    http://brh.numbera.com/

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,347
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    you can easily combine them all into one table (assuming a "halftrack ding" is sort of like a comment on an entry)

    you need a primary key (autonumber will do nicely) and a foreign key which references the primary key in the same table and which can be NULL

    entries will have NULL in the FK, comments and trackbacks will link to either the starting entry or the entry/comment they're commenting/pinging on
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Evangelist dalangalma's Avatar
    Join Date
    Aug 2002
    Posts
    468
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Do you think that that arrangement will have benefits over a two-table one? On one hand, with two tables some queries like "number of comments" and "last user to comment" can be hairy, but when they're in one table, columns like "category" won't apply to comments/trackbacks.
    Web Design:
    http://www.numbera.com/
    Free Web Design Resources and Software:
    http://brh.numbera.com/

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,347
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    since you'll probably want (if not now, later) to assign more than one category to an entry, that won't matter because the entry-category relationship should be in a separate association table

    yes, a single table is easier

    example: find an entry or comment containing the phrase "nice site"
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #5
    SitePoint Guru asterix's Avatar
    Join Date
    Jun 2003
    Posts
    847
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Well, I would go for

    blog -- table with blog entries
    commenttype -- table containing comment types such as comment, trackback, support entry
    comment -- table with two foreign keys, one to blog and one to comment type

    Nice and normalized.

  6. #6
    SitePoint Evangelist dalangalma's Avatar
    Join Date
    Aug 2002
    Posts
    468
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks, great advice.

    asterix - So when I want to display an article, and then all of its comments, and their comment type, how would I build that query?

    I started to think one out in my head, but I don't know how to structure the JOINs (a left outer join somewhere, am I correct?) and select the article and its comments all in one select.

    Here's a start (probably dosen't work):

    SELECT blog.subject, blog.postDate, blog.body, commenttype.name FROM blog LEFT JOIN comment LEFT JOIN commenttype ON commenttype.id = comment.type ON comment.id = blog.id WHERE blog.id = @id OR blog.parentid = @id SORT BY blog.postDate

    I'd also like to be able to find the number of comments for a particular entry in the same query (so I can print it out below the article) but perhaps that's asking too much.

    Thanks for all the help, I've been doing this forever but I'm still a newb.
    Web Design:
    http://www.numbera.com/
    Free Web Design Resources and Software:
    http://brh.numbera.com/

  7. #7
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,347
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    how to select an entry and all of its comments if there's only one table --

    .. where coalesce(parentid,id) = $id

    finding the number of comments while at the same time returning all the comments is (a) difficult unless you can use subqueries (version 4.1), and (b) better done in your scripting language if you're gonna return the comments anyway
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  8. #8
    Non-Member QiSoftware's Avatar
    Join Date
    Oct 2004
    Location
    U.S.
    Posts
    366
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    What makes a database really worthwhile is the ability to quickly access "linked" data. Linked data here would be the associations of the main table with other tables that have associated data. Making everything one table is not always the best solution and in the case of a blog, where a lot of textual information is stored is probably not a good idea.

    Q...

  9. #9
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,347
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    what??!? i respectfully disagree

    what exactly you find "not a good idea" about a single table?

    and how exactly would you handle the situation where you want to categorize both the main table and related tables with a many-to-many relationship to a categories table?

    yikes!!!
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  10. #10
    SitePoint Guru asterix's Avatar
    Join Date
    Jun 2003
    Posts
    847
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    what exactly you find "not a good idea" about a single table?
    Because I just love getting into a theory argument with r937 (it is always an opportunity to learn), I will explain why I don't like single tables.

    1) Relational databases are made to make the storage of data entities, and their relations with other entities, as rational, minimal and simple as possible.

    Relations should nearly always be made explicit - by using a relationship table.

    2) Unless you are very hot at SQL, and have a database which supports subqueries and / or cursor based processing, you will have a hell of a time travelling a self referential hierarchy.

    3) If you are doing things properly you are most likely using a natural key as your primary key, that is a key which is one or several of the attributes of the entity. Let's say your PK is composed of 4 fields: firstname, lastname, DOB and social security number. To self reference you need to duplicate all four field definitions in your table definition: this is bloated and unwieldly

    4) With nearly every relationship you can choose to use the relation for relationship method or the posted attribute method to represent this. Posting the attribute to the same table breaks this, since it is not actually posted but duplicated. I find that ugly.

    OK, that's why I hardly ever use self referencing tables. I know I should use them more often, but I actively dislike them. It's like when I look at software written in C, and see that pointers to pointers are being used to bitshift or bit rotate. Useful, very rarely necessary, and almost always impossible to maintain.

    Just my thoughts.

  11. #11
    SitePoint Guru asterix's Avatar
    Join Date
    Jun 2003
    Posts
    847
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by dalangalma
    asterix - So when I want to display an article, and then all of its comments, and their comment type, how would I build that query?
    No left outer joins.

    Code:
     SELECT 
     	 B.title, 
     	 B.subject, 
     	 C.comment, 
     	 CT.commenttype
     FROM 
     	 blog 
     		  AS B, 
     	 comment 
     		  AS C, 
     	 commenttype 
     		  AS CT
     where
     	 B.blogid = C.blogid
     		  and
     	 C.commenttypeid = CT.commenttypeid

  12. #12
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,347
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by asterix
    1) Relational databases are made to make the storage of data entities, and their relations with other entities, as rational, minimal and simple as possible.

    Relations should nearly always be made explicit - by using a relationship table.
    nearly being the operative word here

    for example, it is silly to place a one-to-many relationship into a separate table (a foreign key in the child works really well), unless you are a big fan of sixth normal form, which itself is silly

    Quote Originally Posted by asterix
    2) Unless you are very hot at SQL, and have a database which supports subqueries and / or cursor based processing, you will have a hell of a time travelling a self referential hierarchy.
    true, as far as it goes, but...

    - this is no argument for not having a single hierarchical table, and in fact it actually increases the complexity when you have multiple tables

    - who uses a database that doesn't have subqueries?

    - cursors??!!! wash your mouth out with soap!!

    - the best way to traverse a hiearchy is with a multiple self-join

    Quote Originally Posted by asterix
    3) If you are doing things properly you are most likely using a natural key as your primary key...To self reference you need to duplicate all four field definitions in your table definition: this is bloated and unwieldly
    agreed, and while i prefer to use natural keys, i also know when to use a surrogate

    Quote Originally Posted by asterix
    4) With nearly every relationship you can choose to use the relation for relationship method or the posted attribute method to represent this. Posting the attribute to the same table breaks this, since it is not actually posted but duplicated. I find that ugly.
    huh??
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  13. #13
    SitePoint Guru asterix's Avatar
    Join Date
    Jun 2003
    Posts
    847
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937
    nearly being the operative word here

    for example, it is silly to place a one-to-many relationship into a separate table (a foreign key in the child works really well), unless you are a big fan of sixth normal form, which itself is silly
    agreed

    Quote Originally Posted by r937
    true, as far as it goes, but...

    - this is no argument for not having a single hierarchical table, and in fact it actually increases the complexity when you have multiple tables
    I disagree. Multiple tables is good, very good. It makes the intention explicit. Ever looked at somones database design and puzzled, until you realized that key design elements were hidden away in self joins?

    Quote Originally Posted by r937
    - who uses a database that doesn't have subqueries?
    Users of mSQL and MySQL 3.x

    Quote Originally Posted by r937
    - cursors??!!! wash your mouth out with soap!!
    You are a purist. Some things need cursors, just like some beers need drinking.

    Quote Originally Posted by r937
    - the best way to traverse a hiearchy is with a multiple self-join
    And how many multiples of the self-join are you going to have if the depth of the hierarchy is unlimited?

    Quote Originally Posted by r937
    agreed, and while i prefer to use natural keys, i also know when to use a surrogate
    Heretic.

    Quote Originally Posted by r937
    huh??
    Posted attribute:

    products
    ----------
    productid
    name

    descriptions
    ----------
    descriptionid
    description
    productid

    relation for relationship


    products
    ----------
    productid
    name

    descriptions
    ----------
    descriptionid
    description

    productdescriptions
    --------
    descriptionid
    productid

    You could call these identifying and non-identifying too.

    Anyway, my point was that this:

    posting
    -------
    postid
    threadid
    text
    postid_of_parent
    foreign key
    postid_of_parent references posting.postid

    is using neither posted attribute nor relation for relationship and is unaesthetic.

  14. #14
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,347
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by asterix
    Users of mSQL and MySQL 3.x
    ... should upgrade, plain and simple

    Quote Originally Posted by asterix
    And how many multiples of the self-join are you going to have if the depth of the hierarchy is unlimited?
    only up to 8 -- but that's just me, i'm very generous, most information architects would say to show no more than three

    can you please show me a page on the web where they actually display more than 8 levels deep

    note: not where there exist more than 8 levels, but where you can actually see more than 8 on one page

    the defence rests, your honour

    Quote Originally Posted by asterix
    You could call these identifying and non-identifying too.
    no, i call them one-to-many and many-to-many, because that's essentially the difference

    an identifying relationship is one where the foreign key in the child is also (part of) the child's primary key
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  15. #15
    SitePoint Guru asterix's Avatar
    Join Date
    Jun 2003
    Posts
    847
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937
    only up to 8 -- but that's just me, i'm very generous, most information architects would say to show no more than three
    In my experience "information architects" know nothing whatsoever about architecture (of any kind) and even less about the meaning of the noun "information".

    Quote Originally Posted by r937
    can you please show me a page on the web where they actually display more than 8 levels deep
    I wasn't talking necessarily about website architecture. Imagine you are logging an interactive chat, and each log entry contains the PK of the user's previous entry. Soon you get more than 8 levels.

    Quote Originally Posted by r937
    an identifying relationship is one where the foreign key in the child is also (part of) the child's primary key
    Err, exactly, here each fields is a foreign key, and both fields together comprise the primary key. Therefore "identifying"....

    productdescriptions
    --------
    descriptionid
    productid

  16. #16
    SitePoint Guru asterix's Avatar
    Join Date
    Jun 2003
    Posts
    847
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Anyway, I pronounce myself clearly the winner of this debate, now I am going to bed.

  17. #17
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,347
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    you insult an entire profession (which includes me) and then withdraw?

    i shall wait a few days for you to show me a page with even 5 levels visible, sir

    until then, sir, clearly you have lost this round

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  18. #18
    SitePoint Evangelist dalangalma's Avatar
    Join Date
    Aug 2002
    Posts
    468
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Ah, this is fantastic. Hehe.

    So, perhaps I'm confused. I looked at asterix's SQL, and I'm not quite sure how that will return what I'm looking for. Perhaps I have his whole table structure messed up, since he seems to be arguing for a separate table for blog entries and their comments. I'm sorry, I'm having a little trouble following you guys.

    The query I'm trying to write here, assuming all blog entries and comments in one table, and then a table that relates comments to comment types (one-to-many, right?) and another that relates blog posts to categories (again, one to many) and returns as the first record, the initial article posted, and then the rest of the records being the comments, with the correct category/comment type filled in and everything. When I'm looking at:

    Code:
    SELECT 
     	  B.title, 
     	  B.subject, 
     	  C.comment, 
     	  CT.commenttype
      FROM 
     	  blog 
     		   AS B, 
     	  comment 
     		   AS C, 
     	  commenttype 
     		   AS CT
      where
     	  B.blogid = C.blogid
     		   and
     	  C.commenttypeid = CT.commenttypeid
    I'm confused by what c.comment is. I guess i just don't understand the table structure implied here.
    Web Design:
    http://www.numbera.com/
    Free Web Design Resources and Software:
    http://brh.numbera.com/

  19. #19
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,347
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by dalangalma
    The query I'm trying to write here, assuming all blog entries and comments in one table, and then a table that relates comments to comment types (one-to-many, right?) and another that relates blog posts to categories (again, one to many) and returns as the first record, the initial article posted, and then the rest of the records being the comments, with the correct category/comment type filled in and everything.
    do you have a table layout yet for these tables that you're planning on having?

    i would start building the query on the basic blog/comment table and add the other ones only later

    if you show your table layout, i'll help you with the query
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  20. #20
    SitePoint Evangelist dalangalma's Avatar
    Join Date
    Aug 2002
    Posts
    468
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    No, I don't have my table layout yet... I'm still trying to decide what arrangement would work best. I'm mainly just worried that I'll do things one way, and then when I want something more advanced I'm screwed.

    Based on you and asterix's comments so far, I'm looking at:

    blog table: Contains all the info for a blog entry. Also contains all the comments, with an extra field that references their parent entry if they're comments, or NULL if they're an actual blog entry. Trackbacks go in here too, as a special comment. (fields autoid, subject, body, parentid, maybe more?)

    categories table: Contains the names of all the categories. (fields autoid, catname)

    entryCategories table: Contains the mapping from entries to categories. (fields autoid, entryid, catid)

    What I'd want to do is, in the least queries, get a listing of the main article, the categories it belongs to, and the comments. I think that would get me started towards understanding all this.

    Thanks for your help guys, this is really doing me a world of good.
    Web Design:
    http://www.numbera.com/
    Free Web Design Resources and Software:
    http://brh.numbera.com/

  21. #21
    SitePoint Guru asterix's Avatar
    Join Date
    Jun 2003
    Posts
    847
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I'm mainly just worried that I'll do things one way, and then when I want something more advanced I'm screwed.
    Exactly

    That is the whole point of doinf database design, you don't really want to have to refactor your database, because it means that you break your application (in most cases).

    Anyway, why don't you install WordPress and see how they do it? Actually, why don't you just install WordPress and drop the plan of making your own blog software?

  22. #22
    SitePoint Evangelist dalangalma's Avatar
    Join Date
    Aug 2002
    Posts
    468
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Because writing my own is always more fun. ;-)

    I actually did download and look at .Text, but it was so full of triggers and whatnot that I couldn't tell how it was actually supposed to work.
    Web Design:
    http://www.numbera.com/
    Free Web Design Resources and Software:
    http://brh.numbera.com/

  23. #23
    SitePoint Evangelist dalangalma's Avatar
    Join Date
    Aug 2002
    Posts
    468
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    OK, I looked at WordPress, and they use a separate table for comments and posts (and categories). So maybe I'll do that. They store pretty different types of data there, so it makes sense. I'll probably go their way, since I could see having different stuff in each - for example, storing the commenter's IP / user agent, which I wouldn't need to do for a normal post.
    Web Design:
    http://www.numbera.com/
    Free Web Design Resources and Software:
    http://brh.numbera.com/

  24. #24
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,347
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    and why wouldn't you want the IP and user agent of the original post?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  25. #25
    SitePoint Evangelist dalangalma's Avatar
    Join Date
    Aug 2002
    Posts
    468
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Um, because it's me?
    Web Design:
    http://www.numbera.com/
    Free Web Design Resources and Software:
    http://brh.numbera.com/


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •