SitePoint Sponsor

User Tag List

Page 2 of 2 FirstFirst 12
Results 26 to 47 of 47
  1. #26
    SitePoint Addict chestertondevelopment's Avatar
    Join Date
    Dec 2005
    Location
    Essex, UK
    Posts
    241
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by oranjcrush
    Granted. By 'best' are we talking about best pratice or best performance?
    Best Practice: SELECT count(*) etc.
    Best Performance: Extra column with number of replies

    I agree that in this case performance benefits are too great to be dismissed

  2. #27
    SitePoint Wizard stereofrog's Avatar
    Join Date
    Apr 2004
    Location
    germany
    Posts
    4,324
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by oranjcrush View Post
    Granted. By 'best' are we talking about best pratice or best performance?
    Both. Cleanest, purest and best organized code usually tends to be most performant as well.

    Honestly, this "stored counter" debate doesn't make much sense to me without knowing exact project details (database size, site throughput, hardware etc). From what OP has said (he plans to create a new forum script from scratch) I think here we have to do with the typical case of "premature optimization", i.e. a desperate attempt to solve the problem that does not even exist.

  3. #28
    Follow Me On Twitter: @djg gold trophysilver trophybronze trophy Dan Grossman's Avatar
    Join Date
    Aug 2000
    Location
    Philadephia, PA
    Posts
    20,578
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by stereofrog View Post
    Both. Cleanest, purest and best organized code usually tends to be most performant as well.

    Honestly, this "stored counter" debate doesn't make much sense to me without knowing exact project details (database size, site throughput, hardware etc). From what OP has said (he plans to create a new forum script from scratch) I think here we have to do with the typical case of "premature optimization", i.e. a desperate attempt to solve the problem that does not even exist.
    There's a reason all forums do it this way -- it doesn't take a lot of concurrency for a forum to start hitting performance issues. It's just the sheer amount of database hits a typical forum page involves.

  4. #29
    SitePoint Wizard silver trophy kyberfabrikken's Avatar
    Join Date
    Jun 2004
    Location
    Copenhagen, Denmark
    Posts
    6,157
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Dan Grossman View Post
    There's a reason all forums do it this way -- it doesn't take a lot of concurrency for a forum to start hitting performance issues. It's just the sheer amount of database hits a typical forum page involves.
    Wouldn't it be a better strategy, to cache the entire pages at first request, and then clear the cache, when someone submits a post?

  5. #30
    Follow Me On Twitter: @djg gold trophysilver trophybronze trophy Dan Grossman's Avatar
    Join Date
    Aug 2000
    Location
    Philadephia, PA
    Posts
    20,578
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by kyberfabrikken View Post
    Wouldn't it be a better strategy, to cache the entire pages at first request, and then clear the cache, when someone submits a post?
    Not if you want to have the flexibility every major forum software provides. These pages look different for every user. They show what threads we've viewed already, which we're subscribed to, are ordered depending on our settings, with the number of threads per page we choose, showing which other members are online right now...

    Why don't we ask this... why are you writing a forum from scratch when so many exist already?

  6. #31
    SitePoint Enthusiast
    Join Date
    Mar 2007
    Posts
    89
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    maybe for the experience of building one yourself? with all your own personal touches. and to learn how it's done..

  7. #32
    Non-Member
    Join Date
    Jan 2003
    Posts
    5,748
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    > ...to cache the entire pages at first request, and then clear the cache, when someone
    > submits a post?

    This wouldn't be pratical due to the very nature of what a forum is; What you would be doing is pushing the performance degregation from the database to the file system. I wouldn't want to do that, considering the performance lose would be greater on the file system than it would be on the database.

    Also, some OS have an issue in regards to the amount of files in a given directory. Hell, it's been known for an OS to completely crash under the weight.

  8. #33
    SitePoint Guru
    Join Date
    Jul 2004
    Location
    Netherlands
    Posts
    672
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by kyberfabrikken View Post
    That said, count() may not be efficient under all circumstances, and then you could apply something like Dan Grossman suggested. You just shouldn't do it, until you have actually identified a performance bottleneck.

    I agree with this, untill this actually becomes a problem use count and don't worry about it. Under the motto just build it damn it!
    Go visit my site :-D you know you want to ;-)
    www.mech7.net

  9. #34
    SitePoint Addict
    Join Date
    Jan 2005
    Posts
    204
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by kyberfabrikken View Post
    Wouldn't it be a better strategy, to cache the entire pages at first request, and then clear the cache, when someone submits a post?
    i have this idea to cache the data as well though i didn't dare to implement yet. in fact, i thought i could do this for every single threads and posts from the database.

    from my knowledge, fetching data from XML is much faster than getting from the database.(or not?) so if we were to do it in such a way where when the user posts a thread or replies to a thread, we insert the data into the database, then update the thread's XML file. and when readers enters the page to read the thread, the page will fetch the data from the XML file instead of getting data directly from the database. in this case, we don have to go through connecting the database.

    i am not sure if this is a practical solution. it will definitely be an extra thing to do since we are actually storing one set of data in the db, and another set as XML files. however, if there is plans for AJAX later, this could also be much easier also.

    what do you think?

  10. #35
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,350
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by x[E]nOn View Post
    from my knowledge, fetching data from XML is much faster than getting from the database.(or not?)
    not

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  11. #36
    SitePoint Addict Jasper Bekkers's Avatar
    Join Date
    May 2007
    Location
    The Netherlands
    Posts
    282
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    not

    Of couse, it depends

  12. #37
    SitePoint Addict
    Join Date
    Jan 2005
    Posts
    204
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    in what way will a XML be faster than getting from a database? if XML is slower, then why would we even want to get things from XML when many things can just get directly from the database unless the data is need to pass to somewhere else?

  13. #38
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,350
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    XML is a convenient format for the transmission of data

    for storage and retrieval of data, you'll want to use a database
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  14. #39
    SitePoint Wizard silver trophybronze trophy Cups's Avatar
    Join Date
    Oct 2006
    Location
    France, deep rural.
    Posts
    6,869
    Mentioned
    17 Post(s)
    Tagged
    1 Thread(s)
    Quote Originally Posted by r937 View Post
    XML is a convenient format for the transmission of data

    for storage and retrieval of data, you'll want to use a database
    That seems like rather an absolute statement, and I largely agree with it, but:

    What if what you want the xml to be sliced and diced into a number of other formats? xHTML, pdf, another Xml, etc.

    Perhaps this is the answer to "why would anyone store xml in a database?".

    I am thinking of essays containing complex intra-links and relationships and "intelligence" of context rather than phone-book-like apps, or forums actually (admitting I am veering OT ... ).

  15. #40
    SitePoint Addict Jasper Bekkers's Avatar
    Join Date
    May 2007
    Location
    The Netherlands
    Posts
    282
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    XML is a convenient format for the transmission of data
    Hardly, XML documents are almost as redundant as it can get, especially for transmission with limited bandwidth such as the web. This can, of course, be circumvented using a form of Huffman encoding but that defeats the purpose of readability.
    for storage and retrieval of data, you'll want to use a database
    Again, that depends.
    Quote Originally Posted by Cups View Post
    Perhaps this is the answer to "why would anyone store xml in a database?".
    Performance. Applications that transform data to and from XML documents a lot it will usually be more efficient to just store the XML directly.

  16. #41
    SitePoint Addict
    Join Date
    Jan 2005
    Posts
    204
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    i don't really get the meaning of "store xml in a database" and "more efficient to just store the XML directly". do you mean having another copy of the XML file with the data on the server or having server-side language to pull data from the database to form dynamic XML?

  17. #42
    SitePoint Guru
    Join Date
    Jul 2004
    Location
    Netherlands
    Posts
    672
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    XML is good for interoperability between applications.. for example think about rss feeds you can share data from your webapp to a desktop rss reader.
    Or office documents with an open format which can be read between OOo and MS Office

    Also for storing hierarchal data like a menu it could be better then a database.

    Quote Originally Posted by x[E]nOn View Post
    in what way will a XML be faster than getting from a database? if XML is slower, then why would we even want to get things from XML when many things can just get directly from the database unless the data is need to pass to somewhere else?
    Go visit my site :-D you know you want to ;-)
    www.mech7.net

  18. #43
    Non-Member
    Join Date
    Jan 2003
    Posts
    5,748
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Umm...

    > Also for storing hierarchal data like a menu it could be better then a database.

    That of course, depends on a few things; Such as the document size... If you parse a sizable document with the DOM for example, you consume a large amount of memory to contain that structure. With the DOM, it's all or nothing.

    And please, don't mention Simple XML as when it comes to serious parsing of a complex XML document, it's just not upto the job. At the end of the day, you've got to measure the performance of the tools available, for the job at hand.

    You shouldn't just use one over the other, purely for convienence, nor should you automatically discount one over the other, but experience will give you the push in the right direction

  19. #44
    PHP/Rails Developer Czaries's Avatar
    Join Date
    May 2004
    Location
    Central USA
    Posts
    806
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hrm.... I know it's been a few days, but I just read this mysql performace blog post and it seems as though if you are using MyISAM for your storage engine, a "SELECT COUNT(*)" will be instantaneous because the total number of rows for a table is cached. Even adding a WHERE caluse on the end will still be fast. Read the blog post for more info.

  20. #45
    SitePoint Guru
    Join Date
    Jul 2004
    Location
    Netherlands
    Posts
    672
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    [quote=Dr Livingston;3409713]Umm...

    > Also for storing hierarchal data like a menu it could be better then a database.

    That of course, depends on a few things; Such as the document size... If you parse a sizable document with the DOM for example, you consume a large amount of memory to contain that structure. With the DOM, it's all or nothing.

    And please, don't mention Simple XML as when it comes to serious parsing of a complex XML document, it's just not upto the job. At the end of the day, you've got to measure the performance of the tools available, for the job at hand.
    /quote]


    Wel simpleXml is pretty good for just parsing, only not editing also you can go xslt but i ment more the nature of the document.. with xml it is very easy to store hierarchical data, with a database it can rather complex.
    Go visit my site :-D you know you want to ;-)
    www.mech7.net

  21. #46
    SitePoint Enthusiast
    Join Date
    Mar 2007
    Posts
    89
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    i'm incredibly surprised that no one here has mentioned SQL_CALC_FOUND_ROWS yet

  22. #47
    SitePoint Addict
    Join Date
    Jul 2006
    Location
    Kansas City, MO
    Posts
    280
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Tom Brokaw:

    Nobody has mentioned SQL_CALC_FOUND_ROWS because it is not relevant. It's relevant when you are limiting a SELECT and you want to know, in addition, how many total rows would have been returned had the LIMIT not been in place. One very prevalent example would be in pagination.

    The original poster is asking how to do:

    Code:
    select thread_id, count(*) from replies group by thread_id
    And we are suggesting instead of doing that relatively expensive query to just do:

    Code:
    select thread_id,title,replies from threads
    and have a slightly more complex insert statements such as
    Code:
    insert into replies (thread_id,reply) values (1,'hello');
    update threads set replies=replies+1 where thread_id=1;
    The second method, although more time consuming on inserts, pays of dramatically due to the blazingly fast single query that gets the number of replies per thread without the overhead of counting each and every time.


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •