  1. #51
    SitePoint Zealot
    Join Date
    Feb 2005
    Location
    UK
    Posts
    121
    I meant queries that get run each time through a loop, i.e. query for a result set, then loop through the result set and run another query for each row. I see that kind of thing over and over again in these forums, and in live websites. Likewise the 'select *', then bung it into an array and sort/filter/merge/search for specific data, all of which could have been done with a single well-constructed query. As it says in the manual, SQL is 10 to 20 times faster than code IN ANY LANGUAGE when it comes to dealing with large recordsets.

    The thing I find hard to understand is the number of excellent coders who can't get their heads around JOINs. They can astound me with their mastery of regular expressions, make arrays jump through hoops, and yet cannot grasp just what is going on with a left join. Go figure?
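    To make the contrast concrete, here is a minimal sketch of the query-in-a-loop pattern next to a single LEFT JOIN with GROUP BY that returns the same numbers. The `authors` and `posts` tables, their columns and the sample rows are made up for illustration, and an in-memory SQLite database (via PDO) stands in for whatever server you actually use.

```php
<?php
// Hypothetical schema for illustration: authors(id, name), posts(id, author_id, title).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)');
$db->exec('CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)');
$db->exec("INSERT INTO authors VALUES (1, 'ann'), (2, 'bob'), (3, 'cat')");
$db->exec("INSERT INTO posts VALUES (1, 1, 'first'), (2, 1, 'second'), (3, 2, 'third')");

// Query in a loop: one query for the authors, then one more query per author.
$loopCounts = [];
$countStmt = $db->prepare('SELECT COUNT(*) FROM posts WHERE author_id = ?');
foreach ($db->query('SELECT id, name FROM authors ORDER BY id') as $author) {
    $countStmt->execute([$author['id']]);
    $loopCounts[$author['name']] = (int) $countStmt->fetchColumn();
}

// Single query: the LEFT JOIN keeps authors with no posts, COUNT(p.id) gives them 0.
$sql = 'SELECT a.name, COUNT(p.id) AS post_count
        FROM authors a
        LEFT JOIN posts p ON p.author_id = a.id
        GROUP BY a.id, a.name
        ORDER BY a.id';
$joinCounts = [];
foreach ($db->query($sql) as $row) {
    $joinCounts[$row['name']] = (int) $row['post_count'];
}
```

    Both arrays end up with the same contents; the difference is one round trip to the database instead of one per author.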

  2. #52
    eschew sesquipedalians silver trophy sweatje's Avatar
    Join Date
    Jun 2003
    Location
    Iowa, USA
    Posts
    3,749
    Quote Originally Posted by Roger Ramjet
    The thing I find hard to understand is the number of excellent coders who can't get their heads around JOINs.
    And then aggregate functions and GROUP BY clauses.
    Jason Sweat ZCE - jsweat_php@yahoo.com
    Book: PHP Patterns
    Good Stuff: SimpleTest PHPUnit FireFox ADOdb YUI
    Detestable (adjective): software that isn't testable.

  3. #53
    SitePoint Wizard dreamscape's Avatar
    Join Date
    Aug 2005
    Posts
    1,080
    >> (try that on a table with a million entries, then come and tell me it was a good idea).

    Try telling any data guru that having a table with a million entries in your OLTP is a good idea.

    You should never really have to worry about extremely large data sets in your OLTP, because you should never really have extremely large data sets in your OLTP. Your OLTP should really only contain what is necessary for the online system. Everything else should be offloaded, and maybe even de-normalized, to a data warehouse.

    But you're really going about it the wrong way if you find yourself with millions of rows in your online system.

    Everybody seems to talk about "enterprise programming" but then completely ignore aspects of "enterprise databasing" (I know that's not a real word, but you know what I mean).

  4. #54
    SitePoint Zealot
    Join Date
    Feb 2005
    Location
    UK
    Posts
    121
    OLTP ?? dunno what that means.

  5. #55
    SitePoint Wizard
    Join Date
    Aug 2004
    Location
    California
    Posts
    1,672
    Either On-Line Transaction Processing or Old Ladies Tea Party, I can't tell exactly.
    Christopher

  6. #56
    SitePoint Mentor bronze trophy
    John_Betong's Avatar
    Join Date
    Aug 2005
    Location
    City of Angels
    Posts
    1,821
    Hi Etnu,

    MySQL already does this by default (you can disable it). When you delete a key from an index, it leaves the node as null. Running OPTIMIZE TABLE removes those null nodes.
    Many thanks Etnu, I thought that MySQL was sufficiently mature to handle this, but it's nice to have confirmation.

    Cheers,


    John_Betong

    http://www.anetizer.com

  7. #57
    SitePoint Zealot howardroark`'s Avatar
    Join Date
    Feb 2005
    Posts
    107
    Why not use a scalable CMS like Drupal or CivicSpace?

  8. #58
    SitePoint Addict mx2k's Avatar
    Join Date
    Jan 2005
    Posts
    256
    Because it's more fun to write your own.

  9. #59
    SitePoint Addict myrdhrin's Avatar
    Join Date
    Jul 2004
    Location
    Montreal
    Posts
    211
    Quote Originally Posted by dreamscape
    You should never really have to worry about extremely large data sets in your OLTP, because you should never really have extremely large data sets in your OLTP. Your OLTP should really only contain what is necessary for the online system. Everything else should be offloaded, and maybe even de-normalized, to a data warehouse.
    I second that big time... there are different database structures for different purposes.

    An OLTP-purposed database should be quick, with just enough information to achieve its purpose (which is to get your transactions through)...

    A warehousing-purposed database can be slow... it will host loads of data and is expected to be "slower"... it is used like the name says: warehousing, keeping the data for a long period of time...

    You also have data-analysis-purposed databases... quick and transient, built to make a specific computation and produce results...

    Hmm... I'd need to talk with my DBA for the terminology (yeah, DBAs are useful for something *grin*)
    Jean-Marc (aka Myrdhrin)
    M2i3 - blog - Protect your privacy with Zliki

  10. #60
    Non-Member
    Join Date
    Jan 2003
    Posts
    5,748
    Everything else should be offloaded, and maybe even de-normalized, to a data warehouse.
    What you're talking about is archiving your data, which is acceptable. You move stale data out of the database after x amount of time to somewhere else, i.e. another database that sits in the background and is not generally part of the system.

    The data is still reachable, but since it's not active there is no reason for it to be in the main database. So yes, there really isn't any excuse for 1,000,000-odd records. A bit silly really.

    Why not use a scalable CMS like Drupal or CivicSpace?
    Who said they were scalable, huh? What benchmarks are there that would support this, is what I ask. Plus, it is not a dead cert that these CMSs would suit my needs, or the needs of my clients either, for that matter.

    To me, these open source CMSs support only a very small part of the population and their needs, at the very best. The old saying "If you want something done, you're better to do it yourself" comes to mind immediately.

  11. #61
    SitePoint Zealot
    Join Date
    Feb 2005
    Location
    UK
    Posts
    121
    I thought that Edman started this thread because he needed to optimise AN EXISTING WEBSITE - not write a new one?? It seems to have been hijacked and wandered into all sorts of odd arenas.

  12. #62
    SitePoint Wizard dreamscape's Avatar
    Join Date
    Aug 2005
    Posts
    1,080
    >> Who said they were scalable huh?

    Any PHP application is inherently scalable (unless it has been very very very poorly coded). It is not PHP's job to scale the system; it will naturally scale with your system. If you need more application servers, you add more application servers and load balance them by whatever method you prefer, for example. If you need more database servers, it is up to the database server to cluster itself. The only problem may come with sessions, but if you store them in the database, there is no problem.

    >> What benchmarks are there that would support this is what I ask.

    Performance does not equal scalability.

  13. #63
    SitePoint Addict mx2k's Avatar
    Join Date
    Jan 2005
    Posts
    256
    Shouldn't performance be a by-product of scalability? If your app can only handle 100 users and the other one can handle 20,000... which is going to let you scale more?

  14. #64
    SitePoint Wizard dreamscape's Avatar
    Join Date
    Aug 2005
    Posts
    1,080
    >> if your app can only handle 100 users and the other one can handle 20,000

    xx users what? At one time? The max total (not necessarily at one time)?

    If max total, then the one was coded with a hard limit of 100 users, probably at the developer's choice, most likely for different licensing tiers.

    What PHP app can handle 20,000 concurrent users on a single server? What single server can even parse 20,000 concurrent PHP requests? For that matter what single server can serve 20,000 concurrent static HTML files?

    At least give realistic examples for god's sake.

    If you ever find your current app server cannot handle all the needed requests, add another one. PHP doesn't try to replace the web server (Apache) or database server. It is designed to work well with them. Because Apache scales well, because most database servers scale well, and because each PHP transaction is stateless, PHP "naturally" scales well and doesn't care if the user jumps from server A to B to Q to A again. Finding the correct session might be a problem, but storing them in the database solves this issue.
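    As a rough illustration of the "sessions in the database" idea, here is a minimal sketch of a session handler backed by a shared table, written for current PHP (8.0 or later) rather than the versions in use when this thread was posted. The class name `DbSessionHandler`, the `sessions` table and its columns are assumptions made up for this example, and SQLite stands in for the shared database server; it skips locking and error handling entirely.

```php
<?php
// Minimal sketch: session data lives in a shared table, so any app server can read it.
// DbSessionHandler and the sessions(id, data, updated_at) table are hypothetical names.
final class DbSessionHandler implements SessionHandlerInterface
{
    public function __construct(private PDO $db) {}

    public function open(string $path, string $name): bool { return true; }

    public function close(): bool { return true; }

    public function read(string $id): string|false
    {
        $stmt = $this->db->prepare('SELECT data FROM sessions WHERE id = ?');
        $stmt->execute([$id]);
        $data = $stmt->fetchColumn();
        // An unknown session ID is not an error: return an empty session.
        return $data === false ? '' : (string) $data;
    }

    public function write(string $id, string $data): bool
    {
        // REPLACE INTO inserts the row or overwrites an existing one (SQLite and MySQL).
        $stmt = $this->db->prepare('REPLACE INTO sessions (id, data, updated_at) VALUES (?, ?, ?)');
        return $stmt->execute([$id, $data, time()]);
    }

    public function destroy(string $id): bool
    {
        return $this->db->prepare('DELETE FROM sessions WHERE id = ?')->execute([$id]);
    }

    public function gc(int $max_lifetime): int|false
    {
        // Remove sessions that have not been written to within the lifetime.
        $stmt = $this->db->prepare('DELETE FROM sessions WHERE updated_at < ?');
        $stmt->execute([time() - $max_lifetime]);
        return $stmt->rowCount();
    }
}

$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE sessions (id TEXT PRIMARY KEY, data TEXT, updated_at INTEGER)');
$handler = new DbSessionHandler($db);
// In a real application you would register it before starting the session:
// session_set_save_handler($handler, true); session_start();
```

    With every app server pointed at the same table, it no longer matters which server handles a given request.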

  15. #65
    SitePoint Addict myrdhrin's Avatar
    Join Date
    Jul 2004
    Location
    Montreal
    Posts
    211
    dreamscape -> I would add a nuance to your comment as far as clustering and adding servers go.

    There comes a point where the concurrency of your transactions becomes a bottleneck, at which point no matter how many machines you add, performance will only degrade (I'm seeing it right now at work on software that we're pushing to handle over 10,000 simultaneous users. We're talking between 2000-3000 queries a second for a benchmark).

    At that point (and that point only), changing your design can be the right approach...
    Jean-Marc (aka Myrdhrin)
    M2i3 - blog - Protect your privacy with Zliki

  16. #66
    SitePoint Zealot
    Join Date
    Feb 2003
    Posts
    156
    I have yet to see a single discussion on scalability or performance or ... that didn't end up in disagreement on what the definition of "scalable" is.

    If all you have is a single server, then execution speed and memory usage are probably the interesting factors to look at, and that's probably the focus of most people (on these forums) writing applications. The next step is probably to separate out either the db or the static files to a separate server, which covers half the people that are left. If your hardware is big enough this will go a long way.

    Of course there are also situations where it is ridiculous to even think about running an application on only a handful of servers; it's clear you are going to need a whole park of servers. Then suddenly execution time and memory usage, while not irrelevant, play a much more minor role. As long as there is a way to easily spread the load across machines, it doesn't really matter much whether you are going to use 60 or 150 machines.

    And then there are corporate applications where the main concern may be scaling the development process: you want to be able to add people to the project and have them be productive and adding value in a short time.


    Without a concrete context of what is going to be expected and/or what the supporting architecture is roughly going to be like, there really is little value in asking for specific advice.

  17. #67
    SitePoint Evangelist Daijoubu's Avatar
    Join Date
    Oct 2002
    Location
    Canada QC
    Posts
    454
    Quote Originally Posted by dreamscape
    For that matter what single server can serve 20,000 concurrent static HTML files?
    Only Apache can't (LSWS 2 Pro achieved 37k requests/sec with 1000 concurrent connections)

    Speed & scalability in mind...
    If you find my reply helpful, feel free to give me a point

  18. #68
    SitePoint Addict
    Join Date
    Apr 2005
    Posts
    274
    You should never really have to worry about extremely large data sets in your OLTP, because you should never really have extremely large data sets in your OLTP. Your OLTP should really only contain what is necessary for the online system.
    What's necessary for this online system is relations between users. On average there are 190 relations per user and 380,000 users; that's over 70 million database entries. And every last one of them is needed RIGHT NOW.

    I thought that Edman started this thread because he needed to optimise AN EXISTING WEBSITE - not write a new one?? It seems to have been hijacked and wandered into all sorts of odd arenas.
    The joys of having the link on the forums home.

    Only Apache can't (LSWS 2 Pro achieved 37k requests/sec with 1000 concurrent connections)
    Yes, of course, getting benchmarks from a site that markets its own webserver software is the best way to go as far as objectivity goes.

  19. #69
    SitePoint Wizard dreamscape's Avatar
    Join Date
    Aug 2005
    Posts
    1,080
    >> 380,000 users, that's over 70 million database entries

    How does 380,000 users automatically translate to over 70 million rows? Why does each user need 190 rows? Maybe you need to re-think your database design if that is the case.

  20. #70
    SitePoint Evangelist Daijoubu's Avatar
    Join Date
    Oct 2002
    Location
    Canada QC
    Posts
    454
    Quote Originally Posted by Edman
    Yes, of course, getting benchmarks from a site that markets its own webserver software is the best way to go as far as objectivity goes.
    All benchmarks have to be taken with a grain of salt :P
    Don't get me wrong, Apache is a full-featured, compliant httpd, but a preforking server really doesn't compare to a single-threaded server when it comes to scalability.

    Here are more benchmarks, again, to be taken with a grain of salt
    http://weblog.textdrive.com/article/...dot-lighttpdly
    http://salinan.memoryhole.net/autobench/
    http://lighttpd.net/benchmark/
    Speed & scalability in mind...
    If you find my reply helpful, feel free to give me a point

  21. #71
    SitePoint Addict mx2k's Avatar
    Join Date
    Jan 2005
    Posts
    256
    Quote Originally Posted by dreamscape
    >> if your app can only handle 100 users and the other one can handle 20,000

    xx users what? At one time? The max total (not necessarily at one time)?

    If max total, then the one was coded with a hard limit of 100 users, probably at the developer's choice, most likely for different licensing tiers.

    What PHP app can handle 20,000 concurrent users on a single server? What single server can even parse 20,000 concurrent PHP requests? For that matter what single server can serve 20,000 concurrent static HTML files?

    At least give realistic examples for god's sake.

    If you ever find your current app server cannot handle all the needed requests, add another one. PHP doesn't try to replace the web server (Apache) or database server. It is designed to work well with them. Because Apache scales well, because most database servers scale well, and because each PHP transaction is stateless, PHP "naturally" scales well and doesn't care if the user jumps from server A to B to Q to A again. Finding the correct session might be a problem, but storing them in the database solves this issue.
    You act as if everyone is in a position to throw in a server here and there... In a lot of circumstances that may be out of the question. Again, just because the language may be able to scale does not mean all that open source code out there is ready to scale. That's naive, too hopeful, and too bold a statement.

    Many PHP applications are not even set up to be broken into tiers, except maybe the data and everything else (MySQL & PHP). (Don't get me wrong, there are a number of applications designed by people in this forum in particular that could do so, but that's not the majority of widely used open source PHP apps like phpBB.)

    Not to mention that you have to do what you can to work within your client's budget.

    (The exaggeration is always given for dramatic emphasis, and I apologize if it stirred up any hard feelings or whatnot.)

  22. #72
    SitePoint Addict
    Join Date
    Apr 2005
    Posts
    274
    >> 380,000 users, that's over 70 million database entries

    How does 380,000 users automatically translate to over 70 million rows? Why does each user need 190 rows? Maybe you need to re-think your database design if that is the case.
    There is absolutely no real way to do that. Each user has relationships with other users. All relationships need to be stored. There just are some 70 million of them. I can cut that in half by storing only relationships on one person's part, but that's still 35 million, it translates into 2 queries instead of just 1, and the point still stands.

    The table is simple. It's like

    Code:
    ----------------------------
    |  UserID    |  FriendID   |
    ----------------------------
    |      1     |      23     |
    |      234   |      1231   |
    |     ....   |      ....   |
    ----------------------------
    I can't possibly imagine a way to optimise that, especially considering that some users have a thousand relationships.

    I could, of course, get rid of the link table completely and store friend IDs in a single cell in the users table, but then I'm not sure how the other stuff, like finding friends' friends and friends online, is going to work. Well, maybe it can work like that. I'm not sure, I'll have to look through it.
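    For comparison, here is a minimal sketch of what the link table makes possible: a friends-of-friends lookup written as a plain self-join, which would be awkward if the IDs were packed into a single cell. The table name `friends`, the index name and the sample rows are assumptions for illustration (only the `UserID` and `FriendID` columns come from the table above), each relationship is stored in both directions, and in-memory SQLite is used purely so the example runs on its own.

```php
<?php
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// The composite primary key serves lookups by UserID;
// the extra index serves lookups by FriendID.
$db->exec('CREATE TABLE friends (
    UserID   INTEGER NOT NULL,
    FriendID INTEGER NOT NULL,
    PRIMARY KEY (UserID, FriendID)
)');
$db->exec('CREATE INDEX idx_friends_friend ON friends (FriendID)');
// Made-up sample data: 1-2, 2-3, 2-4 and 1-4 are friends, stored both ways.
$db->exec('INSERT INTO friends VALUES (1,2),(2,1),(2,3),(3,2),(2,4),(4,2),(1,4),(4,1)');

// Friends of my friends, excluding myself and people I am already friends with.
$sql = 'SELECT DISTINCT f2.FriendID
        FROM friends f1
        JOIN friends f2 ON f2.UserID = f1.FriendID
        WHERE f1.UserID = ?
          AND f2.FriendID <> ?
          AND f2.FriendID NOT IN (SELECT FriendID FROM friends WHERE UserID = ?)
        ORDER BY f2.FriendID';
$me = 1;
$stmt = $db->prepare($sql);
$stmt->execute([$me, $me, $me]);
$friendsOfFriends = array_map('intval', $stmt->fetchAll(PDO::FETCH_COLUMN));
```

    With the sample data, user 1 is friends with 2 and 4, so the only friend-of-a-friend left over is user 3.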

  23. #73
    SitePoint Evangelist
    Join Date
    Mar 2005
    Posts
    421
    Quote Originally Posted by Roger Ramjet
    QUERIES INSIDE LOOPS ARE WRONG - ALWAYS AND EVERY TIME.
    I've read quite a few different people say this, so I was wondering if the following qualifies as a query inside a loop.

    Say I have a class, clsPictureFinder, that has a static method called GetAllPics which returns an array containing all the picture database IDs in a certain table. I defer the creation of an individual Picture object until I'm actually looping through them, by passing the picture ID to the constructor, which in turn triggers another query to populate the object's properties. Is that a query inside a loop? E.g.:
    PHP Code:
    // One query fetches every picture ID.
    $arrPicIDs = clsPictureFinder::GetAllPics();
    foreach ($arrPicIDs as $value) {
        // The constructor runs another query to populate this object's properties.
        $objPicture = new clsPicture($value);
        echo $objPicture->getFilename() . '<br />';
        echo $objPicture->getOwner() . '<br />';
        echo $objPicture->showThumbnail();
        echo '<hr />';
        unset($objPicture);
    }
    Is this bad practice?

  24. #74
    eschew sesquipedalians silver trophy sweatje's Avatar
    Join Date
    Jun 2003
    Location
    Iowa, USA
    Posts
    3,749
    Quote Originally Posted by skinny monkey
    Is this bad practice?
    No, the bad practice would be:
    PHP Code:
    foreach ($pic->getCategories() as $cat) {
        $pic->getPicsByCat($cat); // query goes on in here
        //...
    }

    If you are going to hit the db for all categories anyway, restructure your code to only hit the db one time and use the info as appropriate when you need it.
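    One way that restructuring could look, as a minimal sketch: fetch every row once, group the rows by category in PHP, and let later code read from the array. The `pics` table, its columns and the sample rows are hypothetical, and in-memory SQLite is used only so the example is self-contained.

```php
<?php
// Hypothetical table for illustration: pics(id, category, filename).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE pics (id INTEGER PRIMARY KEY, category TEXT, filename TEXT)');
$db->exec("INSERT INTO pics VALUES (1, 'cats', 'a.jpg'), (2, 'cats', 'b.jpg'), (3, 'dogs', 'c.jpg')");

// One round trip: fetch every row once, then group the rows by category in PHP.
$picsByCategory = [];
$rows = $db->query('SELECT category, filename FROM pics ORDER BY category, id');
foreach ($rows as $row) {
    $picsByCategory[$row['category']][] = $row['filename'];
}

// Later code reads from the array instead of running a query per category.
foreach ($picsByCategory as $category => $filenames) {
    echo $category . ': ' . implode(', ', $filenames) . "\n";
}
```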

    As with all advice, season to taste. I have an application where the inner queries run quickly, and the code was easier to write with 13 queries as opposed to one query with fancy caching and looping, so I wrote it with the multiple queries in a loop structure. I pumped the code out quicker and the users are fine with the application performance.

    The important thing is to be aware of the impact of your design choices on the application and the server.
    Jason Sweat ZCE - jsweat_php@yahoo.com
    Book: PHP Patterns
    Good Stuff: SimpleTest PHPUnit FireFox ADOdb YUI
    Detestable (adjective): software that isn't testable.

  25. #75
    AdSpeed.com Son Nguyen's Avatar
    Join Date
    Aug 2000
    Location
    Silicon Valley
    Posts
    2,241
    Design the application so that when one powerful server is not enough, it can be operated on multiple servers.

    An example: queries should follow the format db_name.table_name for replication of individual databases.
    - Son Nguyen
    AdSpeed.com - Ad Serving and Ad Management Made Easy

