  1. #1
    SitePoint Addict
    Join Date
    May 2008
    Location
    Missouri, USA
    Posts
    273

    Web Services Bottlenecks

    I'm working on my first web services based project. I'm having difficulty understanding one concept and was hoping someone could point me in the right direction to understand it better.

    I'm using web services to build an application that, in a live environment, would have to handle thousands of user requests a minute. For me this is a big undertaking, and I'm not sure where to begin when it comes to performance. From my understanding of web services, there is going to be a large bottleneck at the service endpoints. A very simple example would be a polling feature. Say you have thousands of users voting in a poll simultaneously. When this happens, there will be a bottleneck where the client interacts with the web service (the endpoint) and where the service interacts with the database (the data connection). How do you typically go about distributing the workload of high-traffic web services in order to avoid a bottleneck?
    Follow Me On Twitter: BryceRay

  2. #2
    SitePoint Wizard
    Join Date
    Dec 2003
    Location
    USA
    Posts
    2,582
    You have lots of options, and what you do depends on available funds, software, etc.

    The easiest (and cheapest) method is to make sure you have really good code. Two big areas can make a lot of difference:
    - Minimize the number of queries you use. NEVER put a query inside a loop. Instead, work out a joined or compound query and process its results in a loop. This is a huge performance boost and resource saver.
    - Avoid unnecessary nested loops. Instead of nesting a loop, see if you can restructure the work so that a single loop is enough.
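
To make the first point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `polls` and `votes` tables and both function names are made up for illustration. The first function shows the query-in-a-loop pattern to avoid, and the second gets the same answer from one joined query.

```python
import sqlite3

def make_demo_db():
    """Build a small in-memory database with a hypothetical poll schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE polls (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE votes (id INTEGER PRIMARY KEY, poll_id INTEGER, choice TEXT);
        INSERT INTO polls (id, title) VALUES (1, 'Favorite editor'), (2, 'Favorite OS');
        INSERT INTO votes (poll_id, choice) VALUES
            (1, 'vim'), (1, 'emacs'), (1, 'vim'), (2, 'linux');
    """)
    return conn

def vote_counts_query_in_loop(conn):
    """Pattern to avoid: one extra query per poll (1 + N queries in total)."""
    counts = {}
    polls = conn.execute("SELECT id, title FROM polls").fetchall()
    for poll_id, title in polls:
        row = conn.execute(
            "SELECT COUNT(*) FROM votes WHERE poll_id = ?", (poll_id,)
        ).fetchone()
        counts[title] = row[0]
    return counts

def vote_counts_joined(conn):
    """Preferred pattern: a single joined query, then process the rows."""
    rows = conn.execute("""
        SELECT p.title, COUNT(v.id)
        FROM polls AS p
        LEFT JOIN votes AS v ON v.poll_id = p.id
        GROUP BY p.id, p.title
    """).fetchall()
    return dict(rows)
```

Both functions return the same dictionary. The difference is that the joined version makes one round trip to the database no matter how many polls exist.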

    We could write a whole book on ways to improve your code, but those are probably the biggest two.

    If you have really good code and you still have a bottleneck, you can try:
    - Upgrading your server. If you're on a shared host, get a dedicated server. If you're already on a dedicated server, get a better one.
    - Spreading your resources across multiple servers. You can move all images to one server, all static files to another, and all dynamic files to another (or more than one). This spreads the load across multiple machines.
    - Making use of a queue. If a request takes lots of resources, consider adding it to a queue on your server, processing the queue as you go along, and returning the results to users one at a time.
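
As a rough illustration of the queue idea, here is a sketch using only Python's standard library. The function name `process_votes` and the in-memory `tallies` dictionary (standing in for the database) are assumptions for the example. Votes are accepted quickly into a queue, and a single worker applies them one at a time, so the slow resource only ever sees one writer.

```python
import queue
import threading

def process_votes(choices):
    """Enqueue each vote and let one background worker apply them in order."""
    vote_queue = queue.Queue()
    tallies = {}  # stand-in for the database table holding the counts

    def worker():
        # Drain the queue one item at a time until the stop sentinel arrives.
        while True:
            choice = vote_queue.get()
            if choice is None:
                return
            tallies[choice] = tallies.get(choice, 0) + 1

    thread = threading.Thread(target=worker)
    thread.start()

    # The "request handlers": putting an item on the queue is cheap and
    # does not wait for the slow work to finish.
    for choice in choices:
        vote_queue.put(choice)

    vote_queue.put(None)  # sentinel telling the worker to stop
    thread.join()
    return tallies
```

In a real service the worker would run for the life of the process rather than per call. The per-call thread here just keeps the sketch self-contained.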

    There are more tips, but these are probably the best three that I can think of.

    If you try all of these and there is still a substantial bottleneck, you probably are trying to do something that isn't really possible...

  3. #3
    SitePoint Addict
    Join Date
    May 2008
    Location
    Missouri, USA
    Posts
    273
    That all seems like sound advice. I had been considering implementing a queue so that the data connection did not become as much of a bottleneck. Good programming should take me a long way.

  4. #4
    SitePoint Wizard
    Join Date
    Dec 2003
    Location
    USA
    Posts
    2,582
    Oh, I forgot one more great tip:
    Make good use of SitePoint.com. =p

    If you come across something small that seems to be running slower than you think it should, ask on SitePoint. Sometimes there are goofy little things that you overlook but that can cause huge performance hits.

  5. #5
    SitePoint Guru glenngould
    Join Date
    Nov 2005
    Posts
    661
    - Make use of a queue: if it is something that takes lots of resources, consider adding them to a queue on your server and process them as you go along and then return the results to them one at a time.
    What are the possible options for queuing requests?

    I would also like to read about database connections (MySQL preferably) on high-traffic websites: for example, what happens when there are too many queries to process at once. Any resources you can suggest would be appreciated.
    Tweep List adds an avatar menu to Twitter (open source)
    Word Stats shows your most used words on Twitter

  6. #6
    SitePoint Wizard
    Join Date
    Dec 2003
    Location
    USA
    Posts
    2,582
    One common example of a queue is download websites (such as Filefront), which place you into a queue. There are a number of ways to go about it, and it depends on the server you're running.

    The simpler methods for a general queue would be:
    - Add requests to the database
    - Either set up a cron job that pushes out updates, or have JavaScript refresh the page every so often. Once a user's request gets the okay, they are given whatever they asked for.
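
The two steps above could be sketched roughly like this, again with Python's sqlite3 module. The `request_queue` table, its columns, and the three function names are hypothetical. `process_batch` is what the cron job would run, and `check_status` is what the refreshing page would call.

```python
import sqlite3

def create_queue_db():
    """Create an in-memory database with a hypothetical queue table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE request_queue (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            payload TEXT NOT NULL,
            status TEXT NOT NULL DEFAULT 'pending',
            result TEXT
        )
    """)
    return conn

def enqueue(conn, payload):
    """Step 1: add the request to the database and hand back a ticket id."""
    cur = conn.execute(
        "INSERT INTO request_queue (payload) VALUES (?)", (payload,)
    )
    conn.commit()
    return cur.lastrowid

def process_batch(conn, limit=10):
    """Step 2 (cron job): work through the oldest pending requests."""
    rows = conn.execute(
        "SELECT id, payload FROM request_queue "
        "WHERE status = 'pending' ORDER BY id LIMIT ?",
        (limit,),
    ).fetchall()
    for request_id, payload in rows:
        result = payload.upper()  # placeholder for the expensive work
        conn.execute(
            "UPDATE request_queue SET status = 'done', result = ? WHERE id = ?",
            (result, request_id),
        )
    conn.commit()
    return len(rows)

def check_status(conn, request_id):
    """Polled by the refreshing page: returns (status, result) or None."""
    return conn.execute(
        "SELECT status, result FROM request_queue WHERE id = ?", (request_id,)
    ).fetchone()
```

The `limit` argument caps how much work each cron run takes on, which is what keeps a burst of requests from overwhelming the server.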

    When you have too many database connections, you will likely get a "too many connections" error. One thing you can do (and that is done at many larger sites) is to have the database constantly mirrored across multiple servers, which then communicate with each other. You have to be careful with this, though, because depending on your task, unsynced data could be problematic (and the syncing may require more resources than keeping it all together).

    I'll see if I can find some more details for you. (I was actually just about to post a new blog entry on this very topic at my site, htmlblox.com. I'll let you know when I'm done and post some excerpts from it.)

  7. #7
    SitePoint Wizard
    Join Date
    Dec 2003
    Location
    USA
    Posts
    2,582
    I added the new entry. There isn't a whole lot of additional new information, but it does expand on each solution in a bit more detail.

    Also, if you are specifically interested in handling more queries at one time than a normal machine could handle, check out MySQL Cluster.

    The entry can be read here: http://htmlblox.com/blog/html/preven...r-bottlenecks/

  8. #8
    SitePoint Wizard cranial-bore
    Join Date
    Jan 2002
    Location
    Australia
    Posts
    2,634
    This website - High Scalability - may have some useful reading about scaling

    From what I gather, the database is usually the first and most significant bottleneck for high traffic. You can reduce DB load with memcache, which can cache just about any type of data, including rendered HTML chunks, in the memory of the web server to avoid re-querying.
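
The caching pattern looks roughly like the sketch below. This is not memcache itself: `TTLCache` is a made-up, in-process stand-in that only shows the "check the cache first, query only on a miss" flow, with entries expiring after a fixed time.

```python
import time

class TTLCache:
    """Tiny in-process cache with expiry; a stand-in for a real cache server."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}    # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value, calling compute() only on a miss or expiry."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = compute()
        self._store[key] = (now + self.ttl, value)
        return value
```

A page handler would wrap its expensive query or HTML rendering in `compute`, so that repeated requests within the expiry window never touch the database.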

    If your application is read-heavy (a high ratio of SELECTs to UPDATEs and INSERTs), then database replication is a common choice. All write operations are sent to a master server, and slaves replicate the same data. The read operations are distributed amongst the slaves to spread the load.
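
Here is a sketch of how an application might route queries under that setup. `ReplicatedRouter` is a hypothetical helper, the server names are placeholders, and deciding by the first SQL keyword is a deliberate simplification. It does not account for replication lag or for reads that must see a just-written row.

```python
import itertools

class ReplicatedRouter:
    """Send writes to the master and spread reads across the slaves."""

    def __init__(self, master, replicas):
        self.master = master
        # Round-robin over the slaves; fall back to the master if there are none.
        self._replicas = itertools.cycle(replicas or [master])

    def connection_for(self, sql):
        """Pick a server for this statement based on its first keyword."""
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb == "SELECT":
            return next(self._replicas)
        return self.master
```

In real code the strings would be connection objects, but the routing decision stays the same.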

    As mentioned, offloading static files to a CDN can free up server resources and make the site more responsive (through lower latency) for your users too.

    I think a key thing to realise when planning to scale is that you have to design the solution around your app. You'll need to profile your code and consider the nature of your traffic and how your code and other systems (DB) are used to come up with a good solution.

  9. #9
    SitePoint Member
    Join Date
    Nov 2009
    Posts
    2
    Hi Bar

    I agree with you. Even with all my experience, I recently managed a forum site (I won't mention the URL) and it was really tough, since I was receiving a query every second. The best way to make this achievable is to bring in another site admin, since it can't be done by one person alone.

    Challenging, right? It's kind of a race without a finish line, since people keep joining and posting all the time.

  10. #10
    SitePoint Co-founder Matt Mickiewicz
    Join Date
    Jul 1999
    Location
    Vancouver, Canada
    Posts
    2,384
    Try a Content Delivery Network such as NetDNA, Akamai, Limelight, or Amazon's CloudFront.
    Matt Mickiewicz - Co-Founder
    SitePoint.com - Empowering Web Developers Since 1997
    Follow me on Twitter.

  11. #11
    SitePoint Author wwb_99
    Join Date
    May 2003
    Location
    Washington, DC
    Posts
    10,653
    Personally, I'd get something running and get an idea of the load I could handle before worrying about optimization and performance issues. Trying to solve problems you don't have tends to beget more problems rather than solve anything.

    I'd also note most folks are shocked about how much throughput one can get on a web/db app with reasonable hardware. More often than not, you run out of bandwidth before you run out of CPU or disk I/O.
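
Getting a first idea of the load can be as simple as timing a handler in a loop. The sketch below is only a rough single-threaded measurement. `measure_throughput` and `fake_vote_handler` are made-up names, and a real test would send requests to the running service over the network.

```python
import time

def measure_throughput(handler, requests):
    """Call handler once per request and return requests handled per second."""
    start = time.perf_counter()
    for request in requests:
        handler(request)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse timers.
    return len(requests) / elapsed if elapsed > 0 else float("inf")

def fake_vote_handler(choice):
    """Placeholder for the real endpoint logic."""
    return {"choice": choice, "ok": True}
```

Calling `measure_throughput(fake_vote_handler, ["yes"] * 10000)` gives a requests-per-second figure to compare against the traffic you actually expect.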

  12. #12
    SitePoint Enthusiast
    Join Date
    Dec 2003
    Location
    norway
    Posts
    61
    If you use MySQL, remember to use InnoDB tables, since they use row-level locking and so avoid the table locking that MyISAM tables do when updating etc. There is an article here on SitePoint on that topic, as I remember.

  13. #13
    SitePoint Addict
    Join Date
    May 2008
    Location
    Missouri, USA
    Posts
    273
    There's a lot of advice here, and it will take me a while to absorb most of it, but I have one quick technical question. In its current state the entire project is based on Microsoft technologies (WCF service, ASP.NET client application, SQL Server). Everyone here is mentioning MySQL. In the past I've always used MySQL, but with this project I decided to give MS SQL Server a try. My reasoning was that I thought SQL Server would integrate more easily. However, I may be inclined to switch if MySQL proves to be a faster solution. What does your experience tell you?

  14. #14
    SitePoint Author wwb_99
    Join Date
    May 2003
    Location
    Washington, DC
    Posts
    10,653
    The reason you hear a lot of PHP/MySQL suggestions is that SitePoint Forums has a lot of PHP/MySQL developers. Anyhow, experience tells me that any of them, with proper design and infrastructure, can handle the load. WCF especially was designed to scale very easily using configuration alone. In this case, you should check out Asynchronous Clients, as that will take care of 95% of your problems. This is a firehose-style write scenario: users don't need poll results (or at least 100% accurate poll results), so you can fire off an async request to update and avoid any blocking issues.
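
The fire-and-forget idea is not specific to WCF. As a language-neutral illustration, here is a sketch in Python's asyncio rather than WCF's own asynchronous client API. The names `record_vote`, `handle_vote`, and `main` are made up, and the short sleep stands in for the slow service or database write.

```python
import asyncio

async def record_vote(choice, tallies):
    """The slow write; the sleep stands in for the service/database call."""
    await asyncio.sleep(0.01)
    tallies[choice] = tallies.get(choice, 0) + 1

async def handle_vote(choice, tallies, pending):
    """Start the write in the background and reply without waiting for it."""
    task = asyncio.create_task(record_vote(choice, tallies))
    pending.add(task)                        # keep a reference so the task is not lost
    task.add_done_callback(pending.discard)  # forget it once it has finished
    return "thanks for voting"

async def main(choices, tallies):
    pending = set()
    replies = []
    for choice in choices:
        replies.append(await handle_vote(choice, tallies, pending))
    # Only at shutdown do we wait for the outstanding writes to finish.
    await asyncio.gather(*list(pending))
    return replies
```

Every voter gets a reply right away, and the tallies catch up a moment later, which is fine when nobody needs 100% accurate results at the instant they vote.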

