Results 26 to 37 of 37
  1. #26
    ♪♪ ♪ ♪ ♪ ♪♪ ♪ ♪♪ Markdidj's Avatar
    Join Date
    Sep 2002
    Location
    Bournemouth, South UK
    Posts
    1,551
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    I read this thread a couple of days ago and would like to ask something. It is very relevant.

    My site is live: if content changes, it updates the user's current view, with a check made every 10 seconds. I created a page that gets the RSS from the BBC feeds and embeds it into the current page. Rather than the script reading the BBC's RSS on the initial hit and on every subsequent "live" hit for each user, I check it once and save the HTML into its own file on the server. Subsequent hits then check when this was last done: if less than 10 seconds ago, use the saved HTML; otherwise go to the BBC and see if the feed has been updated. This makes sure that if my site gets busy, the most requests it will ever send to the BBC is one every 10 seconds. It also improves the execution time for that particular script.

    I was thinking of doing the same on my website. When I do a query on a database to build a public page, instead of just outputting it to the client I'd save it first as an HTML file on the server. All requests in the next 10 seconds use just that file, either with includes or by reading it as XML and outputting it to the screen. The first request after those 10 seconds repeats the process.
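The pattern described above can be sketched generically (Python here rather than the classic ASP used elsewhere in this thread; the cache file name and TTL are illustrative):

```python
import os
import time
import urllib.request

CACHE_FILE = "bbc_feed_cache.html"  # hypothetical cache path
CACHE_TTL = 10                      # seconds between upstream checks

def get_feed_html(feed_url):
    """Return cached HTML if it is fresh, otherwise refetch and rewrite the cache."""
    if os.path.exists(CACHE_FILE):
        age = time.time() - os.path.getmtime(CACHE_FILE)
        if age < CACHE_TTL:
            # Fresh enough: serve the saved copy, no upstream request at all
            with open(CACHE_FILE, encoding="utf-8") as f:
                return f.read()
    # Cache is missing or stale: fetch once and save for the next 10 seconds
    with urllib.request.urlopen(feed_url) as resp:
        html = resp.read().decode("utf-8")
    with open(CACHE_FILE, "w", encoding="utf-8") as f:
        f.write(html)
    return html
```

Whatever the language, the shape is the same: the timestamp check bounds upstream traffic to at most one request per TTL window, no matter how many visitors hit the page.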

    A theory is: if someone is going to scrape your content, they will find a way of doing it, so reducing the time and cost of giving them that data may be a good defense.

    Would something like that help?
    LiveScript: Putting the "Live" Back into JavaScript
    if live output_as_javascript else output_as_html end if

  2. #27
    Community Advisor

    Join Date
    Nov 2006
    Location
    UK
    Posts
    2,554
    Mentioned
    40 Post(s)
    Tagged
    1 Thread(s)
    Caching is a good idea wherever there are performance issues due to high traffic. You might also consider compressing the output to reduce bandwidth use. On a really busy site you'd want to cache to memory rather than disk, as it's a lot faster.

  3. #28
    ♪♪ ♪ ♪ ♪ ♪♪ ♪ ♪♪ Markdidj's Avatar
    Join Date
    Sep 2002
    Location
    Bournemouth, South UK
    Posts
    1,551
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Thanks EastCoast. How do I go about compressing output from a server so it is still readable by a browser? I'm new to this part of webdev.
    LiveScript: Putting the "Live" Back into JavaScript
    if live output_as_javascript else output_as_html end if

  4. #29
    Non-Member
    Join Date
    Oct 2007
    Location
    United Kingdom
    Posts
    622
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    A few things you can do:

    • Remove whitespace formatting from the HTML
    • Make sure images are as small as they can be without compromising quality too much
    • Use sprites where they help
    • Avoid inline CSS and JavaScript
    • Avoid table layouts and excessive divs
    • Try to keep CSS and JavaScript files short: remove unnecessary code and formatting whitespace
    • AJAX can reduce page requests, but be careful about its effects on accessibility.
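The first item on that list can be automated at output time. A naive sketch (a hypothetical helper; note that a regex approach like this will mangle `<pre>` and `<textarea>` content, so real minifiers are more careful):

```python
import re

def strip_html_whitespace(html):
    """Collapse runs of whitespace and remove whitespace between tags.
    Naive: unsafe for whitespace-sensitive elements like <pre>."""
    html = re.sub(r">\s+<", "><", html)   # drop whitespace between adjacent tags
    html = re.sub(r"\s{2,}", " ", html)   # collapse remaining runs to one space
    return html.strip()
```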

  5. #30
    ♪♪ ♪ ♪ ♪ ♪♪ ♪ ♪♪ Markdidj's Avatar
    Join Date
    Sep 2002
    Location
    Bournemouth, South UK
    Posts
    1,551
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by ro0bear View Post
    A few things you can do:

    • Remove whitespace formatting from the HTML
    • Make sure images are as small as they can be without compromising quality too much
    • Use sprites where they help
    • Avoid inline CSS and JavaScript
    • Avoid table layouts and excessive divs
    • Try to keep CSS and JavaScript files short: remove unnecessary code and formatting whitespace
    • AJAX can reduce page requests, but be careful about its effects on accessibility.
    Nice. I've been doing most of that already by outputting my pages with JavaScript instead of HTML, and designing my site with mobile compatibility as a high priority. I'm heading in the right direction.
    I thought it might have meant outputting in binary or something like that, which puzzled me a bit.

    The good thing about live server-coded JavaScript is that variables and functions can be documented in comments in the server-side code. Here's my server-side JavaScript for cookies:
    Code:
    '---- Cookies ----'
    
    's=cookie name
    't=default value
    
    response.write "function getCookie(s,t){"
     response.write "a=document.cookie.split("";"");"
     response.write "for(i=0;i<a.length;i++){"
      response.write "b=a[i].replace("" "","""").split(""="");"
      response.write "if((b.length==2)&&(b[0]==s)) return unescape(b[1]);"
     response.write "}; return t;"
    response.write "};"
    (the first example I could find). So although I might get confused writing it like that directly in JavaScript, I don't so much when I use live script.
    LiveScript: Putting the "Live" Back into JavaScript
    if live output_as_javascript else output_as_html end if

  6. #31
    Life is not a malfunction
    TechnoBear's Avatar
    Join Date
    Jun 2011
    Location
    Argyll, Scotland
    Posts
    6,352
    Mentioned
    268 Post(s)
    Tagged
    5 Thread(s)
    Quote Originally Posted by Markdidj View Post
    Thanks EastCoast. How do I go about compressing output from a server so it is still readable by a browser? I'm new to this part of webdev.
    If you're using Apache, you can add something like this to your .htaccess file:
    Code:
    <FilesMatch "\.(js|css|html)$">
    SetOutputFilter DEFLATE
    </FilesMatch>
    which will compress files of the types specified. See http://httpd.apache.org/docs/2.2/mod/mod_deflate.html

  7. #32
    Foozle Reducer ServerStorm's Avatar
    Join Date
    Feb 2005
    Location
    Burlington, Canada
    Posts
    2,699
    Mentioned
    89 Post(s)
    Tagged
    6 Thread(s)
    Quote Originally Posted by Markdidj View Post
    Thanks EastCoast. How do I go about compressing output from a server so it is still readable by a browser? I'm new to this part of webdev.
    Hi,

    This is normally done at the web server level; for example, if using Apache, then mod_cache is configured. Alternatively, if using PHP, you can use output buffering to perform simple caching, or there are a number of PHP accelerators that perform caching as well as compression; you can find a lot of info via Google, but here is a Wikipedia article. Opcode caches eliminate many inefficiencies during the execution phase on the server. You can also cache variables.

    If you have a site where content doesn't change regularly then it likely is a candidate for file caching. You can look at APC or one of PEAR caching libraries.

    For memory caching you might look at memcached, whose claim to fame is (from its home page):
    Free & open source, high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.
    Steve
    ictus==""

  8. #33
    ♪♪ ♪ ♪ ♪ ♪♪ ♪ ♪♪ Markdidj's Avatar
    Join Date
    Sep 2002
    Location
    Bournemouth, South UK
    Posts
    1,551
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    MMM thanks. That leads me on to another thing I've been puzzling over as well. My live scripts that I allow to cache because they only contain reused functions seem to work a lot faster from the cache. Does the browser compile javascript before caching it?

    Also, I read somewhere recently that files sent from the server can have a last modified date in the header and it's possible to get the browser to compare the last modified date of a file on the server to the one in the cache. How would I go about implementing that, where the file is only got if the last modified header date is newer?
    LiveScript: Putting the "Live" Back into JavaScript
    if live output_as_javascript else output_as_html end if

  9. #34
    Foozle Reducer ServerStorm's Avatar
    Join Date
    Feb 2005
    Location
    Burlington, Canada
    Posts
    2,699
    Mentioned
    89 Post(s)
    Tagged
    6 Thread(s)
    Quote Originally Posted by Markdidj View Post
    MMM thanks. That leads me on to another thing I've been puzzling over as well. My live scripts that I allow to cache because they only contain reused functions seem to work a lot faster from the cache. Does the browser compile javascript before caching it?

    Also, I read somewhere recently that files sent from the server can have a last modified date in the header and it's possible to get the browser to compare the last modified date of a file on the server to the one in the cache. How would I go about implementing that, where the file is only got if the last modified header date is newer?
    Hi

    Search on 'output buffering caching'.
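    On the Last-Modified question above: browsers do this automatically for static files; for dynamically generated output you emit the headers yourself. A minimal sketch of the server-side logic (Python, purely illustrative; the function name is made up):

```python
import os
from email.utils import formatdate, parsedate_to_datetime

def conditional_get(path, if_modified_since=None):
    """Serve a file honouring the If-Modified-Since request header.

    Returns (status, headers, body). The browser stores the Last-Modified
    date with its cached copy and sends it back on the next request; if the
    file has not changed since then, the server answers 304 Not Modified
    with an empty body and the browser reuses its cached copy."""
    mtime = int(os.path.getmtime(path))
    last_modified = formatdate(mtime, usegmt=True)
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since).timestamp()
        if mtime <= since:
            return 304, {"Last-Modified": last_modified}, b""
    with open(path, "rb") as f:
        return 200, {"Last-Modified": last_modified}, f.read()
```

    The 304 response is what makes the cache cheap: the file body is only transferred when it has actually changed.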
    ictus==""

  10. #35
    SitePoint Member Igal Zeifman's Avatar
    Join Date
    May 2012
    Posts
    7
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hi

    Regarding your initial question.

    First of all, in reference to your bandwidth question, I wanted to point out a study released by our company <snip>removed advertising</snip> just a few months ago. It analysed traffic data from several thousand websites and showed that 50% or more of traffic is bot-generated (80% for smaller sites). On average, 31% of those visits were made by malicious intruders (spammers, scrapers, etc.).

    <snip>removed advertising</snip>

    Finally, as suggested, limiting access from irrelevant geo-locations may sound like a good idea, but before doing so you should know that legitimate bots may sometimes use "weird" IPs - for example, Googlebot can originate from China, and I'm sure you don't want to block that...
    Reference: http://productforums.google.com/foru...rs/rEZQskC884s

    So I would not recommend setting any non-specific rules, at least not before checking the IP ranges of the most important bots out there.
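    Google's documented way to check this is a reverse-DNS lookup followed by a forward confirmation, rather than a fixed IP list. A sketch (Python; the function names are made up):

```python
import socket

def hostname_is_google(hostname):
    """Genuine Googlebot hosts resolve to these documented domains."""
    return hostname.endswith(".googlebot.com") or hostname.endswith(".google.com")

def verify_googlebot(ip):
    """Reverse-DNS the IP, check the domain, then forward-resolve the name
    to confirm it maps back to the same IP (guards against spoofed PTR records)."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)
    except (socket.herror, socket.gaierror):
        return False
    if not hostname_is_google(hostname):
        return False
    return ip in socket.gethostbyname_ex(hostname)[2]
```

    The suffix check alone is not enough - the forward lookup is what stops a scraper from simply publishing a fake reverse-DNS record.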

    Hope this helps.
    Last edited by TechnoBear; Aug 15, 2012 at 08:02. Reason: removed advertising and promotion links

  11. #36
    SitePoint Wizard
    Join Date
    Oct 2005
    Posts
    1,849
    Mentioned
    5 Post(s)
    Tagged
    1 Thread(s)
    Just thought I would mention that even though I have an entry in robots.txt to ban that Brandwatch magpie-crawler bot (and have for a long time) that bot will not go away. I have an entry in htaccess to give the bot 403 Forbidden codes. I've been doing that for a couple years now and it still will not go away. It's making upwards of 8 page requests per second at times. That has to be one of the worst bots I've come across.
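    For reference, the kind of rule being described looks like this in an Apache 2.2 .htaccess (a sketch; it assumes the bot's User-Agent string contains "magpie-crawler"):

```apache
# Tag requests whose User-Agent mentions the bot, then refuse them
BrowserMatchNoCase magpie-crawler bad_bot
Order Allow,Deny
Allow from all
Deny from env=bad_bot
```

    Of course, as the post shows, a 403 doesn't stop a badly behaved bot from asking; it only keeps each refused response cheap to serve.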

  12. #37
    Community Advisor

    Join Date
    Nov 2006
    Location
    UK
    Posts
    2,554
    Mentioned
    40 Post(s)
    Tagged
    1 Thread(s)
    Brandwatch are on Twitter; it might be an idea to give them some constructive criticism that's publicly visible on there.

