Content Delivery Networks (CDN): Get to the Edge!
By Vishal Biyani
“A picture is worth a thousand words” is a phrase coined in the early 1920s, and it stayed relevant as late as the 2000s. But trust me, the world has moved on! Today the world is defined by digital content such as video. Record on your mobile device with its 8 MP camera, upload to a tube (any tube ;) ), and share! The growth of digital content is outpacing even Moore’s law, and in a very positive way. The technologies that handle this astronomical data growth are evolving, and their name is apt too: Big Data!
Wherever this data resides, it is important that it reaches the end user without a long wait. Whether you are watching a movie on Netflix or checking out high-definition images and videos of your next pair of shoes on a retailer’s website, you want it to be fast. You might say, “Well, it’s fast enough, and I have never faced any issues!” Guess what? Those sites are already at the edge, closer to you in the network. In short, they are using a Content Delivery Network (CDN).
How does a CDN work?
At a minimum, a CDN is a large system of servers distributed across geographies. It serves content to end users faster and keeps that content highly available. Let’s take a basic scenario to see how things work without and with a CDN. In a world without a CDN, you host all of your content at a single point. All traffic is directed at the same server, which makes it a single point of failure. For example, if your server is hosted in the USA, users in Australia make a round trip all the way to the USA to fetch the content. A few thousand users requesting the same content means thousands of such round trips, resulting in higher bandwidth consumption, lower speed for the end user, slow page loads, and jitter in streaming media.
Welcome to the CDN: you have servers distributed across the world. Content is replicated across those servers either on demand or according to predefined rules. When a user requests a file, an algorithm finds the closest available server to the user and directs the request to that server. Moreover, the server closest to you may already have cached a popular file because of many earlier requests, so it gets delivered to you much faster! That’s the promise of a CDN. In the previous example, if a user in Australia requests a video that is not yet available on the edge server in Australia, that edge server fetches a copy on demand from the server in the USA. All subsequent requests for the same file are served by the edge server in Australia, giving users a much smoother experience.
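To make the Australia example concrete, here is a minimal Python sketch of an edge server that pulls a file from the origin on the first request and serves it from its own cache afterwards. The class names (Origin, EdgeServer), the pick_edge helper, and the latency numbers are all made up for illustration. A real CDN typically picks the edge through DNS or network routing rather than a lookup table like this one.

```python
from dataclasses import dataclass, field


@dataclass
class Origin:
    """The single source server that holds every file."""
    files: dict
    fetches: int = 0  # counts round trips made to the origin

    def fetch(self, path):
        self.fetches += 1
        return self.files[path]


@dataclass
class EdgeServer:
    """An edge server that pulls content from the origin on first request."""
    region: str
    origin: Origin
    cache: dict = field(default_factory=dict)

    def get(self, path):
        if path not in self.cache:
            # Cache miss: fetch one copy from the origin on demand.
            self.cache[path] = self.origin.fetch(path)
        return self.cache[path]


def pick_edge(edges, user_region, distance):
    """Return the edge with the smallest distance to the user's region."""
    return min(edges, key=lambda e: distance[(user_region, e.region)])


origin = Origin(files={"/video.mp4": b"movie bytes"})
edges = [EdgeServer("us", origin), EdgeServer("au", origin)]
# Hypothetical latencies in milliseconds between user region and edge region.
distance = {("au", "au"): 20, ("au", "us"): 180,
            ("us", "us"): 15, ("us", "au"): 180}

edge = pick_edge(edges, "au", distance)
for _ in range(1000):
    edge.get("/video.mp4")

print(edge.region, origin.fetches)  # prints: au 1
```

A thousand requests from Australia result in a single round trip to the origin. The other 999 are answered from the Australian edge cache.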
Techniques used in CDN
Web caches are frequently used to hold content for a set duration. They are extremely useful when the same file is requested repeatedly by many users. Think of a new product launch, or a video going viral! Server load balancing is another technique, often applied across servers or web caches. It adds scalability and also increases reliability by removing the risk of a single point of failure. Load balancing is a big topic in itself and is best treated separately. On top of this, a variety of algorithms determine which server a file is delivered from. They choose a server based on factors such as the server’s past availability and the number of network hops needed to retrieve the file. Over time, these algorithms have grown to gather data on traffic flow and congestion and to actively report such events. Known congestion points can then be used to route traffic better.
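As a rough illustration of such a selection algorithm, the Python sketch below scores candidate servers on past availability, hop count, and whether the route crosses a known congestion point. The scoring formula, the weights, and the server names are assumptions chosen for this example, not the method of any particular CDN.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    availability: float      # fraction of past health checks that succeeded (0.0 to 1.0)
    hops: int                # network hops between the user and this server
    congested: bool = False  # True if the route crosses a known congestion point


def score(c, hop_weight=0.05, congestion_penalty=0.5):
    """Higher is better: reward past availability, penalise hops and congestion."""
    s = c.availability - hop_weight * c.hops
    if c.congested:
        s -= congestion_penalty
    return s


def choose_server(candidates):
    """Pick the candidate with the best score."""
    return max(candidates, key=score)


candidates = [
    Candidate("sydney", availability=0.99, hops=4, congested=True),
    Candidate("singapore", availability=0.97, hops=7),
    Candidate("virginia", availability=0.999, hops=15),
]

print(choose_server(candidates).name)  # prints: singapore
```

Sydney is the fewest hops away, but the congestion penalty pushes the request to Singapore. Once the congestion clears, Sydney scores highest again.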
There are certain factors to consider when evaluating a CDN for your website or content:
- Geography: Which geographies do you want to serve for your current and future user base? Does the CDN provider have servers in all of those geographies? If not, what are its plans for the near future?
- Cost or performance: The motive for implementing a CDN can be either to increase performance or to reduce the cost of existing bandwidth. While both objectives tend to align, there might be cases where you have to weigh one over the other. In terms of cost, look at options such as buying a fixed quota at a fixed price for a certain duration, or a purely month-to-month subscription to CDN as a service.
- Replication: What does your application need in terms of replication, and can the given service provider satisfy that within your budget?
Benefits of CDN
End users are the biggest beneficiaries: pages load faster, and big files such as media content load without “jitter”! Keeping multiple copies of data at different data centers distributed across the world prevents a single point of failure and backs up your data in the process. Even if one of the servers goes down, traffic can be routed to other nearby servers without affecting end users. An important but easily overlooked benefit of using a CDN is the conservation of bandwidth. Because requests are directed to servers closer to the user, traffic on the source server drops, which significantly reduces overall traffic and bandwidth costs.
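The bandwidth saving can be estimated with simple arithmetic. The Python sketch below assumes that the origin only serves requests that miss the edge caches, so its traffic shrinks in proportion to the cache hit ratio. The function name origin_traffic_gb and the monthly traffic figure are hypothetical, and real savings depend on how cacheable the content is.

```python
def origin_traffic_gb(total_gb, cache_hit_ratio):
    """Traffic that still reaches the origin when edges serve the cache hits."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return total_gb * (1.0 - cache_hit_ratio)


total = 10_000  # hypothetical monthly traffic in GB
for ratio in (0.0, 0.5, 0.9):
    served = origin_traffic_gb(total, ratio)
    print(f"hit ratio {ratio:.0%}: origin serves about {served:,.0f} GB")
```

With these made-up numbers, a 90% hit ratio leaves the origin serving roughly a tenth of the traffic it would handle on its own.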
Akamai is one of the biggest and oldest CDN providers. Amazon has a CDN offering, Amazon CloudFront, and Rackspace offers Cloud Files, which uses Akamai’s services. For starters, some CDN providers, such as CloudFlare and Incapsula, offer a free service up to a certain limit of data. Telecom providers have entered this space too, such as AT&T in the US and Bharti Airtel in India. One advantage telecom players have is that their last-mile connectivity and infrastructure investments are already in place.
In essence, content delivery networks are based on the end-to-end principle. On one hand, they improve the experience for end users; on the other, they provide high availability and a fail-over mechanism. Do keep CDNs in mind when you architect your next-gen, media-centric website for a global audience!