Thanks to Adrian Sandu, Panayiotis Velisarakos and Tim Severien for kindly helping to peer review this article.
The Inter-Planetary File System (IPFS) is a revolutionary model that could change the way we use the Internet. Unlike the typical server-client model we’re accustomed to, IPFS is something more like BitTorrent. Does that grab your attention? Then read on!
The Problems With Today’s Web
The Hypertext Transfer Protocol (HTTP) is the backbone of the World Wide Web. Almost every website we visit is delivered over HTTP. It’s essentially a client–server model, where our computer sends requests to the server hosting a website, and the server sends back responses.
HTTP, though, lends itself naturally to a narrower and narrower subset of services. It’s natural for large services to emerge as the underlying structure for a large portion of the Web, but that sort of centralized environment can be dangerous. If any of the large hosting companies or service providers – such as Google, Microsoft, Amazon, Dropbox, Rackspace, and the like – were to suddenly falter, the results for the Web would be disastrous in the short term. And herein lies the problem (at least one of them).
In addition to the natural process of centralization that’s occurring, there’s also a troubling reliability issue with today’s web. Most websites and applications are hosted by a single server, or by a redundant array of load balanced servers, or whatever the case may be. If the owner of those servers, or the datacenter’s management, or even a natural disaster, takes those machines out, will the application continue to run? Backups and redundancy can be put into effect by organizations with enough resources, but even those can’t stop a company which simply decides to take down their website or application.
Reliance on Hosts
If and when the server hosting a site goes down, we’re reliant on the hosting company to have fail-safes, redundant systems, backups, and so on. They must recognize that your service is out, and assist you in restoring it. If it’s a hardware issue, they should have alternative systems they can port your setup onto. They should have backup networking systems, and they should be keeping at least one backup of your data, whether they advertise it or not, in case of a data loss that is their fault.
What if they don’t?
Reliance on Site Administrators
Now the onus falls on site administrators to keep a service going and data backed up. If you’ve ever been an avid user of an application that was suddenly removed, you know this feeling.
Movements to open source help tremendously, allowing multiple forks of a project to take off, and allowing things that are more static – like documentation – to be preserved in multiple locations and in multiple formats. But the fact remains that the majority of the Web is controlled by people like you or me, maintaining servers.
Some freelance developers even manage the hosting and maintenance of some of their smaller clients’ sites. What if they forget to pay their bill? Get angry with a client and lock them out of their site? Get hit by a truck? Yes, the site owner may have legal options in any of these cases, but will that help you while your site is completely inaccessible?
Reliance on Users
Yet one more problem is that of the users of any web application. Content often must have a critical mass of users or visitors to even merit hosting. Low-traffic applications and static sites are often shuttered simply because they aren’t cost effective to run. The reverse problem is also very real. Users of the modern Internet are still clustering together. Facebook – which is a single social network – reports somewhere in the ballpark of one out of every five people on Earth as active users. There are countless businesses that depend entirely on Facebook to exist. What if it shut down tomorrow?
Of course, Facebook won’t shut down tomorrow, and neither will most of the apps you love and use. But some may. And the more users that have flocked to them before that happens, the more damage that will cause to everyday workflows, or even to personal and business finances, depending on what kind of applications you use and for what.
The Answer is IPFS
So, you may be asking, how does IPFS solve these problems? IPFS is a relatively new attempt to solve some of these issues using distributed file systems. The IPFS project is still fairly low on documentation, and is perhaps the first of many different solutions.
First and foremost, you should understand a few things about IPFS. IPFS is decentralized. Without a typical server providing web pages for every client that arrives at the website’s domain, a different infrastructure must be imagined. Every machine running IPFS would be a node as part of a swarm.
Consider the way torrents currently work. You choose a file to download, and your torrent application essentially sends out a request to the computers attached to the same torrent network as you. If any of them have the file you’re requesting, and are able to upload at the moment, they begin sending pieces of it to your computer. That’s a condensed version.
So how do IPFS nodes work? Each machine that’s running IPFS is able to select what files they want their node to serve.
Hashing and IPNS
Every file that exists on IPFS would have a unique hash to represent it, and any minute change would result in a new hash being generated. These hashes are how content can be viewed. A client queries the system for a hash, and any node that has that content available can serve it to peers. The “swarm” provides a torrent-like experience, wherein peers are capable of serving each other content.
This system will allow content to be served quickly and accurately to clients, regardless of their proximity to the original host of the content. Additionally, because hashes are employed, both ends of the exchange can be checked for correct content, as a single bit out of place would result in a different hash.
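To make the idea of checking content against its hash concrete, here is a minimal sketch in Python. It uses a plain SHA-256 hex digest as a stand-in for an IPFS content identifier, which is an assumption made for illustration (IPFS’s real identifiers are encoded differently), and the function names `content_hash` and `verify` are hypothetical.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return a hex SHA-256 digest, used here as a stand-in content address."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_hash: str) -> bool:
    """Check that received bytes match the hash that was requested."""
    return content_hash(data) == expected_hash


original = b"Hello, distributed web!"
address = content_hash(original)

# Flip a single bit in the first byte to simulate corruption or tampering.
tampered = bytes([original[0] ^ 0b00000001]) + original[1:]

print(verify(original, address))  # the untouched copy matches its address
print(verify(tampered, address))  # the altered copy does not
```

Because the address is derived from the content itself, a client doesn’t need to trust the peer that sent the data. It only needs to recompute the hash and compare.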
The Inter-Planetary Naming System (IPNS) can be used to assign a name to mutable (changeable) content: your node publishes a piece of content under a name, and is then able to republish changes under that same name. This, of course, could result in older content becoming unavailable, so IPNS entities, according to the developers, may some day function more like a Git commit log, allowing a client to iterate back through versions of the published content.
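The relationship between a stable name and changing content hashes can be sketched as a small registry. This is only an illustration of the concept, not how IPNS is implemented: the `NameRegistry` class, its methods, and the plain string name `"my-site"` are all hypothetical, and the version history mirrors the Git-log-like behavior described above as a possible future direction.

```python
import hashlib
from typing import Dict, List


class NameRegistry:
    """Hypothetical stand-in for IPNS: maps a stable name to content hashes."""

    def __init__(self) -> None:
        # Each name keeps every hash ever published under it, oldest first.
        self._history: Dict[str, List[str]] = {}

    def publish(self, name: str, content: bytes) -> str:
        """Hash the content and record it as the newest version of the name."""
        digest = hashlib.sha256(content).hexdigest()
        self._history.setdefault(name, []).append(digest)
        return digest

    def resolve(self, name: str) -> str:
        """Return the hash the name currently points to."""
        return self._history[name][-1]

    def versions(self, name: str) -> List[str]:
        """Return all published hashes for the name, oldest first."""
        return list(self._history[name])


registry = NameRegistry()
v1 = registry.publish("my-site", b"<h1>Version 1</h1>")
v2 = registry.publish("my-site", b"<h1>Version 2</h1>")

print(registry.resolve("my-site") == v2)  # the name follows the latest publish
print(registry.versions("my-site"))       # older hashes remain listed
```

The name stays the same while the hash it resolves to changes, which is what lets mutable content live on top of immutable, hash-addressed files.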
Advantages of Decentralization
So, you’ve heard all about centralization and decentralization. But what are the practical benefits of the fact that IPFS is decentralized?
Reliability and Persistence
The content being served on the IPFS network is going to be around, essentially, forever, as long as people want it to be. There’s no single weak link, server, or point of failure. With larger files, there may be a speed benefit to having multiple peers for your IPFS node to choose from when acquiring the file. But the real benefit comes from having those multiple options to start with. If one node hosting the content goes down, there will be others.
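A toy model of that failover behavior might look like the following. The `Node` class and `fetch` function are hypothetical names for illustration, the "swarm" is just an in-memory list, and SHA-256 hex digests again stand in for real IPFS identifiers. Actual peer discovery and transfer are far more involved.

```python
import hashlib
from typing import Dict, List, Optional


class Node:
    """A toy peer that serves only the content it has chosen to store."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.online = True
        self.store: Dict[str, bytes] = {}

    def add(self, data: bytes) -> str:
        """Store content under its hash and return that hash."""
        digest = hashlib.sha256(data).hexdigest()
        self.store[digest] = data
        return digest

    def get(self, digest: str) -> Optional[bytes]:
        """Serve the content if this node is online and holds it."""
        if not self.online:
            return None
        return self.store.get(digest)


def fetch(swarm: List[Node], digest: str) -> Optional[bytes]:
    """Ask each peer in turn and return the first copy found."""
    for node in swarm:
        data = node.get(digest)
        if data is not None:
            return data
    return None


alice, bob = Node("alice"), Node("bob")
page = b"<p>Still here.</p>"
address = alice.add(page)
bob.add(page)  # a second peer chooses to serve the same content

alice.online = False  # the original host disappears
print(fetch([alice, bob], address) == page)  # content is still retrievable
```

The content only becomes unreachable once every node holding it is gone, which is the sense in which it persists "if people want it to be."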
Secured Against DDoS-style Attacks
By its nature, distributed peer-to-peer content is far harder to take down with “Distributed Denial of Service” (DDoS) attacks. These attacks work by bombarding host servers with traffic to bring down websites or services. However, if the same content is being served to you from multiple peers, an effective DDoS attack would have to find and target all of them.
Previously Viewed Content Available Offline
With the caching system in place with IPFS, it’s entirely possible that quite a lot of your regularly viewed content would be available offline by default. Any dynamic content might not be up to date, of course, but previously viewed static content resources could be at your fingertips whether you were in range of your Wi-Fi or not.
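As a rough sketch of how hash-addressed caching could make previously viewed content available offline, consider the client below. The `CachingClient` class and its `online` flag are hypothetical, and the "network" is simulated with a dictionary. This is not a description of the caching layer IPFS actually ships.

```python
import hashlib
from typing import Callable, Dict, Optional


class CachingClient:
    """Toy client that keeps a local copy of everything it has fetched."""

    def __init__(self, fetch_remote: Callable[[str], Optional[bytes]]) -> None:
        self._fetch_remote = fetch_remote
        self._cache: Dict[str, bytes] = {}

    def get(self, digest: str, online: bool = True) -> Optional[bytes]:
        # Content under a given hash never changes, so a cached copy is
        # always valid and can be returned without touching the network.
        if digest in self._cache:
            return self._cache[digest]
        if not online:
            return None
        data = self._fetch_remote(digest)
        # Only cache data whose hash matches the address that was requested.
        if data is not None and hashlib.sha256(data).hexdigest() == digest:
            self._cache[digest] = data
            return data
        return None


page = b"<article>Read me anywhere.</article>"
address = hashlib.sha256(page).hexdigest()
network = {address: page}  # simulated remote peers

client = CachingClient(network.get)
client.get(address)                               # first view, fetched while online
print(client.get(address, online=False) == page)  # later served from the cache
```

Static content suits this model well because its hash is fixed. Dynamic content would need a fresh lookup to discover a newer hash, which is why it might be out of date when offline.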
How Would Things Change?
With IPFS as a major player, things would definitely change. Although IPNS nodes can be mapped to HTTP addresses currently, they would not necessarily need to be forever. Web browsers might change, or be removed entirely. More likely, given the transition, you’d simply begin using multiple protocols to access content (instead of typing http:// you might end up with several other protocols available in major browsers). These browsers would also need to be equipped with an automatic way to replace any locally cached content, if the node the browser attempts to contact has content that has been altered and is presenting a new hash.
Browsers, or other clients, might be the only necessary software. Remember that IPFS is peer to peer, so your IPFS installation is simply reaching out to locate others.
You also may wonder what happens with websites serving dynamic content. The answer here is far less clear. While updating static content and republishing to IPFS might not be such a challenge, dynamic, database-driven websites will be significantly more complicated. The challenge ahead will be for developers and proponents of the system to create not only viable, but also practical alternatives to cover these use cases, as a huge portion of the Web today is driven by dynamic database content. IPNS provides some potential solutions here, as do other services that are being developed, but a production-ready solution is yet to come.
The Future with IPFS
IPFS is definitely not a polished, well-oiled machine yet. It’s more of a fascinating prototype of what the Web could look like in coming years. The more people who test, contribute, and work to improve it, the greater chance it will have to change the way we serve content on the Internet as a whole. So get involved!
Download one of the prebuilt binaries or the source files, or check out the documentation to learn a little more about the subject, and get started today!