Decentralized Storage and Publication with IPFS and Swarm
In this article, we outline two of the most prominent solutions for decentralized content publication and storage. These two solutions are IPFS (InterPlanetary File System) and Ethereum’s Swarm.
With the advent of blockchain applications in recent years, the Internet has seen a boom in decentralization. Developers have caught a glimpse of the green pastures that lie beyond the existing client-server paradigm, which is susceptible to censorship at the whims of different jurisdictions, to cloud provider monopolies, and so on.
Turkey’s ban of Wikipedia and the “Great Firewall of China” are just two examples. Dependence on internet backbones, hosting companies, cloud providers like Amazon, and search providers like Google has betrayed the initial internet promise of democratized knowledge and access to information.
As this article on TechCrunch said two years ago, the original idea of the internet was “to build a common neutral network which everyone can participate in equally for the betterment of humanity”. This idea is now reemerging as Web 3.0, a term that has come to mean the decentralized web: an architecture that is censorship-proof and has no single point of failure.
As Gavin Wood, one of Ethereum’s founders, put it in his seminal 2014 work on Web 3.0, there is an “increasing need for a zero-trust interaction system”. He named it the “post-Snowden web”, and described four components of it: “static content publication, dynamic messages, trustless transactions and an integrated user-interface”.
Decentralized Storage and Publication
Before the advent of cryptocurrency — and the Ethereum platform in particular — we had other projects that aimed to develop distributed applications.
- Freenet: a peer-to-peer (p2p) platform with a distributed data store, created to be censorship resistant and first published in 2000.
- Gnutella network: enabled peer-to-peer file sharing through its many client incarnations.
- BitTorrent: developed and published as early as 2001. Wikipedia reports that, in 2004, it was “responsible for 25% of all Internet traffic”. The project is still here, and is technically impressive, with new projects copying aspects of it: hash-based content addressing, DHT distributed databases, Kademlia lookups …
- Tribler: a BitTorrent client that added other features for users, such as onion-routed p2p communication.
Both of the projects we cover here stand on the shoulders of these giants.
IPFS
The InterPlanetary File System was developed by Juan Benet, and was first published in 2014. It aims to be a protocol, and a distributed file system, replacing HTTP. It’s a mixture of technologies, and it’s pretty low level — meaning that it leaves a lot to projects or layers built on top of it.
An introduction to the project by Juan Benet from 2015 can be found in this YouTube video.
IPFS aims to offer the infrastructure for reinventing the Internet, which is a huge goal. It uses content addressing: naming and looking up content by its cryptographic hash, like Git and like BitTorrent, which we mentioned above. This technique lets us verify the authenticity of content regardless of where it sits, and the implications are huge. We can, for example, have the same website hosted on tens or hundreds of computers around the world, and load it knowing for sure that it’s the original, authentic content just from its hash-based address.
This means that important websites — or websites that may get censored by governments or other actors — don’t depend on any single point, like servers, databases, or even domain registrars. This, further, means that they can’t be easily extinguished.
The Web becomes resistant.
One more consequence of this is that we don’t, as end users, have to depend on internet backbones and perfect connectivity to a remote data center on another continent hosting our website. Countries can get completely cut off, but we can still load the same, authentic content from some machine nearby, still certain of its authenticity. It can be content cached on a PC in our very neighborhood.
With IPFS, it would become difficult, if not impossible, for Turkey to censor Wikipedia, because Wikipedia wouldn’t be relying on certain IP addresses. Authentic Wikipedia could be hosted on hundreds or thousands of local websites within Turkey itself, and this network of websites could be completely dynamic.
IPFS has no single point of failure, and nodes don’t need to trust each other.
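To make the idea of verifying content from untrusted nodes concrete, here is a minimal sketch in Python. It assumes a plain hex SHA-256 digest as the address (IPFS itself uses its own address encoding), and the `Peer` class and `fetch_verified` function are hypothetical names invented for this illustration, not part of any IPFS API.

```python
import hashlib
from typing import Dict, List, Optional


def content_address(data: bytes) -> str:
    """Return the hex SHA-256 digest, used here as the content's address."""
    return hashlib.sha256(data).hexdigest()


class Peer:
    """A hypothetical, untrusted node holding blobs keyed by address."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}

    def store(self, data: bytes) -> str:
        address = content_address(data)
        self.blobs[address] = data
        return address

    def get(self, address: str) -> Optional[bytes]:
        return self.blobs.get(address)


def fetch_verified(address: str, peers: List[Peer]) -> bytes:
    """Ask peers in turn; accept only data whose hash matches the address."""
    for peer in peers:
        data = peer.get(address)
        if data is not None and content_address(data) == address:
            return data
    raise LookupError("no peer returned authentic content for " + address)


page = b"<h1>Authentic page</h1>"
honest = Peer()
address = honest.store(page)

# A tampering peer serves altered bytes under the same address.
tampering = Peer()
tampering.blobs[address] = b"<h1>Altered page</h1>"

# The altered copy is rejected because its hash does not match the address.
result = fetch_verified(address, [tampering, honest])
```

The requester never has to trust a peer: it only has to recompute the hash of whatever it receives and compare it with the address it asked for.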
Because content is addressed algorithmically, by its hash rather than by its location, it becomes very hard to censor. This also improves efficiency. We don’t need to request a website, or video, or music file from a remote server if it’s cached somewhere close to us.
This can greatly reduce request latency. And anyone who’s ever optimized website speed knows that network latency is a major factor.
By using the aforementioned Kademlia algorithm, the network becomes robust, and we don’t rely on domain registrars/nameservers to find content. Lookup is built into the network itself. It can’t be taken down. Some of the major attacks by hackers in recent years were attacks on nameservers. An example is this particular attack in 2016, which took down Spotify, Reddit, NYT and Wired, and many others.
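As a rough illustration of how Kademlia decides which nodes are "near" a piece of content, the sketch below shows its XOR distance metric in Python. The 160-bit IDs derived with SHA-1, the node names, and the helper functions are assumptions made for this example. A real lookup is iterative (each queried node returns nodes it knows that are closer to the target), which this sketch leaves out.

```python
import hashlib
from typing import List


def node_id(name: str) -> int:
    """Derive a 160-bit identifier from a name (SHA-1 chosen for illustration)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia's distance metric: the bitwise XOR of two IDs, read as an integer."""
    return a ^ b


def closest_nodes(target: int, known: List[int], k: int = 3) -> List[int]:
    """Return the k known node IDs closest to the target under XOR distance."""
    return sorted(known, key=lambda n: xor_distance(n, target))[:k]


# Hypothetical node names and content key, purely for demonstration.
nodes = [node_id("node-%d" % i) for i in range(20)]
key = node_id("some-content")
nearest = closest_nodes(key, nodes)
```

Since content keys and node IDs live in the same identifier space, the nodes responsible for a key can be found from the key alone, without consulting a registrar or nameserver.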
IPFS is being developed by Protocol Labs as an open-source project. On top of it, the company is building an incentivization layer, Filecoin, which held an initial coin offering in summer 2017 and collected around $260 million (if we count pre-ICO VC investment), perhaps the largest amount collected by an ICO so far. Filecoin itself is not yet at production stage, but IPFS is being used by production apps like OpenBazaar. There’s also IPFS integration in the Brave browser, and more is coming …
The production video-sharing platform d.tube uses IPFS for storage, and Steemit for monetization, voting, etc.
d.tube is a web app that’s still waiting for wider adoption, but it’s already in production and works without ads.
Although IPFS, just like Swarm, is considered an alpha-stage project, it’s already serving real-world projects.
Other notable projects using IPFS are Bloom and Decentraland, a virtual-reality platform being built on top of the Ethereum blockchain and IPFS. Peerpad is an open-source app built to serve as an example for developers building on IPFS.
Swarm
According to Viktor Tron, of the Ethereum Foundation, “basically, Swarm is BitTorrent on steroids”.
Swarm, by Ethersphere, aims to solve the same problems as IPFS. According to its GitHub page —
Swarm is a distributed storage platform and content distribution service, a native base layer service of the Ethereum Web 3 stack. The primary objective of Swarm is to provide a sufficiently decentralized and redundant store of Ethereum’s public record, in particular to store and distribute Đapp code and data as well as block chain data.
Viktor Tron currently leads Swarm’s development. He was one of the first employees of the Ethereum Foundation, which is funding the project along the lines of Gavin Wood’s vision of Web 3.0 that we quoted above. Swarm is therefore more integrated with the Ethereum ecosystem and, along with Whisper and the Ethereum Virtual Machine, it aims to form a next-generation platform for distributed apps, or Đapps.
Swarm is in an earlier stage of development than IPFS. To quote Viktor Tron —
IPFS is much further along in code maturity, scaling, adoption, community engagement and interaction with a dedicated developer community.
Once Swarm becomes production-ready, it will provide an incentivization layer and integration with Ethereum’s smart contracts, which should give plenty of room for creativity and innovative applications.
Neither Swarm’s incentivization layer nor that of IPFS (Filecoin) is currently ready for use.
Note: at the time of writing (May 2018), Swarm’s lead developer has announced the release of POC3, which keeps the roadmap on schedule and gives reason for optimism about Swarm becoming production-ready in 2018.
While IPFS aims to build a protocol, and is a lower-level, more generic project, Swarm ties into Ethereum’s Web 3 vision, with more focus on censorship resistance: it “implements plausible deniability with implausible accountability through a combination of obfuscation and double masking”.
This reminds us of the Freenet project, where those hosting certain content don’t necessarily have access to it, or know what it is.
Swarm, with its incentivization mechanisms, is aiming to provide higher level solutions. It —
exploits the full capabilities of smart contracts to handle registered nodes with deposit to stake. This allows for punitive measures as deterrents. Swarm provides a scheme to track responsibilities making storers individually accountable for particular content.
Compared to IPFS, Swarm has a lot of focus on these mechanisms. On the one hand, this includes incentives for long-term storage of not-so-popular content, and on the other, incentives for highly popular, high-bandwidth content. These two require two different approaches to penalties/rewards.
In Swarm’s case, this requires working on cryptographic constructs known as Proof-of-Custody, which make it possible “to have a compact proof, proving to any third party that you store a blob of data without transferring the whole data and without revealing the actual contents”. So proving that some content is stored doesn’t require a full download of that content every time.
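To give a feel for how a proof can be compact, here is a simplified Merkle-tree challenge in Python. This is not Swarm’s actual Proof-of-Custody construct: unlike the scheme quoted above, it reveals the one challenged chunk. It only illustrates that a verifier holding a single 32-byte root can check a storer’s answer without downloading the whole file. All function names are hypothetical.

```python
import hashlib
from typing import List, Tuple


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(chunks: List[bytes]) -> List[List[bytes]]:
    """Build all Merkle tree levels, leaves first. In a level with an odd
    number of nodes, the last node is paired with itself."""
    level = [h(c) for c in chunks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_proof(levels: List[List[bytes]], index: int) -> List[Tuple[bytes, bool]]:
    """Collect sibling hashes from leaf to root. The flag is True when the
    sibling sits to the right of the current node."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # last node of an odd level, paired with itself
        proof.append((level[sibling], index % 2 == 0))
        index //= 2
    return proof


def verify(chunk: bytes, proof: List[Tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the path to the root from one chunk and its sibling hashes."""
    node = h(chunk)
    for sibling, sibling_is_right in proof:
        node = h(node + sibling) if sibling_is_right else h(sibling + node)
    return node == root


chunks = [("chunk-%d" % i).encode() for i in range(5)]
levels = build_levels(chunks)
root = levels[-1][0]   # the verifier keeps only this value

challenge = 4          # the verifier picks a chunk index
proof = merkle_proof(levels, challenge)
ok = verify(chunks[challenge], proof, root)
```

The proof grows with the logarithm of the number of chunks, so the cost of a challenge stays small even for large files.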
Swarm even has an Accounting Protocol, SWAP, currently in development, as one level of incentivization.
Until the incentivization mechanisms are released, which is expected to happen in 2018, Swarm functions like a cache: less popular content can get deleted, and there’s no insurance against that.
Swarm will be usable as cloud hosting, while IPFS leaves this to projects that will be built on its infrastructure. IPFS leaves it to the implementors/developers to find the actual storage devices.
IPFS itself, as a lower layer, offers no guarantees of storage. Swarm includes these in its roadmap, whereas the IPFS team plans them at the Filecoin level, which is still only at the idea stage.
There’s a two-part YouTube interview with Tron in which he explains the Swarm project in less technical terms.
Two projects that build further on IPFS and Swarm are worth mentioning in the context of Đapps (distributed applications). Since IPFS and Swarm themselves allow for only a limited level of dynamic content, database-oriented projects built on top of them add significant value.
OrbitDB is a “serverless, distributed, peer-to-peer database” that uses IPFS for its data storage.
It’s a database that works both in Node.js and in browsers. Its development is active, and is sponsored by Protocol Labs. After Protocol Labs’ $260 million fundraising in 2017, the future of OrbitDB, just like that of IPFS, looks promising.
OrbitDB is part of the Node.js/npm ecosystem.
Wolk is a project/token that’s building a database, SWARMDB, using Swarm’s codebase. Behind it is a Californian startup, Wolk Inc., that managed to raise around 7,100 ETH in its ICO in 2017. Wolk promises a censorship-resistant distributed database, with the WLK token as its incentivizing layer. It provides Go, Node.js and HTTP interfaces.
They claim Swarm and Bancor as their partners.
It’s hard to predict the success and adoption of these projects, or to ascertain their quality. But as IPFS and Swarm progress, become more production-ready and reach wider adoption, it’s pretty certain we’ll see more projects like these.
Swarm’s Orange Paper is an interesting, albeit very technical, read.
A longer comparison of the two projects can be found here.
Commonalities
Both IPFS and Swarm share hash-based content addressing, which we described earlier. While this provides Git-level version control of the content hosted on both systems, as well as censorship resistance, deleting content is a problem that remains to be solved.
Immutability provides guarantees of authentic content, but changes to the content produce new addresses, so to provide editing capability, additional layers are necessary.
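The sketch below illustrates, in Python, why an extra layer is needed for editing: each version of the content gets a new address, and a separate mutable pointer maps a stable name to the latest one. `ImmutableStore` and `NameRegistry` are hypothetical classes written for this example, and they don’t model either project’s actual naming mechanism. A real naming layer would also have to control who may update a name, which is left out here.

```python
import hashlib
from typing import Dict, List


def content_address(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class ImmutableStore:
    """Content-addressed storage: a given address always maps to the same bytes."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        address = content_address(data)
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        return self._blobs[address]


class NameRegistry:
    """Hypothetical mutable layer: maps a stable name to a history of addresses."""

    def __init__(self) -> None:
        self._history: Dict[str, List[str]] = {}

    def publish(self, name: str, address: str) -> None:
        self._history.setdefault(name, []).append(address)

    def resolve(self, name: str) -> str:
        return self._history[name][-1]

    def versions(self, name: str) -> List[str]:
        return list(self._history[name])


store = ImmutableStore()
registry = NameRegistry()

v1 = store.put(b"version 1")
registry.publish("homepage", v1)

# "Editing" stores new bytes under a new address and moves the pointer.
v2 = store.put(b"version 2")
registry.publish("homepage", v2)

latest = store.get(registry.resolve("homepage"))
```

Note that the old version remains retrievable at its original address, which is also why deleting content is not straightforward in such systems.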
From the perspective of web apps, both projects support only static content. So there are no back-end apps written in interpreted languages or runtimes like PHP, Python, Ruby, or Node.js. For Swarm, this is where the EVM comes into play, but the EVM also has its own inherent limitations.
Conclusion
Both IPFS and Swarm are promising projects, although one can’t help but wonder if the developers have set overly ambitious goals. If they succeed with their development roadmaps, and achieve wider adoption, there’s no doubt this will bring big changes to the Internet as we know it.