by Claude Castelluccia and Mohamed Ali Kaafar
The PLANETE research group at INRIA Grenoble in France is developing an 'owner-centric networking' architecture. This novel concept aims to considerably reduce 'data pollution' and improve privacy on the Internet.
In his essay (IEEE Security & Privacy, January/February 2009), Bruce Schneier famously said that data is the pollution of the Information Age. Content on the Internet (documents, emails, chats, images, videos, etc.) is often disseminated and replicated on different peers or servers. As a result, users lose control and ownership of their content as soon as they release it.
The crux of the problem is that the Internet simply never forgets: information that is posted lingers virtually forever. Furthermore, the design of the current Internet places no limit on data diffusion, nor does it give individuals any right to modify or remove what they wrote in a forum, in a chat, or on the wall of a well-known social network.
This data pollution creates many privacy concerns, since this lost content can be used to collect information about users without their consent. For example, there have been several recent cases of employers using social networks (such as Facebook) to spy on their employees.
The Internet of the Future should solve these data pollution and privacy problems. As Schneier puts it, however, "Privacy isn't something that occurs naturally online, it must be deliberately architected". More specifically, we argue that the future Internet should give individuals control over their data. Users should be able to retrieve their previously posted content in order to withdraw or modify it. In other words, the Internet should enforce the 'right to forget', which is constitutionally protected in several countries.
Unfortunately, most if not all future Internet architecture proposals seem to have ignored this issue so far. For example, the content-centric networking (CCN) architecture, which proposes that the focus be shifted from transmitting data by geographic location to disseminating it via named content, actually increases data pollution. In CCN, content is not only hosted by servers but also diffuses from its point of creation to where the consumers are. As a result, individuals completely lose control over their content as it becomes distributed (lost) on the Internet without their consent or even knowledge.
That said, we believe that content-centric networking is still a very attractive solution, provided that it evolves towards an owner-centric networking (OCN) architecture that treats content ownership as its bedrock. The OCN architecture that we are proposing comprises two main phases.
The first is content distribution and control. As in CCN, content diffuses to where the consumers are. However, in contrast to CCN, it is stored or cached in places controlled by its owner and not by the network, so that it can easily be retrieved. These places should of course be chosen according to the consumers' requests and needs, but they remain under the owner's control. Conceptually, it is as though all of a given individual's content were linked with a rope. Data are distributed and stored on the Internet, but at any time, users can pull on their ropes in order to retrieve them. Each time new content is added, a new element (a knot) is added to the rope. This rope can, for example, be implemented with a distributed hash table mechanism.
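As a rough illustration of the rope idea, the following Python sketch keeps one list of knots per owner and lets the owner pull the rope to withdraw everything. It is only a toy under stated assumptions: the names `Knot`, `ToyDHT`, `publish` and `pull_rope` are hypothetical, the distributed hash table is simulated by a single in-memory dictionary, and the owner-controlled caches are plain dictionaries. It says nothing about routing, authentication or replication, which a real design would have to address.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Knot:
    """One element of the rope: where a single piece of content is stored."""
    content_id: str
    location: str  # hypothetical name of an owner-controlled cache


@dataclass
class ToyDHT:
    """In-memory stand-in for a distributed hash table (assumption)."""
    table: dict = field(default_factory=dict)

    @staticmethod
    def key_for(owner: str) -> str:
        # A real DHT would route this key to the peer responsible for it.
        return hashlib.sha256(owner.encode("utf-8")).hexdigest()

    def add_knot(self, owner: str, knot: Knot) -> None:
        self.table.setdefault(self.key_for(owner), []).append(knot)

    def rope(self, owner: str) -> list:
        # Return a copy so callers cannot alter the rope by accident.
        return list(self.table.get(self.key_for(owner), []))


def publish(dht, stores, owner, content_id, location, data):
    """Store content in an owner-controlled cache and tie a new knot."""
    stores.setdefault(location, {})[content_id] = data
    dht.add_knot(owner, Knot(content_id, location))


def pull_rope(dht, stores, owner):
    """Follow every knot, withdraw the content it points to, drop the rope."""
    retrieved = {}
    for knot in dht.rope(owner):
        data = stores.get(knot.location, {}).pop(knot.content_id, None)
        if data is not None:
            retrieved[knot.content_id] = data
    dht.table.pop(dht.key_for(owner), None)
    return retrieved
```

In this sketch, pulling the rope removes the owner's content from every cache it was placed in, while content belonging to other owners in the same caches is left untouched.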
The second phase is content access. A user who wants to access content cannot download it (unless she owns it), but is only authorized to access it via a link, as occurs today when a user browses the Web. As a result, instead of storing all the content locally, a user only stores links to the content (except for content that she owns). As an illustration of this concept, an email becomes a collection of links to the peers where the content of the email can be read. Similarly, a chat history is composed only of links to the actual conversation contents.
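The access phase can be sketched in the same spirit. In the hypothetical Python example below, an email is nothing more than a list of links, and each link is resolved against the owner's store at reading time. The names `Link`, `OwnerStore` and `render` are assumptions made for illustration, and the sketch deliberately ignores access control and the possibility of a reader copying what she sees.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """What a recipient keeps instead of the content itself."""
    owner: str
    content_id: str


class OwnerStore:
    """Hypothetical owner-controlled store that serves reads by link."""

    def __init__(self, owner):
        self.owner = owner
        self._items = {}

    def put(self, content_id, text):
        # Creating or updating content returns a link to hand out.
        self._items[content_id] = text
        return Link(self.owner, content_id)

    def withdraw(self, content_id):
        self._items.pop(content_id, None)

    def read(self, link):
        return self._items.get(link.content_id)


def render(message, stores):
    """Resolve each link at read time; withdrawn parts are no longer shown."""
    parts = []
    for link in message:
        store = stores.get(link.owner)
        text = store.read(link) if store is not None else None
        parts.append(text if text is not None else "[withdrawn]")
    return parts
```

Because the recipient holds only links, a later `put` on the same identifier changes what every reader sees, and `withdraw` makes that part of the message unavailable.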
We believe that our OCN architecture would considerably reduce pollution on the Internet and improve privacy by giving users control over their data.
This project is still in its preliminary phase and we are aware that many open issues remain. In particular, our scheme does not solve all privacy issues. For example, it does not prevent service providers (such as search engines) from collecting information about their users for profiling or business purposes. These issues could be addressed by integrating into this new architecture concepts borrowed from anonymizing networks, such as onion routing and hidden services.
Furthermore, our scheme is only an architecture proposal, and as such does not prevent malicious users from violating it, for instance by copying content instead of just accessing it remotely. However, we believe that this last issue can be mitigated with the help of security protocols (such as SPKI certificates) and enforced by laws. Note that the situation is very similar to environmental pollution. Technology can help to reduce environmental pollution but cannot by itself prevent it. For example, no technology can prevent a boat from emptying its fuel tank in the ocean and polluting the water. Only legislation and law enforcement can help in such a case.
In conclusion, we believe that the new paradigms developed in this project should be the subject of more attention and debate within the community. We advocate that future Internet architecture proposals should consider data pollution from the beginning. Privacy on the Internet is probably as important as security, and deserves equal consideration.
Tel: +33 4 76 61 52 15
Mohamed Ali Kaafar
Tel: +33 4 76 61 55 95