by Markus Huber

Recently, academia and law enforcement alike have shown a strong demand for data that is collected from online social networks. We present a novel method for harvesting such data from social networking websites. Our approach uses a hybrid system based on a custom add-on for social networks in combination with a web crawling component.

Over recent years, online social networks (OSNs) have become the largest and fastest growing websites on the Internet. OSNs, such as Facebook or LinkedIn, contain sensitive and personal data of hundreds of millions of people, and are integrated into millions of other websites. Online social networks continue to replace traditional means of digital storage, sharing, and communication. Collecting this type of data is thus an important problem in the area of digital forensics. While traditional digital forensics is based on the analysis of file systems, captured network traffic, or log files, new approaches are needed for extracting data from social networks or cloud services. Despite the growing importance of OSN data for research, current state-of-the-art methods for data extraction seem to be based mainly on custom web crawlers.

Our approach is based on a hybrid system that uses an automated web browser in combination with an OSN third-party application. Our system can be used efficiently to gather "social snapshots", datasets which include user data and related information from the social network. The datasets that our tool collects contain profile information (user data, private messages, photos, etc.) and associated meta-data (internal timestamps and unique identifiers). We implemented a prototype for Facebook and evaluated our system on a number of volunteers.

Figure 1: Collection of digital evidence through our social snapshot framework



Figure 1 shows the core applications of our social snapshot framework. (1) The social snapshot client is initialized by providing the target user's credentials or cookie. Our tool then starts the automated browser with the given authentication mechanism. (2) The automated browser adds our social snapshot application to the target user's profile and sends the shared API secret to our application server. (3) The social snapshot application responds with the target's contact list. (4) The automated web browser requests specific web pages of the user's profile and her contact list. (5) The received crawler data is parsed and stored. (6) While the automated browser requests specific web pages, our social snapshot application gathers personal information via the OSN API. (7) Finally, the social data collected via the third-party application is stored on the social snapshot application server.
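The data flow of steps (3) to (7) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of the released tool: the names `SocialSnapshot`, `take_snapshot`, `fetch_contacts`, `fetch_page` and `fetch_api_record` are hypothetical, the three fetch functions stand in for the automated browser and the third-party application, and the two collection paths run one after the other here, whereas the framework described above runs them in parallel.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SocialSnapshot:
    """Hypothetical container holding both data sources of one snapshot."""
    target: str
    contacts: List[str] = field(default_factory=list)
    crawled_pages: Dict[str, str] = field(default_factory=dict)  # steps 4-5
    api_records: Dict[str, dict] = field(default_factory=dict)   # steps 6-7


def take_snapshot(
    target: str,
    fetch_contacts: Callable[[str], List[str]],
    fetch_page: Callable[[str], str],
    fetch_api_record: Callable[[str], dict],
) -> SocialSnapshot:
    """Collect crawler data and API data for a target and her contacts."""
    snapshot = SocialSnapshot(target=target)

    # Step 3: the third-party application reports the contact list.
    snapshot.contacts = list(fetch_contacts(target))
    accounts = [target, *snapshot.contacts]

    # Steps 4-5: the automated browser requests pages, which are stored.
    for account in accounts:
        snapshot.crawled_pages[account] = fetch_page(account)

    # Steps 6-7: the third-party application gathers data via the OSN API.
    # (Sequential here for simplicity; the article describes this as
    # happening while the browser is still crawling.)
    for account in accounts:
        snapshot.api_records[account] = fetch_api_record(account)

    return snapshot


if __name__ == "__main__":
    # Stub data sources, used only to show the shape of the result.
    demo = take_snapshot(
        "alice",
        fetch_contacts=lambda t: ["bob", "carol"],
        fetch_page=lambda a: f"<html>profile of {a}</html>",
        fetch_api_record=lambda a: {"id": a},
    )
    print(sorted(demo.crawled_pages))
```

Keeping the crawler output and the API output in separate fields mirrors the hybrid design: each source can be parsed and stored independently and merged by account identifier afterwards.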

In order to get access to the complete content of a target’s social network account, social snapshots depend on gathering the initial authentication token. Below we outline three digital forensic scenarios, representative of real-world use cases, which illustrate the initial gathering of the authentication token.

Consent: This naive approach requires consent from the person whose social networking profiles are analysed. A person would provide the forensic investigator with temporary access to her social networking account in order to create a snapshot. This would also be the preferred method for academic studies, as it allows the research to be conducted ethically and in compliance with data privacy laws.

Hijack social networking sessions: Our social snapshot application provides a module to hijack established social networking sessions. An investigator would monitor the target’s network connection, for example an unencrypted WiFi connection or a LAN, for valid authentication tokens. Once the hijack module finds a valid authentication token, the social snapshot application spawns a separate session to snapshot the target user’s account.

Extraction from forensic image: Finally, physical access to the target’s personal computer could be used to extract valid authentication cookies from web browsers. Stored authentication cookies can be found automatically by searching an acquired hard drive image or by using live analysis techniques.
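As an illustration of the third scenario, the following sketch lists unexpired cookies for a given site from a copy of a Firefox cookie database recovered from an image. It is not part of the released tool, and the function name `find_cookies` is hypothetical. It assumes the `moz_cookies` table with `host`, `name` and `expiry` columns, and it assumes that `expiry` holds Unix seconds; the schema and units should be checked against the browser version found on the actual image. Only metadata is returned, and the database is opened read-only so that the evidence copy is not modified.

```python
import sqlite3
import time
from contextlib import closing
from pathlib import Path
from typing import List, Optional, Tuple


def find_cookies(
    db_path: str,
    host_suffix: str,
    now: Optional[float] = None,
) -> List[Tuple[str, str, int]]:
    """Return (host, name, expiry) for unexpired cookies of one site.

    Assumption: db_path points to a copy of a Firefox cookies.sqlite
    file whose moz_cookies.expiry column is in Unix seconds.
    """
    if now is None:
        now = time.time()

    # Open the database read-only via a file: URI.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"

    query = (
        "SELECT host, name, expiry FROM moz_cookies "
        "WHERE host LIKE ? AND expiry > ? "
        "ORDER BY host, name"
    )
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        # '%' + suffix matches the domain itself and its subdomains.
        # Note that '%' and '_' inside host_suffix act as wildcards.
        rows = conn.execute(query, ("%" + host_suffix, int(now))).fetchall()
    return rows
```

A call such as `find_cookies("cookies.sqlite", "example.org")` would return the candidate cookies for that domain; deciding which of them actually authenticates a session depends on the social networking service and is outside the scope of this sketch.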

Social snapshots explore novel techniques for automated collection of digital evidence from social networking services. Compared with state-of-the-art web crawling techniques, our approach significantly reduces network traffic, is easier to maintain, and has access to additional and hidden information. Extensive evaluation of our techniques has shown that they are practical and effective in collecting the complete information of a given social networking account reasonably fast and without detection by social networking providers. We believe that our techniques can be used in cases where no legal cooperation with social networking providers exists. In order to provide a digital evidence collection tool for modern forensic investigations of social networking activities, we release our core social snapshot framework as open source software. We will continue to extend the analysis capabilities of our forensic software and cooperate with partners on the evaluation of real-world cases.

The research was funded by COMET K1, FFG (Austrian Research Promotion Agency), and by FFG under grants 820854, 824709, and 825747. Further details of this work can be found in our previously published paper [1].

Links:
http://www.sba-research.org/socialsnapshots
https://github.com/markushuber/social-snapshot-tool

References:
[1] Markus Huber, Martin Mulazzani, Manuel Leithner, Sebastian Schrittwieser, Gilbert Wondracek, and Edgar Weippl. 2011. Social snapshots: digital forensics for online social networks. In Proceedings of the 27th Annual Computer Security Applications Conference (ACSAC '11). ACM, New York, NY, USA, 113-122. http://www.sba-research.org/wp-content/uploads/publications/social_snapshots_preprint.pdf

Please contact:
Markus Huber
SBA Research (AARIT), Austria
