by Klaus Kieseberg, Peter Kieseberg and Edgar Weippl (SBA Research)
Underground marketplaces, one of the most prominent examples of criminal activity in the Darknet, form their own economic ecosystems and are often connected to cyber-attacks against critical infrastructures. Retrieving first-hand information on emerging trends in these ecosystems is thus of vital importance for law enforcement as well as for critical infrastructure protection.
The term “Darknet” generally describes “overlay networks” that are only accessible to a few exclusive users, but it is often also used to describe sealed-off parts of the internet, such as underground marketplaces or closed peer-to-peer networks. These networks, and their potential links to criminal and terrorist activities, have recently gained public attention, which has highlighted the need for efficient analysis of Darknets and similar networks.
We intend to study how these underground forums operate as a means of unobserved communication between like-minded individuals, as well as a tool for spreading political propaganda and for recruitment. We also focus heavily on Darknets used for trading illegal goods and, especially, services that could be used to attack government institutions and undermine national security. For example, some of these networks offer physical goods like drugs and arms alongside the services of bot networks for launching “Distributed Denial of Service” (DDoS) attacks against sensitive infrastructures like power distribution networks. An efficient analysis of these underground marketplaces is therefore essential for preventing terrorist attacks and for stemming the proliferation of digital weapons.
In the course of our research, which focused notably on trend analysis in underground marketplaces, three key issues emerged that require special attention:
- Detection and analysis of data sources: To build a sound database for subsequent analysis, sources of propaganda and illegal services must be detected and analysed in detail, and their relevance to national security must be assessed. This also includes means for undetected, automated data collection, so that the information gathered is undistorted and the gathering process is not compromised.
- Privacy-preserving analysis of data: Due to the strict privacy requirements for data processing put forth by the “General Data Protection Regulation” (GDPR), new techniques in privacy-preserving machine learning need to be developed. In addition, techniques for monitoring access to the crawled data, as well as methods for manipulation detection, are needed.
- Studying the mode of operation of underground marketplaces: This includes the mechanisms for establishing first contact, pricing, payment and the transfer of goods, particularly goods that require some sort of contact in the offline world. For securing critical infrastructures, the analysis of trends is of greater interest than individual behaviour, as trends are more relevant from a strategic point of view and far less problematic with respect to sensitive information.
In addition to the technical problems, the analysis of this type of information also raises many important legal questions. The new rules and regulations introduced by the GDPR and its national counterparts play a particularly significant part, since they aim to ensure the transparency of data processing procedures concerning personal information, as well as the possibility of deleting data from data processing applications.
When analysing underground marketplaces with regard to the trade of illegal goods or services, it is important to develop techniques that can be automated to a certain degree but still integrate a human component to detect and evaluate criminal activities. Furthermore, many underground forums have detection mechanisms in place to identify automated information gathering, as well as users whose unusual access behaviour suggests that they belong to law enforcement. Forum owners often rely on manual intervention, in the form of messages containing questions, as well as on the analysis of access patterns. Thus, methods must be developed that mask the information gathering process and emulate user behaviour. To maximise the achievable level of automation, it is advisable to focus on the detection and identification of trends instead of investigating every individual case in isolation.
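The focus on trends rather than individual cases can be illustrated with a small aggregation sketch. It assumes hypothetical listing records that carry only a category label and an ISO week string; the field names, the sample data and the threshold `factor` are illustrative assumptions and not part of the project. The sketch counts listings per category and week and reports categories whose latest count is well above their earlier average. No seller or buyer identifiers are read at any point.

```python
from collections import Counter, defaultdict

def weekly_counts(listings):
    """Count listings per (category, week); no identifiers are read."""
    counts = Counter()
    for item in listings:
        counts[(item["category"], item["week"])] += 1
    return counts

def rising_categories(counts, latest_week, factor=2.0):
    """Return categories whose count in `latest_week` is at least `factor`
    times the mean of their earlier weekly counts.

    Weeks are ISO strings such as "2017-W03", which sort chronologically.
    Weeks without any listing for a category are not part of the mean.
    """
    history = defaultdict(list)
    latest = {}
    for (category, week), n in counts.items():
        if week == latest_week:
            latest[category] = n
        elif week < latest_week:
            history[category].append(n)
    rising = []
    for category, n in latest.items():
        past = history.get(category)
        if past and n >= factor * (sum(past) / len(past)):
            rising.append(category)
    return sorted(rising)

# Hypothetical sample data: "ddos" jumps from 1 to 4 listings, "drugs" stays at 2.
listings = (
    [{"category": "ddos", "week": w} for w in ("2017-W01", "2017-W02")]
    + [{"category": "ddos", "week": "2017-W03"}] * 4
    + [{"category": "drugs", "week": w}
       for w in ("2017-W01", "2017-W02", "2017-W03") for _ in range(2)]
)
rising = rising_categories(weekly_counts(listings), "2017-W03")
print(rising)  # ['ddos']
```

A fixed multiple of the historical mean is only the simplest possible trend signal; it was chosen here to keep the example short, and a real analysis would likely need a more robust statistic.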
A vital aspect of this research work is to ensure anonymity and to protect the privacy of innocent people. We therefore strive to develop techniques and methods that allow for an efficient analysis of underground marketplaces while providing ample protection to people not involved in illegal activities. The collection of data will be a selective process adhering to the “data minimisation principle” introduced by the GDPR. Instead of collecting and analysing all available data, an intelligent collection process is required, in which data is first evaluated regarding its importance and then selected accordingly. This also applies to metadata, which is of low importance for trend analysis itself but can contain important information, as well as useful links between individual pieces of information. Such a selective process not only increases the performance of data collection, but also minimises the amount of data collected while providing additional insights into the overall ecosystem of underground markets.
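A minimal sketch of such a selective collection step is given below, under stated assumptions: the record layout, the vocabulary `RELEVANT_TERMS` and the field list `KEPT_FIELDS` are hypothetical, and a simple keyword match stands in for whatever relevance assessment would be used in practice. A record is evaluated first, and only the fields needed for trend analysis are retained if it is selected.

```python
import re

# Hypothetical relevance vocabulary and minimal field set (assumptions).
RELEVANT_TERMS = {"ddos", "botnet", "exploit"}
KEPT_FIELDS = ("category", "week", "price")

def is_relevant(record):
    """Evaluate a record before storing it: keep it only if its title
    contains at least one term from the relevance vocabulary."""
    words = set(re.findall(r"[a-z0-9]+", record.get("title", "").lower()))
    return bool(words & RELEVANT_TERMS)

def minimise(record):
    """Retain only the fields needed for trend analysis."""
    return {key: record[key] for key in KEPT_FIELDS if key in record}

def collect(records):
    """Selective collection: evaluate first, then store a reduced record."""
    return [minimise(r) for r in records if is_relevant(r)]

# Hypothetical sample records.
records = [
    {"title": "Botnet rental, 24h", "author": "user123",
     "category": "service", "week": "2017-W03", "price": 50},
    {"title": "Vintage stamps", "author": "user456",
     "category": "goods", "week": "2017-W03", "price": 5},
]
collected = collect(records)
print(collected)
```

In this sketch the author name is never stored, and the irrelevant record is discarded before it reaches the database.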
Another important question is how privacy protection mechanisms affect conventional machine learning techniques. Methods used to protect sensitive information, such as anonymisation of data via k-anonymity or the deletion of individual records from data sets, can alter the information derived from the data analysis. While these effects are key factors in choosing the right protection method and a suitable security factor, they have not yet been studied thoroughly enough. In the course of our work, we examine the effects of different protective measures on machine learning techniques and develop methods to mitigate them. Based on these results, we will develop means for controlling the introduced error by (i) providing upper bounds for the effects of anonymisation and deletion, and (ii) developing new methods for analysing specific forms of anonymised sensitive information that introduce less distortion into the final result.
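To make this trade-off concrete, the following sketch checks k-anonymity over a set of quasi-identifiers and measures how deleting one record shifts a simple statistic. The column names, the generalisation rule (ages bucketed into decades, postcodes truncated to two digits) and the sample rows are illustrative assumptions, not the project's method. The bound used for deletion is elementary: if all values lie in [lo, hi], removing one of n values changes the mean by at most (hi - lo) / (n - 1).

```python
from collections import Counter

def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every combination of quasi-identifier values
    occurs in at least k rows."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(n >= k for n in groups.values())

def generalise(row):
    """Coarsen quasi-identifiers: age to its decade, postcode to a prefix."""
    out = dict(row)
    decade = (row["age"] // 10) * 10
    out["age"] = f"{decade}-{decade + 9}"
    out["postcode"] = row["postcode"][:2] + "**"
    return out

def deletion_shift(values, index):
    """Change in the mean caused by deleting values[index]."""
    remaining = values[:index] + values[index + 1:]
    return sum(remaining) / len(remaining) - sum(values) / len(values)

def deletion_bound(n, lo, hi):
    """Upper bound on |deletion_shift| for n values in [lo, hi]."""
    return (hi - lo) / (n - 1)

# Hypothetical sample rows.
rows = [
    {"age": 23, "postcode": "1010", "spend": 40},
    {"age": 27, "postcode": "1020", "spend": 55},
    {"age": 31, "postcode": "1030", "spend": 70},
    {"age": 38, "postcode": "1090", "spend": 90},
]
qi = ["age", "postcode"]
coarse = [generalise(r) for r in rows]

print(is_k_anonymous(rows, qi, 2))    # False: every row is unique
print(is_k_anonymous(coarse, qi, 2))  # True: two groups of two rows each

spend = [r["spend"] for r in rows]
shift = deletion_shift(spend, 3)      # delete the record with spend 90
print(shift, deletion_bound(len(spend), min(spend), max(spend)))
```

Generalisation makes the rows indistinguishable within each group, at the cost of removing exactly the fine-grained age and postcode values that a learner might have used as features. Deletion leaves the remaining rows intact but shifts aggregate statistics, here within the stated bound.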
In conclusion, the analysis of underground marketplaces and other areas typically considered to constitute the Darknet requires additional research effort in order to deliver the information required for analysing trends and the workings of their market mechanisms. This led to the development of the “Darknet Analysis” project [L1], which is currently tackling the research questions outlined above. Furthermore, the results of our research will make an important contribution to the topic of privacy protection in law enforcement as a whole and thus have high re-use value for the governmental stakeholders involved.
Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation).
H. Fallmann, G. Wondracek, C. Platzer: “Covertly Probing Underground Economy Marketplaces”, DIMVA, Vol. 10, 2010.
B. Malle, P. Kieseberg, E. R. Weippl, A. Holzinger: “The right to be forgotten: towards machine learning on perturbed knowledge bases”, in International Conference on Availability, Reliability, and Security (pp. 251-266), Springer, 2016.
SBA Research, Vienna, Austria