by Steffen Lüdtke, Roman Kraus and Martin Schneider (Fraunhofer FOKUS)
The growing number and diversity of cybersecurity attacks pose a challenge for developing secure systems, particularly in an age where many systems are connected to the internet. To facilitate early vulnerability detection, we propose an interactive fuzzing technique which employs grammars and supports genetic algorithms to interact with the SUT. This technique can generate inputs for specific attack scenarios, which are useful for finding new vulnerabilities and for assessing the completeness of patches. The proposed technique found zero-day vulnerabilities in established MQTT brokers within a few minutes.
Fuzzing and Genetic Algorithms
Fuzzing is a testing technique where a system under test (SUT) is confronted with randomly generated inputs to test for its robustness and security. Grammar-based fuzzing is a form of fuzzing which uses grammars to support the input generation [1]. Grammars can specify the structure of an input format, messages and message sequences and thus raise the probability of creating syntactically valid inputs and interactions with the SUT. This increases the likelihood of testing deeper functionalities compared to fully random input generation. Nevertheless, the random nature of fuzzing makes it difficult to efficiently generate inputs which fulfil a specific goal, e.g. inputs which particularly stress the resources of a SUT. Therefore, inputs for a given attack scenario (e.g. denial-of-service attacks) cannot be efficiently generated.
Genetic algorithms are an optimisation technique inspired by natural evolution. The idea is to start with a random set of solutions for a problem and to improve it through an iterative process of selection, recombination, and mutation. In the selection process, a subset of individuals is chosen based on a problem-specific fitness function to create the next generation (e.g. the top 30%). Their traits are then exchanged (recombination) and mutated to populate the next generation. This initially keeps advantageous traits and then develops them via recombination and mutation.
Genetic algorithms have been previously employed with fuzzing to, e.g. increase the code coverage. We employ genetic algorithms to create inputs and optimise them towards user-defined vulnerability indicators. This allows us to efficiently create inputs for specific attack types, which can be useful for finding new vulnerabilities and for assessing the quality of patches. We achieve this by specifying observable symptoms of a successful attack on the SUT instead of specifying the characteristics of the inputs, which are often more difficult to determine in advance.
MQTT and Mosquitto
Our case study is the Eclipse Mosquitto broker [L1]. This is a broker for the MQTT protocol, which is a prominent protocol for data exchange (e.g. telemetry) in the Internet of Things (IoT). MQTT implements a publish-subscribe model. This means that clients can publish messages under a specific topic to a broker which then forwards these messages this message to clients which have subscribed to that topic. Eclipse Mosquitto is a popular MQTT broker, which is implemented in C. It is therefore potentially vulnerable to memory errors like use-after-frees, segmentation faults or memory leaks. Mosquitto’s wide usage, which includes industry settings, makes it a relevant case study.
Architecture
Figure 1 provides an overview of our fuzzing architecture. The test case generation is done by Fuzzino [L2], a fuzzing library developed by Fraunhofer FOKUS. It receives as input ABNF grammars, which describe individual MQTT message types and valid message sequences. Fuzzino uses these grammars to derive test cases, which each constitute one sequence of MQTT messages. Within the grammars we use a custom meta language, which models relationships that normally cannot be expressed with context-free grammars. For example, we use symbols which are resolved to dynamically calculated length fields or which define implicit relationships between symbols (e.g. between a password flag and a password field). This increases the validity of inputs we generate and thus increases the scope of functionalities we can test.
Figure 1: The fuzzing architecture that includes Fuzzino, the supervisor for the SUT and the SUT, showing the interaction between these components.
Once Fuzzino has derived an initial generation of test cases, it starts executing them against the MQTT broker, e.g. Mosquitto. For this, it first contacts a supervisor that starts and stops the MQTT broker before and after each test case. This is necessary to match findings to individual test cases and to detect errors which only become evident on termination (e.g. memory leaks). We collect vulnerability metrics for each executed test case. This might be, e.g. the time which the broker needs to respond or the code coverage we achieve. Some of these metrics can be measured directly (e.g. the response time), while others require additional monitoring systems or instrumentations. Memory errors for instance are detected with the help of AddressSanitizer, which is an instrumentation that monitors memory accesses at runtime. The recorded metrics and findings are reported back to Fuzzino via the supervisor. Fuzzino then uses this data to calculate its fitness scores, produce the next generation and repeat this process until a predefined number of generations is reached.
Evaluation
Our evaluation on Mosquitto suggests that our approach can be effective at systematically increasing targeted vulnerability indicators compared to fuzzing without genetic optimisation, e.g. in terms of the response time. Moreover, it could also be used to support the development of patches by producing diverse inputs for a vulnerability to ensure that the patch inhibits all ways a vulnerability may be triggered. For this, we use a fitness function which targets locations that trigger the vulnerability while also rewarding coverage diversity.
Furthermore, unguided, i.e. non-genetic fuzzing, was also effective. Namely, it found a bug within the current release of Mosquitto (2.0.18) in just a few minutes, which was confirmed to be a zero-day vulnerability. Furthermore, it found two unknown bugs on NanoMQ [L3], another MQTT broker we analysed. This highlights that unguided fuzzing can be especially useful if you want to broadly search for new vulnerabilities without specific attack scenarios or metrics in mind.
Conclusion
The combination of grammar-based and genetic fuzzing is a powerful technique which can support testing systems with specific attack scenarios in mind. This might be the simple optimisation towards certain vulnerability indicators or more complex scenarios where a test suite is derived based on a specific vulnerability. Unguided fuzzing can complement this approach as a technique to analyse the security of a system without other goals in mind.
Links:
[L1] https://mosquitto.org
[L2] https://www.fokus.fraunhofer.de/de/sqc/security_testing
[L3] https://nanomq.io
Reference:
[1] S. Lüdtke, et al., “Attack-based automation of security testing for IoT applications with genetic algorithms and fuzzing,” IEEE QRS, doi: 10.1109/QRS-C55045.2021.00023
Please contact:
Martin Schneider, Fraunhofer FOKUS, Berlin, Germany
Roman Kraus, Fraunhofer FOKUS, Berlin, Germany