Lecture: Identifying Multi-Binary Vulnerabilities in Embedded Firmware at Scale
Low-power, single-purpose embedded devices (e.g., routers and IoT devices) have become ubiquitous. While they automate and simplify many aspects of our lives, recent large-scale attacks have shown that their sheer number poses a severe threat to the Internet infrastructure, which led to the development of an IoT-specific cybercrime underground. Unfortunately, the software on these systems is hardware-dependent, and typically executes in unique, minimal environments with non-standard configurations, making security analysis particularly challenging. Moreover, most of the existing devices implement their functionality through the use of multiple binaries. This multi-binary service implementation renders current static and dynamic analysis techniques either ineffective or inefficient, as they are unable to identify and adequately model the communication between the various executables.
In this talk, we will unveil the inner peculiarities of embedded firmware, we will show why existing firmware analysis techniques are ineffective, and we will present Karonte, a novel static analysis tool capable of analyzing embedded-device firmware by modeling and tracking multi-binary interactions. Our tool propagates taint information between binaries to detect insecure, attacker-controlled interactions, and effectively identify vulnerabilities.
We will then present the results and insights of our experiments. We tested Karonte on 53 firmware samples from various vendors, showing that our prototype tool can successfully track and constrain multi-binary interactions. In doing so, we discovered 46 zero-day bugs, which we disclosed to the responsible entities. We performed a large-scale experiment on 899 different samples, showing that Karonte scales well with firmware samples of different size and complexity, and can effectively and efficiently analyze real-world firmware in a generic and fully automated fashion.
Finally, we will demo our tool, showing how it led to the detection of a previously unknown vulnerability.
1. Introduction to IoT/Embedded firmware [~7 min]
* A brief intro to the IoT landscape and the problems caused by insecure IoT devices.
* Overview of the peculiarities that characterize embedded firmware.
* Strong dependence from custom, unique environments.
* Firmware samples are composed of multiple binaries, in a file system fashion (e.g., SquashFS).
* Example of how a typical firmware sample looks like.
2. How to Analyze Firmware? [~5 min]
* Overview on the current approaches/tools to analyze modern firmware and spot security vulnerabilities.
* Description of the limitations of the current tools.
* Dynamic analysis is usually unfeasible, because of the different, customized environments where firmware samples run.
* Traditional, single-binary static analysis generates too many false positives because it does not take into account the interactions between the multiple binaries in a firmware sample.
3. Modeling Multi-Binary Interactions [~5 min]
* Binaries/processes communicate through a finite set of communication paradigms, known as Inter-Process Communication (or IPC) paradigms.
* An instance of an IPC is identified through a unique key (which we term a data key) that is known by every process involved in the communication.
* Data keys associated with common IPC paradigms can be used to statically track the flow of attacker-controlled information between binaries.
4. Karonte: Design & Architecture [~15 min]
* Our tool, Karonte, performs inter-binary data-flow tracking to automatically detect insecure interactions among binaries of a firmware sample, ultimately discovering security vulnerabilities (memory-corruption and DoS vulnerabilities). We will go through the steps of our approach.
* As a first step, Karonte unpacks the firmware image using the off-the-shelf firmware unpacking utility binwalk.
* Then, it analyzes the unpacked firmware sample and automatically retrieves the set of binaries that export the device functionality to the outside world. These border binaries incorporate the logic necessary to accept user requests received from external sources (e.g., the network), and represent the point where attacker-controlled data is introduced within the firmware itself.
* Given a set of border binaries, Karonte builds a Binary Dependency Graph (BDG) that models communications among those binaries processing attacker-controlled data. The BDG is iteratively recovered by leveraging a collection of Communication Paradigm Finder (CPF) modules, which are able to reason about the different inter-process communication paradigms.
* We perform static symbolic taint analysis to track how the data is propagated through the binary and collect the constraints that are applied to such data. We then propagate the data with its constraints to the other binaries in the BDG.
* Finally, Karonte identifies security issues caused by insecure attacker-controlled data flows.
5. Evaluation & Results [~10 min]
* We leveraged a dataset of 53 modern firmware samples to study, in depth, each phase of our approach and evaluate its effectiveness to find bugs.
* We will show that our approach successfully identifies data flows across different firmware components, correctly propagating taint information.
* This allowed us to discover potentially vulnerable data flows, leading to the discovery of 46 zero-day software bugs, and the rediscovery of another 5 n-days bugs.
* Karonte provided an alert reduction of two orders of magnitude and a low false-positive rate.
* We performed a large-scale experiment on 899 different firmware samples to assess the scalability of our tool. We will show that Karonte scales well with firmware samples of different size and complexity, and thus can be used to analyze real-world firmware.
6. Demo of Karonte [~5 min]
* We will show how Karonte analyzes a real-world firmware sample and detects a security vulnerability that we found in the wild.
* We will show the output that Karonte produces and how analysts can leverage our tool to test IoT devices.
7. Conclusive Remarks [~3 min]
* A reprise of the initial questions and summary of the takeaways.