___            __      __                         
-.     .-.   | __|(+) _ _ _ _\ \    / /(+) _ _ ___    .-.     .-
  \   /   \  | _|  | | '_| '  \ \/\/ /  | | '_/ -_)  /   \   /  
   '-'     '-|_|   | |_| |_|_|_\_/\_/   | |_| \___|-'     '-'   
             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~             

FirmWire

FirmWire is a full-system baseband firmware analysis platform that supports Samsung and MediaTek. It enables fuzzing, root-cause analysis, and debugging of baseband firmware images. See the FirmWire documentation to get started!

Experiments & Missing Parts?

Upon a vendor's request, the current public release of FirmWire is a preview version omitting some of the functionality described in the paper. We will publish the full version and automated scripts to replicate our experiments during NDSS'22 (April 24th-28th).

BibTeX

FirmWire is the result of a multi-year, cross-university research effort. See the paper for more details.

If you are using FirmWire in an academic paper, please use this to cite it:

@inproceedings{hernandez_firmwire_2022,
  title = {{FirmWire: Transparent Dynamic Analysis for Cellular Baseband Firmware}},
  shorttitle = {{FirmWire}},
  booktitle = {{ Symposium on Network and Distributed System Security (NDSS) }},
  author = {Hernandez, Grant and Muench, Marius and Maier, Dominik and Milburn, Alyssa and Park, Shinjo and Scharnowski, Tobias and Tucker, Tyler and Traynor, Patrick and Butler, Kevin R. B.},
  year = {2022}
}

Installation

The recommended way to use FirmWire is through the supplied Dockerfile. To build the Docker image, execute the following commands:

git clone https://github.com/FirmWire/FirmWire.git
cd FirmWire
git clone https://github.com/FirmWire/panda.git

# This will take some time
docker build -t firmwire .

Afterwards, you can obtain an interactive shell to a docker environment with FirmWire installed by executing:

docker run --rm -it -v $(pwd):/firmwire firmwire

From here, you can head straight to our quick start documentation to emulate your first modem!

Visual Studio Code

As an alternative to using Docker from your command line, you can create a FirmWire environment in VS Code by using the devcontainer and Docker extensions. After cloning FirmWire and FirmWire's version of Panda, open the corresponding directory in VS Code and execute: > Remote-Containers: Add Development Container Configuration Files. Then select From Dockerfile, which should automatically create a .devcontainer file. Afterwards, follow VS Code's prompt to Reopen in container.

This will build the Docker container and provide you with an interactive shell inside the Docker environment, with files transparently forwarded to the host directories. This is the favorite development setup for some of the FirmWire developers!

Manual Installation

The manual installation of FirmWire is a bit more tedious. Besides installing FirmWire and its requirements, you also need to:

  1. Manually build Panda
  2. Install PyPanda
  3. Manually build the FirmWire mods

For information on how to carry out these individual steps, please refer to the Dockerfile.

Quick Start

Have you installed FirmWire and are you eager to emulate your first modem? Very good! All you have to run after installation is:

$ ./firmwire.py modem.bin

This will automatically recognize the firmware, unpack it, and select a loader and machine to run it. You can also load firmware from a URL to get started:

$ ./firmwire.py https://github.com/grant-h/ShannonFirmware/raw/master/modem_files/CP_G973FXXU3ASG8_CP13372649_CL16487963_QB24948473_REV01_user_low_ship.tar.md5.lz4

Currently, FirmWire supports a subset of MediaTek MTK and Samsung Shannon firmware images.

Please note that FirmWire requires several TCP ports for its operation. If you have any restrictions on which ports can be used, pass the --consecutive-ports flag with the first port of a free range. For instance, if ports 10000-10005 are free to use on your system, invoke FirmWire as follows:

$ ./firmwire.py --consecutive-ports 10000 modem.bin
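
Before picking a start port, it can help to confirm that the whole range is actually free. The following is a minimal sketch and not part of FirmWire: the helper name first_busy_port is hypothetical, and the default of six ports is an assumption taken from the 10000-10005 example above.

```python
import socket


def first_busy_port(start, count=6, host="127.0.0.1"):
    """Return the first port in [start, start + count) that cannot be bound, or None.

    `count=6` is an assumption based on the 10000-10005 example, not a FirmWire constant.
    """
    for port in range(start, start + count):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                # Binding failed, so something else already holds this port.
                return port
    return None


if __name__ == "__main__":
    busy = first_busy_port(10000)
    print("range is free" if busy is None else f"port {busy} is already in use")
```

If the helper reports a busy port, choose a different start value and pass that to --consecutive-ports instead.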

Supported Images

MediaTek

  • Samsung A10s (MT6762)
  • Samsung A41 (MT6768)

Shannon

  • Most images for Galaxy S7, S7e (S335)
  • Moto One Vision (S337)
  • Galaxy S8, S8+ (S355)
  • Galaxy S9 (S360)
  • Galaxy S10, S10e (S5000)

Using Ghidra

We have custom patches to Ghidra which are required if you are analyzing MediaTek firmware. See https://github.com/FirmWire/ghidra for setup instructions. For Shannon firmware, see https://github.com/grant-h/ShannonBaseband#getting-started-with-shannon-firmware. You will need the ShannonLoader, which can be installed onto the custom Ghidra for MediaTek (or just use the upstream Ghidra).

Known Issues

  • MediaTek snapshotting is hacky. CCCI FSD has file system state that needs to be specially saved
  • After snapshotting, segfaults in Panda may occur. Just restore from snapshot to resume
  • Ctrl+C during console mode doesn't work. Use Ctrl+\

Technical Background

FirmWire is a baseband analysis platform. As input, it takes a baseband firmware image and tries to create an emulation environment for this image on-the-fly.

Emulation Core

The Emulation Core of FirmWire is built on top of avatar2 and PANDA. The core emulation capabilities are provided by PANDA, while avatar2 is used as middleware to orchestrate the execution state of the emulator, including spin-up, breakpoint registration, and starting/stopping of the emulation. Additionally, we use avatar2's Python Peripherals to implement peripherals which react to Memory-Mapped I/O accesses.

Under the hood, FirmWire implements vendor-specific machines which use avatar2's PyPanda target to embed PANDA as a dynamic library in the same process space as the Python interpreter, keeping the inter-process communication required by FirmWire to a bare minimum.

Emulator configuration

PANDA and avatar2 use the so-called configurable machine to enable emulation of arbitrary embedded systems with custom memory mappings. In essence, the embedded systems' memory map (including ROM, RAM, and peripherals) is described in a JSON file, which gets automatically generated by avatar2 based on individually registered memory ranges. This JSON file is then passed on to PANDA, which uses it to register and emulate the memory ranges accordingly.

Inside FirmWire, we use the configurable machine to create the emulation environments for the target baseband images on-the-fly. In more detail, our loader is responsible for parsing a binary firmware file and automatically extracting the required memory mappings, for instance by finding pre-defined MPU tables within the binary image.
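
To make the idea concrete, the sketch below collects registered memory ranges and serializes them to JSON. It is illustrative only: the field names (name, address, size, permissions), the top-level key memory_mapping, the example ranges, and the helper names are all assumptions and do not reproduce the schema that avatar2 actually emits.

```python
import json


def add_range(ranges, name, address, size, permissions="rwx"):
    """Register one memory range, rejecting overlaps with ranges already present."""
    for r in ranges:
        if address < r["address"] + r["size"] and r["address"] < address + size:
            raise ValueError(f"{name} overlaps {r['name']}")
    ranges.append(
        {"name": name, "address": address, "size": size, "permissions": permissions}
    )


def to_machine_json(ranges):
    """Serialize the ranges, sorted by address, into a JSON document (hypothetical schema)."""
    ordered = sorted(ranges, key=lambda r: r["address"])
    return json.dumps({"memory_mapping": ordered}, indent=2)


if __name__ == "__main__":
    ranges = []
    # Example values only; a real loader would derive these from the firmware image.
    add_range(ranges, "rom", 0x00000000, 0x10000, "r-x")
    add_range(ranges, "ram", 0x40000000, 0x100000, "rw-")
    print(to_machine_json(ranges))
```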

This Manual

The rest of this manual will guide you through FirmWire from a user's perspective. In case you are interested in developing or extending the core functionality of FirmWire, please stay tuned. Alternatively, you can dig through the source code, or reach out to us - we are happy to provide additional information wherever needed!

Command Line Interface Reference

This part of our documentation serves as a quick reference for all the firmwire.py and firmwire_dev.py CLI arguments, and provides links to where they are covered. For more information about the individual command line flags, you can also run FirmWire with the --help flag.

firmwire.py arguments

Argument | Covered in | Description
modem_file | Getting Started | The modem file FirmWire shall create an emulation environment for. The only mandatory argument.
--consecutive-ports CONSECUTIVE_PORTS | Getting Started | Choose consecutive ports for any listening sockets (e.g. QEMU's GDB & QMP), starting with the port provided.
-h/--help | CLI reference | Show help for the different CLI flags on the command line.
-w/--workspace WORKSPACE | Workspaces | Path to the workspace to use.
--snapshot-at SNAPSHOT_AT | Workspaces | Address and name for taking a snapshot. (Syntax: address,name)
--restore-snapshot SNAPSHOT_NAME | Workspaces | Name of the snapshot to be restored.
-t/--module INJECTED_TASK | Modkit | Module / task to be injected into the baseband modem.
-S/--stop | Interactive exploration | Stop the CPU after initializing the machine. Useful for interactive exploration.
-s/--gdb-server | Interactive exploration | Start GDB server on TCP port. Default is 1234. NOTE: this is a minimal GDB stub.
--console | Interactive exploration | Spawn an IPython remote kernel that can be connected to from another terminal using jupyter console --existing.
--fuzz FUZZ | Fuzzing | Inject and invoke the passed AFL fuzz task module (headless).
--fuzz-input FUZZ_INPUT | Fuzzing | Path to the AFL test case (@@ should be sufficient) or just the path to a single test file.
--fuzz-triage FUZZ_TRIAGE | Fuzzing | Invoke the fuzzer, but without an AFL front end. Enables debug hooks and saves code coverage.
--fuzz-persistent FUZZ_PERSISTENT | Fuzzing | Enable persistent fuzzing with a loop count as the argument.
--fuzz-crashlog-dir FUZZ_CRASHLOG_DIR | Fuzzing | Folder to which logs of all testcases (length testcase) for a crashing run in persistent mode are written.
--fuzz-crashlog-replay FUZZ_CRASHLOG_REPLAY | Fuzzing | Replay a persistent-mode crash trace written with --fuzz-crashlog-dir.
--fuzz-state-addr-file FUZZ_STATE_ADDR_FILE | Fuzzing | Text file containing the hex addresses of state variables.
--full-coverage | Fuzzing | Enable full coverage collection (logs every executed basic block).
--shannon-loader-nv_data NV_DATA | TBD | (Shannon only) Specify the NV_DATA to be used.
--mtk-loader-nv_data NV_DATA | TBD | (MediaTek only) Specify the NV_DATA to be used.

Developer options

Note: These arguments are mostly useful for development and debugging. As of now, they are part of firmwire.py, but will be moved to a custom firmwire_dev.py interface to clearly distinguish developer and user features in a future iteration of FirmWire.

Argument | Covered in | Description
--debug | TBD | Enable FirmWire debugging.
--debug-peripheral | TBD | Enable debugging for specified peripherals.
--avatar-debug | TBD | Enable debug logging for Avatar2.
--avatar-debug-memory | TBD | Enable Avatar2 remote memory debugging (useful when Peripherals crash).
--unassigned-access-log | TBD | Print log messages when memory accesses to undefined memory occur.
--raw-asm-logging | TBD | Print assembly basic blocks as QEMU executes them. Useful for determining infinite loops.
--trace-bb-translation | TBD | Print the address of each new basic block. Useful for evaluating which basic blocks are reached during fuzzing.

Workspaces

FirmWire uses workspaces tied to the specific firmware file under analysis. These workspaces contain a variety of useful files, most notably logs emitted by the avatar2-orchestration, the configurable machine definition, and a qcow2-image used for FirmWire's snapshotting mechanism, as well as vendor-specific files and directories.

By default, FirmWire creates a workspace in the same directory as the modem file, but this behavior can be overridden via the -w/--workspace command line flag.

Snapshots

One of FirmWire's convenience features is snapshotting, which is implemented on top of QEMU. Besides storing the emulation machine state in QEMU's qcow2 image format, FirmWire also saves the state of used python peripherals in auxiliary .snapinfo files.

To take a snapshot, use the --snapshot-at command line argument or call the snapshot() method during interactive exploration. Suppose you want to take a snapshot with the name my_first_snapshot at address 0x464d5752. To take the snapshot from the command line, simply run ./firmwire.py --snapshot-at 0x464d5752,my_first_snapshot modem_file. When using interactive exploration, you have direct access to the Python machine object via self. Make sure to stop execution at the desired address (for instance by setting a breakpoint), and then execute self.snapshot("my_first_snapshot"). Alternatively, if you don't want to steer execution manually, you can use self.snapshot_state_at_address(0x464d5752, "my_first_snapshot").

To start execution from this snapshot the next time you launch FirmWire, all you need to do is run ./firmwire.py --restore-snapshot my_first_snapshot modem_file. If you use interactive exploration, you can even restore snapshots on the fly, without restarting the emulator! In this case, you would execute self.restore_snapshot("my_first_snapshot").
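
The address,name syntax accepted by --snapshot-at can be illustrated with a small parser. This is a sketch of the format only, not FirmWire's own argument handling: the function name parse_snapshot_at is hypothetical, and accepting both hexadecimal and decimal addresses is an assumption.

```python
def parse_snapshot_at(value):
    """Split an 'address,name' string into (int address, str name).

    Hypothetical helper; FirmWire's real argument handling may differ.
    """
    address_text, sep, name = value.partition(",")
    if not sep or not name:
        raise ValueError("expected ADDRESS,NAME")
    # Base 0 lets int() accept '0x...' hexadecimal as well as plain decimal.
    return int(address_text, 0), name


if __name__ == "__main__":
    address, name = parse_snapshot_at("0x464d5752,my_first_snapshot")
    print(hex(address), name)
```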

PatternDB

PatternDB is a convenient way to define memory patterns which FirmWire uses to scan the binary baseband firmware during load-time. You can think of FirmWire memory patterns as binary regexes tailored towards firmware analysis tasks. Once a pattern is found, FirmWire associates a symbol with the matching location (in the simplest case) and, optionally, executes lookup and post-lookup functions. The patterns themselves are defined in the pattern.py file present in the different vendor plugins.

PatternDB is used at various places inside FirmWire: For finding MPU-tables during load-time, automatically resolving logging functions, or exporting symbols to the Modkit, to just provide a few examples. At the time of FirmWire's public release, we provide 18 patterns for Shannon-based modems and 9 for MediaTek-based modems, tested on a variety of firmware images.

Pattern Syntax

In our paper, we formally defined the syntax for our patterns as follows:

Pattern := {
  name := string
  pattern := [ PatternSyntax... ]
  lookup := PatternFn?
  post_lookup := PatternFn?
  required := bool?
  for := [ string... ]?
  within := [ AddressSet... ]?
  offset := integer?
  offset_end := integer?
  align := integer?
}

PatternSyntax :=
r"([a-fA-F0-9]{2}|([?][?+*]))+"
PatternFn := code
AddressSet := SymbolName | AddressRange
SymbolName := string
AddressRange := [integer, integer]

But what does this actually mean? Let's consider the following pattern taken from Shannon's pattern.py:

   "boot_setup_memory" : {
        "pattern" : [
            "00008004 200c0000",
            "00000004 ????0100", 
        ],
        "offset" : -0x14,
        "align": 4,
        "post_lookup" : handlers.parse_memory_table,
        "required" : True,
    },

Here, we define two patterns which are used to create the PatternDB symbol boot_setup_memory, using hexadecimal notation of the searched bytes in little-endian encoding. Note that the second pattern includes ?? symbols: these are wildcards, which allow us to match arbitrary bytes. Wildcard bytes specified with ?? allow for modifiers as known from regular regexes (pun intended!). ?+ requires the presence of one or more wildcard bytes, while ?* allows for zero or more wildcard bytes at the given location to result in a match.
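
To illustrate how such a pattern string behaves, the sketch below translates it into a Python bytes regex. This is not FirmWire's actual matcher: the function name compile_pattern, the choice of non-greedy matching for ?+ and ?*, and ignoring whitespace between byte groups are assumptions made for the example.

```python
import re

# One token is either a hex byte ("a0") or a wildcard ("??", "?+", "?*").
TOKEN = re.compile(r"[0-9a-fA-F]{2}|\?[?+*]")


def compile_pattern(pattern):
    """Translate a PatternDB-style hex pattern into a compiled bytes regex (sketch)."""
    compact = "".join(pattern.split())  # assumption: whitespace is only for readability
    parts = []
    pos = 0
    while pos < len(compact):
        match = TOKEN.match(compact, pos)
        if match is None:
            raise ValueError(f"invalid pattern near {compact[pos:pos + 2]!r}")
        token = match.group(0)
        if token == "??":
            parts.append(b".")      # exactly one arbitrary byte
        elif token == "?+":
            parts.append(b".+?")    # one or more arbitrary bytes
        elif token == "?*":
            parts.append(b".*?")    # zero or more arbitrary bytes
        else:
            parts.append(re.escape(bytes.fromhex(token)))
        pos = match.end()
    # DOTALL so that "." also matches the byte 0x0a.
    return re.compile(b"".join(parts), re.DOTALL)


if __name__ == "__main__":
    regex = compile_pattern("00000004 ????0100")
    data = b"\xff" + bytes.fromhex("00000004aabb0100")
    print(regex.search(data).start())
```

In this example the two ?? bytes match 0xaa and 0xbb, so the search reports a match starting one byte into the data.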

Going back to our example pattern, the actual address associated with the boot_setup_memory symbol will be 0x14 bytes before the location of the found pattern, as specified by the offset parameter. The align parameter defines that the search granularity is 4-byte aligned, and required will cause FirmWire to exit immediately if this pattern is not found, as it is crucial for generating the emulation environment. Lastly, post_lookup takes a reference to a Python function to be executed after the lookup has completed. The function signature for this specific post_lookup function is as follows:

def parse_memory_table(self, sym, data, offset): 

Here, self is a reference to the ShannonMachine, sym a reference to the PatternDB symbol, data the memory block that was searched, and offset the start offset for the search considering the virtual location of the data block. The PatternDB symbol sym, in turn, contains information about the address, name, and type of the symbol.

Pattern KeyWord Details

Keyword | Description
name | The name of the pattern and the resulting symbol (string).
pattern | One or more memory patterns which will create the result on match.
lookup | Function to use instead of pattern. Parameters are the data block to be searched and the offset to start. Expected to return None or an integer denoting the address.
post_lookup | Function to be executed after a successful match. Parameters are described in the example above. Expected to return True on success, else False.
required | When set to True, FirmWire will not continue execution when no match is found.
for | Specify the SoC version in case the symbol shall only be looked up for certain SoC versions.
within | On an image with existing symbols, specify in which function to look for this pattern.
offset | Offset between the matched pattern and the address of the created symbol.
offset_end | Same as offset, but calculated from the end of the matched pattern rather than from the start.
align | Memory alignment required for found matches.
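
The interplay of offset, offset_end, and align can be sketched as a small address calculation. This is a hedged illustration rather than FirmWire's code: the function name symbol_address is hypothetical, and two behaviors are assumptions, namely that alignment is checked against the start of the match and that offset_end takes precedence over offset when both are given.

```python
def symbol_address(base, match_start, match_end, offset=0, offset_end=None, align=None):
    """Compute a symbol address from a match inside a data block loaded at `base`.

    Returns None when the match does not satisfy the requested alignment.
    """
    # Assumption: the alignment requirement applies to the start of the match.
    if align is not None and (base + match_start) % align != 0:
        return None
    # Assumption: offset_end, when present, wins over offset.
    if offset_end is not None:
        return base + match_end + offset_end
    return base + match_start + offset


if __name__ == "__main__":
    # Match at 0x114..0x124 in a block at 0x40000000, using offset -0x14 and align 4
    # as in the boot_setup_memory example. The base and match positions are made up.
    print(hex(symbol_address(0x40000000, 0x114, 0x124, offset=-0x14, align=4)))
```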

Defining your own Pattern

You want to define your own pattern? Great! Just extend the pattern.py file in the corresponding vendor plugin to include your pattern, and it should be automatically scanned for during the next start of FirmWire.

Modkit

One of the core features of FirmWire is its modkit, which allows you to create and compile your own modules and tasks to be injected into the emulated baseband image. The modkit serves as the basis for our fuzzing modules, as well as for the GuestLink interactive exploration capabilities.

In a nutshell, mods are C programs, which use the symbols created with patternDB and the vendor specific loaders to extend the functionality of the existing baseband firmware image. These C programs need to be pre-compiled by using Makefiles supplied by us. Then, FirmWire can inject these tasks during run time, automatically resolving the symbols and placing the task in an unused memory segment.

Toolchain & Compilation

To compile tasks, target-specific compilation toolchains are required. For an Ubuntu 20.04 system, we had success with the following toolchains provided by the distribution's package repository: gcc-9-mipsel-linux-gnu for MediaTek-based firmware, and gcc-arm-none-eabi for Shannon baseband firmware.

After installing the toolchains, the modules can be compiled by browsing to the modkit directory and running make inside the vendor-specific subdirectories (i.e. mtk and shannon).

In case you want to extend the modkit and provide your own mod, you will need to adjust the Makefile. In particular, you need to modify the MODS line and provide the path to your mod's source. To exemplify this, let's assume you want to add mymod to the mods available for emulated Shannon modems.

Before modification, the relevant section in the Makefile should look something like this:

MODS := gsm_mm gsm_sm gsm_cc lte_rrc glink

gsm_mm_SRC := fuzzers/gsm_mm.c afl.c
gsm_cc_SRC := fuzzers/gsm_cc.c afl.c
gsm_sm_SRC := fuzzers/gsm_sm.c afl.c
lte_rrc_SRC := fuzzers/lte_rrc.c afl.c
glink_SRC := glink.c

Assuming you have your source code in mymod.c, this part of the Makefile should look as follows after modification:

MODS := gsm_mm gsm_sm gsm_cc lte_rrc glink mymod

gsm_mm_SRC := fuzzers/gsm_mm.c afl.c
gsm_cc_SRC := fuzzers/gsm_cc.c afl.c
gsm_sm_SRC := fuzzers/gsm_sm.c afl.c
lte_rrc_SRC := fuzzers/lte_rrc.c afl.c
glink_SRC := glink.c
mymod_SRC := mymod.c

After these small modifications, your mod will be compiled as well when running make!

Modkit format

To further exemplify how the modkit is used, let's look at a very basic task: The hello_world task for MTK basebands.

The source code for this task looks as follows:

#include <task.h>
#include <modkit.h>
#include <hello_world.h>

MODKIT_FUNCTION_SYMBOL(void, dhl_print_string, int, int, int, char *)

extern void real_main() {
    while(1) {
        dhl_print_string(2, 0, 0, "hello world\n");
    }
}

There is not a lot of code, is there? Let's go through the lines. The first include imports the MediaTek task logic, which is required to make sure our code is embedded correctly, following the baseband-specific task structure. Similarly, the second line includes high-level modkit functionality. The third line includes hello_world.h, whose contents are:

#ifndef HELLO_WORLD_H
#define HELLO_WORLD_H

const char TASK_NAME[] = "testtask\0";

#endif

The only important line here is the specification of the task name, which is set to testtask.

Coming back to hello_world.c, the fifth line is where things get interesting:

MODKIT_FUNCTION_SYMBOL(void, dhl_print_string, int, int, int, char *)

This directive is used to advise the modkit to "resolve" a function which is part of the original modem firmware. The general syntax for it is:

MODKIT_FUNCTION_SYMBOL(return_type, function_name, type_argument1, type_argument2, ..., type_argumentN)

After using this directive, the selected function becomes available to the C program, so in this case we can use dhl_print_string later in the code, which is used to provide logging output.

The next part of the code defines the real_main() function, which is used by the MediaTek modkit to determine where execution should start for this task (in the case of Shannon mods, the corresponding function name would be task_main). This main function does nothing more than use the resolved dhl_print_string function to print "hello world" repeatedly to the console. Neat!

Running the task

Providing the code for the injected task is only the first step; of course, we also want to run it. Luckily, apart from running make, FirmWire automates the full process of injecting the task. Once built via make, the task can be injected by invoking FirmWire with the -t/--module flag.

Taking the hello world task from above as an example, this would look something like this:

$ python3 firmwire.py -t hello_world modem.bin
              ___            __      _                          
-.     .-.   | __|(+) _ _ _ _\ \    / /(+) _ _ ___    .-.     .-
  \   /   \  | _|  | | '_| '  \ \/\/ /  | | '_/ -_)  /   \   /  
   '-'     '-|_|   | |_| |_|_|_\_/\_/   | |_| \___|-'     '-'   
             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~             
                A  baseband  analysis  platform
                   https://github.com/FirmWire
[INFO] firmwire.loader: Reading firmware using ShannonLoader (shannon)
[ERROR] firmwire.vendor.shannon.loader: Modem CP tarfile is missing required modem.bin
[ERROR] firmwire.loader: Loading failed!
[INFO] firmwire.loader: Reading firmware using MTKLoader (mtk)
[INFO] firmwire.vendor.mtk.loader: Found new file md1rom at 0x0/0x0 with length 0x169eca4

[...] (Loading informations)

[INFO] firmwire.vendor.mtk.machine: Resolved dynamic symbol dhl_print_string (0x90287e25) -> 0x9f4000a0 (FUNC)
No Memory range specified at 0x913d66e4
[WARN] firmwire.vendor.mtk.machine: Overwriting an existing task
[INFO] firmwire.vendor.mtk.machine: Injecting Task at 0x9f400000 (stack: 0x9f4010e0)
Injecting contents to 913d66e4: b'7000409f284b3d910001ff00001000003500409f0000000101000000ffffffff'
No Memory range specified at 0x913d66e4
After injecting task

[...] (Lots of Logs)

[49.46536][SSIPC] 0x90e353b7 [SSIPC][ILM_MSG] waiting msg

[49.46661][testtas] 0x9f400033 hello world

[...]

As we can see, FirmWire automatically resolved dhl_print_string and injected the hello world task, which then later printed its output to the command line!

There are also other ways to invoke a mod, namely by using the --fuzz/--fuzz-triage or the --interactive command line flags. More about these will be covered in the next chapters!

Interactive Exploration

FirmWire has multiple ways to facilitate interactive exploration of the emulated baseband firmware. The reasons for such exploration vary, ranging from aiding static reverse engineering, through observing the baseband's behavior when receiving custom messages, to root-cause analysis for crashing inputs.

GDB

The most classic way to interact with the emulated baseband is via GDB. Simply start FirmWire with the -s/--gdb-server flag while specifying a port number to start a gdb server! Then, you can start up your local gdb build (we recommend gdb-multiarch for Ubuntu 20.04) and connect to the emulated baseband by executing from within gdb:

target remote :PORT

Alternatively, when using gdb together with gef, we suggest running the following instead for better usability:

gef-remote --qemu-mode 127.0.0.1:PORT

Once connected, you should be able to set breakpoints, inspect and modify memory, and steer execution just as usual. Under the hood, FirmWire spawns a gdb server provided by the corresponding avatar2 plugin. This allows transparent access to both the memory provided by avatar-backed memory ranges (as in the case of Python peripherals) and the emulated memory provided by PANDA.

NOTE: For ARM targets (Shannon), due to mixed ARM / Thumb2, you may need to set breakpoints at your target addresses +/- 1 byte. This is a limitation of QEMU's ARM gdbstub, causing breakpoints to be missed depending on the code's ISA mode.
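
On ARM, the lowest bit of a code address conventionally signals Thumb mode, which is why the even and odd variants of an address are the two candidates worth trying. The helper below is a small sketch for computing both; the name breakpoint_candidates and the example address are hypothetical and not part of FirmWire.

```python
def breakpoint_candidates(address):
    """Return the even (bit 0 clear) and odd (bit 0 set) variants of an address."""
    even = address & ~1
    return [even, even | 1]


if __name__ == "__main__":
    # Example address only; substitute the function address you are interested in.
    for candidate in breakpoint_candidates(0x40001001):
        print(f"break *{candidate:#x}")
```

The printed lines can be pasted into gdb; if a breakpoint at one variant is missed, the other one is the next thing to try.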

What's more, gdb's monitor command gives you direct access to the Python context of the avatar2 gdb server and allows you to execute simple Python statements. You can even access the global avatar object from gdb by executing:

monitor self.avatar

However, if you are really eager to control FirmWire's, and the emulated baseband's, execution state from Python, we recommend using IPython as described below.

Console

FirmWire offers a second convenient way of controlling execution: an IPython/Jupyter console interface. To invoke it, run FirmWire with the --console flag:

$ python3 firmwire.py modem.bin --console

Then, after the initial FirmWire startup, you can connect to the console from a second terminal:

$ jupyter-console --existing

In here, you have full access to FirmWire's API, and your main interface to interact with the emulated baseband is its machine abstraction, which is directly accessible via self.

While explaining the full API and all available functions would be too extensive for this guide, below are the most important objects and methods provided by FirmWire's machine object.

Object | Description
self.avatar | The avatar object, providing information about memory ranges and Python peripherals.
self.loader | The vendor-specific loader, providing information about the loaded firmware image.
self.loader.symbol_table | The symbols extracted by the loader, using PatternDB and auxiliary information.
self.modkit | Handle to FirmWire's modkit.
self.panda | Handle to libpanda, the PANDA library with Python bindings.
self.qemu | Direct handle to avatar's PyPanda target, the avatar wrapper around libpanda. Follows avatar2's target API.

For steering execution and inspecting memory, the self.qemu object will most likely be your bread and butter. It provides functions such as cont(), stop(), set_breakpoint(), read_memory() or write_memory() - more information can be found in the avatar2 handbook.
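
As a small illustration of working with the target handle, the sketch below reads several consecutive words through read_memory(). It assumes only that the target exposes read_memory(address, size) and returns an integer for a single word; the helper name read_words and the StubTarget class are hypothetical and exist just so the example runs outside the FirmWire console.

```python
def read_words(target, address, count, word_size=4):
    """Read `count` consecutive words starting at `address` via target.read_memory()."""
    return [
        target.read_memory(address + index * word_size, word_size)
        for index in range(count)
    ]


class StubTarget:
    """Stand-in for self.qemu so the sketch can run without an emulator."""

    def __init__(self, memory):
        self.memory = memory  # maps address -> word value

    def read_memory(self, address, size):
        # Unmapped addresses read as zero in this stub only.
        return self.memory.get(address, 0)


if __name__ == "__main__":
    stub = StubTarget({0x1000: 0xDEADBEEF, 0x1004: 0x1})
    print([hex(word) for word in read_words(stub, 0x1000, 2)])
```

Inside the console, the same helper could be called as read_words(self.qemu, address, 4) with an address of your choice, ideally while the machine is stopped.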

Besides these objects and capabilities provided by the PANDA and avatar2 frameworks, FirmWire also provides a couple of additional methods to make exploration easier:

Method/Signature | Description
self.get_tasks() | Retrieves the RTOS tasks automatically identified by FirmWire in its abstract Python representation.
self.get_peripheral(name) | Retrieves a handle to the Python peripheral with the specified name.
self.restore_snapshot(snapshot_name) | Restores the snapshot with the given name.
self.run_for(t) | If the machine is stopped, continues execution for t seconds.
self.snapshot(snapshot_name) | Creates a snapshot of the current execution state with the given name.
self.set_breakpoint(address, handler) | Sets a breakpoint at the specified address and executes the code provided in handler when hit.

For Shannon basebands, the interactive capabilities are further extended by the special GuestLink peripheral.

GuestLink is a combination of a custom task injected into the baseband and a Python peripheral, allowing for interaction with the emulated baseband. To make use of it, start FirmWire with both the console activated and the glink task injected:

./firmwire.py -t glink --console ./modem.bin

Now, after connecting to the console as described above, you can get a handle to the glink peripheral:

In [1]: gl = self.get_peripheral('glink')

This GLink peripheral can be controlled from Python and uses an MMIO range to communicate with the GLink task. More specifically, the MMIO range is organized as follows:

struct glink_peripheral {
  uint32_t access;
  uint32_t tx_head;
  uint32_t tx_tail;
  uint32_t rx_head;
  uint32_t rx_tail;
  uint8_t tx_fifo[TX_FIFO_SIZE];
  uint8_t rx_fifo[RX_FIFO_SIZE];
} ;

The access field is used to communicate return values from the GLink task back to the Python peripheral, while the rest are data structures for input and output FIFO buffers. These buffers use a simple packet-based data format for communication:

struct glink_cmd_header {
  uint8_t cmd;
  uint8_t len;
  // next field is variable amount of octets
};
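
The two-byte header followed by a variable number of octets can be modeled in Python with the struct module. This is a sketch of the wire layout only, not the GLink peripheral's implementation: the helper names are hypothetical, the command number 0x01 is made up, and the reading that len counts only the octets after the header is an assumption.

```python
import struct

HEADER = struct.Struct("<BB")  # uint8_t cmd, uint8_t len


def pack_glink_cmd(cmd, payload=b""):
    """Serialize one command: header bytes followed by the payload octets."""
    if not 0 <= cmd <= 0xFF:
        raise ValueError("cmd must fit into one byte")
    if len(payload) > 0xFF:
        raise ValueError("payload does not fit the one-byte len field")
    # Assumption: len counts the payload octets, excluding the header itself.
    return HEADER.pack(cmd, len(payload)) + payload


def unpack_glink_cmd(buffer):
    """Parse one command from the start of `buffer` and return (cmd, payload)."""
    cmd, length = HEADER.unpack_from(buffer)
    payload = buffer[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated packet")
    return cmd, payload


if __name__ == "__main__":
    packet = pack_glink_cmd(0x01, b"\xaa\xbb")  # 0x01 is a made-up command number
    print(packet.hex(), unpack_glink_cmd(packet))
```

Because len is a single byte, a payload in this sketch is limited to 255 octets per packet.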

Currently, GuestLink's implementation only allows for commands from Python to the emulated baseband. The available commands are:

CMD | Python API | Description
GLINK_SEND_QUEUE_INDIR | gl.send_queue(True, src_qid_name, dst_qid_name, msg_group, payload) | Sends the specified message to a baseband-internal queue. The payload is provided as an allocated memory chunk, to be freed by the baseband.
GLINK_SEND_QUEUE | gl.send_queue(False, src_qid_name, dst_qid_name, msg_group, payload) | Sends the specified message to a baseband-internal queue. The payload is inlined with the msg_struct.
GLINK_SET_EVENT | gl.set_event(event) | Sets the baseband-internal event. event can either be int (event number) or bytes (event name).
GLINK_ALLOC_BLOCK | gl.create_block(size) | Allocates a chunk of memory of the given size. The address of the chunk can later be retrieved via gl.access.
GLINK_CALL_FUNC | gl.call_function(fn, args) | Calls function fn with the arguments specified in args. fn must be of type int, and args a list of ints. The return code of the function can later be retrieved from gl.access.

Interactive exploration: Tips & Tricks

Stopping execution on startup

FirmWire supports stopping after initialization. This means you can interactively step through the full boot process, or control execution in a fine-grained manner after restoring a snapshot. All you need to do is supply the -S/--stop flag to FirmWire on the command line.

When using GuestLink, keep in mind that GLink acts fully asynchronously, i.e., when calling a function from Python, the corresponding command is only written to the shared MMIO region. The GLink task in the baseband then has to parse and process the command before the result is available.

For better understanding, we provide a typical GuestLink usage example below, allocating a block of size 0x100 and storing the result in chunk_addr:

In [1]: self.qemu.stop()
In [2]: gl = self.get_peripheral('glink')
In [3]: gl.create_block(0x100)
In [4]: self.run_for(0.5)
In [5]: chunk_addr = gl.access
In [6]: hex(chunk_addr)
Out[6]: '0x44f0293c'

When using GLink in combination with snapshots, it is important to note that a reference to the GuestLink peripheral does not propagate across snapshots. That means that, after restoring a snapshot, a new reference to the GLink peripheral has to be acquired via get_peripheral, as shown below:

In [1]: gl = self.get_peripheral('glink')
In [2]: gl
Out[2]: <firmwire.hw.glink.GLinkPeripheral at 0x7f720c4f59d0>
In [3]: self.restore_snapshot("glink_demo")
In [4]: gl = self.get_peripheral('glink')
In [5]: gl
Out[5]: <firmwire.hw.glink.GLinkPeripheral at 0x7f7219792850>

Fuzzing

One of FirmWire's core contributions is the capability to fuzz the emulated baseband image using specialized fuzzing tasks. These tasks are created using our modkit, and use triforce-afl hypercalls to communicate with the fuzzer, AFL++.

This combination of injected tasks and hypercalls allows for transparent in-modem fuzzing: A fuzz task would get the input from the fuzzer and then send it as message to the targeted task. For the targeted task, the input received this way would look like benign input arriving over the usual channels.

FirmWire comes with some example fuzzing tasks, which were used in the evaluation of our paper. Let's look at one example task, to demonstrate how one would build a harness.

Example Harness: GSM CC

Below is the high-level overview of our gsm_cc harness for Shannon basebands:

#include <shannon.h>
#include <afl.h>

const char TASK_NAME[] = "AFL_GSM_CC\0";

static uint32_t qid;

int fuzz_single_setup()
{
    ...
}
void fuzz_single()
{
    ...
}

First, shannon.h is included to provide Shannon-specific convenience functions (e.g., uart_puts and pal_MemAlloc). Then, afl.h is included, which provides the main functionality and API for fuzzing. The API is a slightly modified version of the one provided by TriforceAFL and offers the following four functions:

| Function | Purpose |
| --- | --- |
| char * getWork(unsigned int *sizep) | Returns a buffer with fuzzing input and stores the input size into sizep. |
| int startWork(unsigned int start, unsigned int end) | Starts a fuzzing execution, while collecting coverage for code residing between start and end. |
| int doneWork(int val) | Marks the end of a fuzzing iteration, providing val as return code to the fuzzer. |
| int startForkserver(int ticks) | Starts the AFL forkserver. ticks controls whether QEMU ticks should be enabled or not. |

Besides this API, the afl.h/afl.c files also provide the basic skeleton for the fuzzing loop inside a task_main function:

void task_main() {
    [...]
    if (!fuzz_single_setup()) {
      uart_puts("[!] Fuzzer init error\n");
      for (;;) ;
    }
    uart_puts("[+] Fuzzer init complete\n");

    uart_puts("[+] Starting fork server\n");
    startForkserver(1, AFL_PERSISTENT_LOOP_CTX);

    while (1) {
      fuzz_single();
    }
}

As we can see, this logic requires two additional functions, fuzz_single_setup and fuzz_single, both of which need to be provided by our harness. The first function is responsible for all task-specific setup. In the case of gsm_cc, this means (1) resolving the queue ID for CC, (2) creating a qitem_cc memory chunk containing the correct msgGroup ID to trigger task initialization, and (3) sending the memory chunk as a message to the corresponding queue.

The full code for these three steps is shown below:

int fuzz_single_setup()
{
    qid = queuename2id("CC");

    struct qitem_cc * init = pal_MemAlloc(4, sizeof(struct qitem_cc), __FILE__, __LINE__);

    init->header.op = 0;
    init->header.size = 1;
    // 0x2a01 CC_INIT_REQ
    init->header.msgGroup = 0x2a01;
    pal_MsgSendTo(qid, init, 2);

    return 1;
}

The fuzz_single function is executed once per fuzzing iteration and is meant to forward the input from the fuzzer to the dedicated target task.

In the case of gsm_cc, this includes the following steps:

  1. Create a memory chunk for the qitem_cc (just as above).
  2. Get fuzzing input from the fuzzer using getWork().
  3. Validate that the input size is within valid boundaries.
  4. Set up the qitem_cc to have the correct msgGroup for RADIO_MSG types, as the contents of these are attacker-controlled.
  5. Copy the received input into the qitem_cc.
  6. Trigger the collection of coverage by calling startWork.
  7. Send the prepared message to the target task. This invokes the scheduler, and the fuzz task is only scheduled back in after the message was processed.
  8. Call doneWork to signal to the fuzzer that the input was processed and the next iteration can start.

In code, this looks as follows:

void fuzz_single()
{
    uint32_t input_size;
    uint16_t size;

    uart_puts("[+] Allocating Qitem\n");
    struct qitem_cc * item = pal_MemAlloc(4, sizeof(struct qitem_cc) + AFL_MAX_INPUT, __FILE__, __LINE__);

    if (!item) {
      uart_puts("ALLOC FAILED");
      return;
    }

    uart_puts("[+] Getting Work\n");
    char * buf = getWork(&input_size);
    size = (uint16_t) input_size;
    // GSM radio messages are usually limited in size
    size = size > 512 ? 512 : size;

    uart_puts("[+] Received n bytes: ");
    uart_dump_hex((uint8_t *) &size, sizeof(size)); // Print the clamped size for testing (size is only 2 bytes wide)

    if (size < 3) {
      startWork(0, 0xffffffff); // memory range to collect coverage
      doneWork(0);
      return;
    }

    uart_puts("[+] Filling the qitem\n");
    item->header.op1 = 0xaa;
    item->header.op2 = 0x20;

    // Only target the RADIO_MSG msg types that get sent to the MM task
    item->header.msgGroup = 0x2a3c;

    item->header.size = size;

    memcpy(item->payload, buf, size);

    uart_puts("[+] FIRE\n");
    startWork(0, 0xffffffff); // memory range to collect coverage

    pal_MsgSendTo(qid, item, 2);
    doneWork(0);
}

Further examples of how to write fuzzing harnesses can be found in the source code of our other harnesses. Our lte_rrc fuzzer, for instance, demonstrates what a fuzzer looks like when the targeted task requires (a) an event to trigger message processing and (b) the input to be delivered in a separate memory chunk (rather than inlined in the qitem).

Controlling the fuzzing process

Writing the fuzzing harness is only the first step; the second is to actually start the fuzzer. FirmWire requires at least two additional command line flags to facilitate fuzzing: --fuzz and --fuzz-input. The first one starts FirmWire in fuzzing mode, which disables console output, debugging hooks, and similar features to achieve maximum performance during fuzzing. The second flag tells FirmWire where to find the current fuzzing input, which is usually provided by AFL++ itself. A full command line for starting fuzzing, using gsm_cc as an example, looks like this:

$ afl-fuzz -i in -o out -U -- ./firmwire.py --restore-snapshot fuzz_base --fuzz gsm_cc --fuzz-input @@ modem.bin

Assuming you have some seed inputs in the in directory, this command line should bring you directly to the AFL++ window. Note how we used a snapshot here: as the boot time of the modem is quite long, AFL++ would time out without this snapshot. If you would like to fuzz without using a snapshot, we recommend setting the AFL_FORKSRV_INIT_TMOUT environment variable to a high value.
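
When launching the fuzzer from a script, the command line and environment can be assembled programmatically. The sketch below only builds the argument list and environment and does not start anything. The helper names and the default timeout of 600000 are illustrative assumptions, and AFL++ is assumed to interpret AFL_FORKSRV_INIT_TMOUT in milliseconds. The flags themselves are the ones shown above.

```python
import os


def build_fuzz_command(harness, modem, snapshot=None):
    """Assemble the afl-fuzz argument list using the flags from the text."""
    cmd = ["afl-fuzz", "-i", "in", "-o", "out", "-U", "--", "./firmwire.py"]
    if snapshot is not None:
        cmd += ["--restore-snapshot", snapshot]
    cmd += ["--fuzz", harness, "--fuzz-input", "@@", modem]
    return cmd


def build_fuzz_env(snapshot=None, init_timeout_ms=600000, base_env=None):
    """Copy the environment; raise the forkserver timeout if no snapshot is used."""
    env = dict(os.environ if base_env is None else base_env)
    if snapshot is None:
        # Assumed value: large enough to cover a full modem boot.
        env["AFL_FORKSRV_INIT_TMOUT"] = str(init_timeout_ms)
    return env
```

Both results can then be handed to subprocess.run(cmd, env=env) from the FirmWire directory.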

Persistent Mode

Besides fuzz mode, FirmWire provides another option for further improving fuzzing throughput: persistent-mode fuzzing. Instead of re-forking after every single fuzzing input, the emulator can process multiple inputs in a loop, re-forking only every N iterations. From a programming perspective, this simply means that fuzz_single() is invoked multiple times, and FirmWire keeps track of how many inputs were processed before issuing a new fork.

To use persistent mode, the command line only needs to be extended with --fuzz-persistent N. For example, to fuzz for 1000 iterations before re-forking, the command line above needs to be modified as follows:

$ afl-fuzz -i in -o out -U -- ./firmwire.py --restore-snapshot fuzz_base --fuzz gsm_cc --fuzz-input @@ --fuzz-persistent 1000 modem.bin
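
The effect of N on the number of forks can be illustrated with a toy model. This is not FirmWire's implementation: run_persistent and its process callback are hypothetical and merely count how often a fresh fork would be needed when N inputs are handled per fork.

```python
def run_persistent(inputs, persistent_n, process):
    """Process all inputs, starting a new 'fork' every persistent_n inputs."""
    if persistent_n < 1:
        raise ValueError("persistent_n must be at least 1")
    forks = 0
    for start in range(0, len(inputs), persistent_n):
        forks += 1  # a fresh fork begins a new persistent round
        for data in inputs[start:start + persistent_n]:
            process(data)  # stands in for one fuzz_single() invocation
    return forks
```

With 2500 inputs, N = 1000 needs three forks in this model, while N = 1 (one input per fork) needs 2500.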

To further improve persistent mode, we provide one additional feature: a persistent test case log. Usually, AFL++ saves only the last input upon a crash, as it assumes that the target state does not change between two fuzzing iterations. As baseband firmware is highly stateful, we cautiously violate this assumption during persistent fuzzing.

To avoid losing precious inputs that bring the baseband into different states during fuzzing, we also provide a --fuzz-crashlog-dir command line flag. The argument to this flag should point to a directory. Upon a crash, all inputs used in the corresponding persistent iteration are stored in a file within this directory.
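
The idea behind such a log can be sketched as follows. The class, its method names, and the length-prefixed file layout are hypothetical and chosen only for illustration. FirmWire's actual crashlog format may differ.

```python
import os


class PersistentCrashLog:
    """Toy model: remember every input of the current persistent round."""

    def __init__(self):
        self.inputs = []

    def start_round(self):
        # Called after each re-fork; earlier inputs no longer affect state.
        self.inputs = []

    def record(self, data):
        self.inputs.append(bytes(data))

    def serialize(self):
        # Hypothetical layout: 4-byte little-endian length, then the input.
        out = bytearray()
        for data in self.inputs:
            out += len(data).to_bytes(4, "little")
            out += data
        return bytes(out)

    def dump(self, directory, name):
        # Called on a crash: persist all inputs of the round in one file.
        os.makedirs(directory, exist_ok=True)
        path = os.path.join(directory, name)
        with open(path, "wb") as fh:
            fh.write(self.serialize())
        return path
```

Keeping the whole sequence matters because a crash may depend on state built up by earlier inputs of the same round, not only on the last one.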

Replaying Inputs

During fuzzing, you may encounter some crashes or timeouts. But how do you analyze them?

FirmWire provides a --fuzz-triage flag, which allows replaying fuzzing inputs for a specific harness while keeping logging output enabled. The following command line will replay a test case called crash.bin, located in the current directory:

$ ./firmwire.py --restore-snapshot fuzz_base --fuzz-triage gsm_cc --fuzz-input ./crash.bin modem.bin

Note that the fuzz-triage mode can also be coupled with different interactive capabilities (e.g., gdb) to facilitate root cause analysis.

Lastly, it is also possible to replay persistent crashlogs collected with --fuzz-crashlog-dir. This can be done by selecting the desired crashlog via --fuzz-crashlog-replay.
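
For replaying many test cases in a row, the triage command lines can be generated in a loop. The helper below is a sketch that only builds argument lists from the flags shown above. Its name and the default snapshot name are assumptions.

```python
def triage_commands(inputs, harness, modem, snapshot="fuzz_base"):
    """Return one --fuzz-triage command line per input file."""
    return [
        ["./firmwire.py", "--restore-snapshot", snapshot,
         "--fuzz-triage", harness, "--fuzz-input", str(path), modem]
        for path in inputs
    ]
```

The inputs could, for example, be the files AFL++ stored in its crashes directory (its exact location depends on the AFL++ version and output settings), and each returned list can be passed to subprocess.run.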

Happy Fuzzing!

Trophy Wall

FirmWire is intended as a tool to find security-critical bugs and to ease baseband-specific research. As such, we are happy to showcase how FirmWire is used! On this page, you can find details on vulnerabilities found with FirmWire, talks about the framework, and blog posts describing its usage.

Vulnerabilities

So far, FirmWire was involved in finding the following vulnerabilities:

| CVE | Severity | Finder | Description |
| --- | --- | --- | --- |
| CVE-2021-25479 | 7.2 (high) | Team FirmWire | A possible heap-based buffer overflow vulnerability in Exynos CP Chipset prior to SMR Oct-2021 Release 1 allows arbitrary memory write and code execution. |
| CVE-2021-25478 | 7.2 (high) | Team FirmWire | A possible stack-based buffer overflow vulnerability in Exynos CP Chipset prior to SMR Oct-2021 Release 1 allows arbitrary memory write and code execution. |
| CVE-2020-25279 | 9.8 (critical) | Team FirmWire | An issue was discovered on Samsung mobile devices with O(8.x), P(9.0), and Q(10.0) (Exynos chipsets) software. The baseband component has a buffer overflow via an abnormal SETUP message, leading to execution of arbitrary code. The Samsung ID is SVE-2020-18098 (September 2020). |
| CVE-2021-25477 | 4.9 (medium) | Team FirmWire | An improper error handling in Mediatek RRC Protocol stack prior to SMR Oct-2021 Release 1 allows modem crash and remote denial of service. |

Talks

| Title | Where | Who | Links | Description |
| --- | --- | --- | --- | --- |
| Emulating Samsung's Baseband for Security Testing | Blackhat USA'20 | Team FirmWire (Grant & Marius) | youtube, slides | Talk about FirmWire's first steps (back then, it had the working title ShannonEE). Discusses the fundamental architecture of the framework. |
| Reversing & Emulating Samsung’s Shannon Baseband | Hardwear.io NL'20 | Team FirmWire (Grant & Marius) | youtube, slides | Talk about the reverse engineering on Shannon-based modems which was required to build FirmWire. |
| FirmWire: Transparent Dynamic Analysis for Cellular Baseband Firmware | NDSS'22 | Team FirmWire (Grant) | TBD | Academic presentation of the FirmWire paper. |
| FirmWire: Taking Baseband Security Analysis to the Next Level | CanSecWest'22 | Team FirmWire (Grant, Marius & Dominik) | TBD | |

Blog posts

So far, we are not aware of any blog posts about FirmWire, but this may change in the future. ;)

Adding your Vulnerability, Talk, or Blogpost to this Trophy Wall

We are happy to hear about your FirmWire usage! If you want to include it in this trophy wall, first create a fork of the FirmWire repository in the GitHub UI. Then, clone the docs branch of your forked FirmWire repository:

$ git clone -b docs git@github.com:your_username/FirmWire.git

Afterwards, edit the trophy_wall.md file and add your resource to the corresponding table, e.g., via:

$ vim FirmWire/docs/src/trophy_wall.md

Once done, push your changes and send us a pull request on GitHub!