firmex/python/README.md
nganhkhoa ae41d9ce41 python prototype
detection for
- zip
- ambarella
- flatten device tree
- squashfs
2024-08-27 13:16:45 +07:00

This Python project is the scripted version of Firmex and serves as an early prototype. I want to write the tool in a better-structured language than Python, but Python is better suited to the prototype phase.

Firmware Extraction

The main goal of this project is to detect and extract the contents of a firmware file. Firmware is often a compressed file containing an OS and a file system, although the format may vary across vendors.

How firmware works

Firmware is usually stored in a flash device, essentially a storage device with very limited capacity. Depending on the type of device, this storage may or may not be writable. Writable flash is more common nowadays because it allows the firmware to be updated.

When the microcontroller boots up, the CPU loads data from flash and executes it according to the CPU specification. For example, the specification may state that the flash data is segmented into several regions and that execution starts at a specific region.
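To make the idea of regions concrete, here is a minimal sketch that slices a raw flash dump into named regions. The region names, offsets, sizes, and the choice of entry region are all hypothetical; a real layout comes from the CPU or board documentation.

```python
from typing import Dict

# Hypothetical flash layout: (name, offset, size). Illustrative values only.
REGIONS = [
    ("bootloader", 0x00000, 0x10000),
    ("kernel", 0x10000, 0x30000),
    ("rootfs", 0x40000, 0x40000),
]
ENTRY_REGION = "bootloader"  # assumed region where execution starts


def split_regions(flash: bytes) -> Dict[str, bytes]:
    """Slice a raw flash dump into the named regions of the layout."""
    return {name: flash[off:off + size] for name, off, size in REGIONS}


flash = bytes(0x80000)  # stand-in for a real flash dump
parts = split_regions(flash)
print(ENTRY_REGION, len(parts[ENTRY_REGION]))
```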

Usually the microcontroller is also equipped with an MMU (memory management unit). When the CPU accesses memory, either by a store or a load, the access goes through the MMU, which decides which memory device, and where in that device, the access should use. This allows for virtual memory.
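The routing of an address to a device can be sketched as a lookup in a memory map. This is only an illustration of fixed address decoding, not of page-table based virtual memory, and the address ranges and device names below are assumptions.

```python
from typing import Tuple

# Hypothetical memory map: (start address, size, device). Illustrative only.
MEMORY_MAP = [
    (0x0000_0000, 0x0008_0000, "flash"),
    (0x2000_0000, 0x0002_0000, "sram"),
]


def decode(address: int) -> Tuple[str, int]:
    """Return the device backing an address and the offset inside it."""
    for start, size, device in MEMORY_MAP:
        if start <= address < start + size:
            return device, address - start
    raise ValueError(f"unmapped address {address:#x}")


print(decode(0x2000_0010))  # ('sram', 16)
```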

Firmware contents

Typical firmware contains an operating system and a (compressed) file system. Some firmware has no OS, especially firmware for very small devices that perform a single task and do not need a full OS. The file system is often compressed and provides the Linux OS with its binary files. If the system does not use Linux, a real-time OS (RTOS) might be used instead. Several RTOSes exist, the most notable being FreeRTOS.

If the firmware follows this typical layout, recognizing its file system and extracting (decompressing) it yields the binary files inside. However, not all firmware is designed like that. Some common file systems are ...
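Recognizing a format usually starts with its magic bytes. The sketch below checks the start of a buffer against a few well-known magics (ZIP local file header, little-endian SquashFS superblock, flattened device tree header). The table and the `identify` helper are illustrative and are not the detection code of this project.

```python
from typing import Optional

# Well-known magic bytes found at the start of each format.
MAGICS = {
    b"PK\x03\x04": "zip",
    b"hsqs": "squashfs (little-endian)",
    b"\xd0\x0d\xfe\xed": "flattened device tree",
}


def identify(data: bytes) -> Optional[str]:
    """Return the format whose magic matches the start of data, if any."""
    for magic, name in MAGICS.items():
        if data.startswith(magic):
            return name
    return None


print(identify(b"hsqs" + bytes(92)))
print(identify(b"not a known format"))
```

A match on the magic alone only suggests a format; the rest of the header still needs to be checked before extraction is attempted.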

Some vendors build (package) their firmware differently, and some use a different technique to update the firmware. In that case a full firmware file is not used; instead, the update may be in some unusual format that the currently running system (bootloader?) can detect, extract, and apply.

Firmware Analysis

The most common way to analyze firmware is with Binwalk. The fork of Binwalk to use is OSPG, which is still being maintained. Binwalk searches for magic signatures. These signatures can be static byte sequences or logical byte sequences. Each detection reports the offset and length of the file found. Binwalk also supports extracting what it detects. However, Binwalk's detections are sometimes wrong, because it only reports which signatures matched, without checking whether the matched data is valid.
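The difference between matching a signature and validating it can be sketched as follows. This is not Binwalk's implementation; it is a minimal example that scans for the flattened device tree magic and then sanity-checks the header fields, assuming the standard 40-byte big-endian header layout. The helper names and the sample data are made up for illustration.

```python
import struct
from typing import Iterator, List, Tuple

FDT_MAGIC = b"\xd0\x0d\xfe\xed"
FDT_HEADER = struct.Struct(">10I")  # ten big-endian 32-bit header fields


def scan(data: bytes, magic: bytes) -> Iterator[int]:
    """Yield every offset at which magic occurs in data."""
    pos = data.find(magic)
    while pos != -1:
        yield pos
        pos = data.find(magic, pos + 1)


def plausible_fdt(data: bytes, offset: int) -> bool:
    """Sanity-check the header at offset instead of trusting the magic."""
    if offset + FDT_HEADER.size > len(data):
        return False
    (_magic, totalsize, off_struct, off_strings, _off_rsvmap, version,
     last_comp, _cpuid, size_strings, size_struct) = FDT_HEADER.unpack_from(data, offset)
    return (
        FDT_HEADER.size <= totalsize <= len(data) - offset
        and off_struct + size_struct <= totalsize
        and off_strings + size_strings <= totalsize
        and last_comp <= version
    )


def find_fdt(data: bytes) -> List[Tuple[int, bool]]:
    """Return (offset, passed_validation) for every magic match."""
    return [(off, plausible_fdt(data, off)) for off in scan(data, FDT_MAGIC)]


# Made-up sample: one bare magic inside junk, then one consistent header.
valid = struct.pack(">10I", 0xD00DFEED, 48, 40, 44, 40, 17, 16, 0, 4, 4) + bytes(8)
sample = b"\xff" * 16 + FDT_MAGIC + b"\xff" * 36 + valid
print(find_fdt(sample))  # [(16, False), (56, True)]
```

A signature-only scanner would report both offsets; the header check rejects the first one because its size fields cannot describe data that fits in the file.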