checkra1n

12:00 PM May 12, 2020

Over the last ten years, the world’s dependence on mobile phones has grown dramatically, and with it the demand for devices and services that meet standards of privacy, security, and ethics. Delivering responsible innovation under those demands has proven to be a daunting task. Organizations and individuals all over the world have developed the technical expertise to target iPhones and other mobile phones, for both defensive and malicious purposes. The resulting long-running battle has put the strengths and weaknesses of platforms like iOS and Android to the test.

During this battle, Apple has taken the lead by building a large consumer base through thorough, well-architected products and policies that preserve the integrity of its systems, protecting users and their data while providing a seamless, ergonomic experience. Ergonomics is one of the most important facets of technological progress, because one of the main purposes of technology is to make tasks easier. There is, however, one more requirement for keeping chaos at bay as the industry continues to grow.

Security

The main strength that Apple has leveraged in this battle is security, delivered from the services on your screen all the way down to the boot chain, where the critical code responsible for booting the device establishes a secure chain of trust from one level of abstraction to the next. If the boot chain is compromised, every service that executes after it is vulnerable to tampering and further compromise. Note also that a compromise of the first stage cannot be patched, because that code is baked into the SoC during manufacturing and is immutable.

Throughout their history, Apple’s platforms have been resilient to attacks at both the boot chain and operating system levels. They are not invulnerable, but to give you an idea: the last known public boot chain exploit of its era was released in 2010, the year the iPhone 4 came out, and it took 9 years for another public one to be found. A much larger number of kernel exploits have been used to compromise phones in what are known as jailbreaks, but these eventually get patched in new software releases and thus become useless to organizations such as the FBI, the CIA, police forces, and nation-state actors like China and Russia that desire remote or physical access to a device.

On September 27, 2019, checkm8, a new exploit used to compromise the SecureROM, was released. It can be used to jailbreak an affected device for the lifetime of that device. To the infosec community this was extraordinary, not only because it took 9 years for such a thing to be discovered, but also because it raised the potential for pioneering work that would revolutionize security research on this platform. After checkm8’s initial release, some of the greatest iOS hackers in the world, such as qwertyoruiop, iH8Sn0w, axi0mX, Jonathan Seals, and xerub, decided to extend checkm8 to support all relevant devices and to develop checkra1n, a toolchain powerful enough to jailbreak any iOS version at will. To better understand this exploit, we first give an overview of the iOS firmware.

The Boot Chain

The SecureROM is the first code that runs on the Application Processor (AP) during a cold boot, and it runs a stripped-down version of a bootloader called iBoot. The SecureROM’s code is patterned in the silicon as mask ROM, which means it is programmed into the SoC during manufacturing. Because it is the first code that executes on the AP, it is also expected to be the most trusted. The main goal of the SecureROM is to securely load a first stage bootloader from non-volatile memory and boot it. The SecureROM also contains an emergency recovery mechanism called Device Firmware Upgrade (DFU) mode, which allows you to upload an image over USB. Because iOS pioneered the concept of the SecureBoot chain, these mechanisms have been the enemy of jailbreakers and hackers since day 1. Below is a diagram showing all the steps of the boot chain.

Figure 1. Flow diagram of the iOS boot chain.

The SecureROM starts when the device experiences a cold boot. It initializes the hardware and if the boot is normal, the SecureROM proceeds to LLB (Low Level Bootloader), which locates iBoot. iBoot proceeds to boot the device normally by loading the kernelcache into memory and the rest of iOS. If the boot is in DFU mode, then the device requests iBSS from the host and then loads iBoot Epoch Change (iBEC) to proceed towards a recovery mode or OTA update.

The SecureROM, LLB, and iBoot share a lot of the same code. According to newosxbook, all of these components contain code for the following:

Rebase loop: which checks if the code has been loaded at a predefined virtual address (e.g. 0x100000000 for the t8020 ROM, or 0x19c030000 for iBoot-5540). If not, the code is moved by copying using LDP/STP instructions and the X3, X4 registers

Platform bootstrapping: CPU initialization, Setting up the Vector Base Address Register, the MMU and other ARM MSRs.

Task management, allowing multithreading. This is primarily used by the second stage iBoot, but even the SecureROM has support for spinning a USB task, when in DFU mode.

IMG3/IMG4 code: 32-bit iDevices (excluding watches) use the proprietary IMG3 format, and 64-bit devices use IMG4 (a form of DER) to encode components, both binaries and graphic images.

Certificate handling: including both an X.509v3 hard-coded certificate, and the DER-code to parse and verify images signed with its public key.

All of these services are expected, as each stage in the boot process is responsible for managing the Application Processor and its features, such as paging/address spaces, interrupts, context switching, and model-specific registers. Each stage is also responsible for cryptographically verifying the subsequent stage, and therefore needs to be able to handle IMG3/IMG4 images, cryptographic certificates, and so on. Below is an image highlighting the similarities and differences between each stage’s components and what they are expected to manage.

Figure 2. Components of the iOS boot chain.

Some of these concepts may be new or unfamiliar:

The GID keys are baked into the processor and are responsible for the decryption of the boot phase images as they are loaded into memory.

PAC (Pointer Authentication Codes) is a feature introduced in ARMv8.3 that cryptographically signs pointers, storing the signature in the otherwise unused bits of a pointer (namely the upper 16 bits when 48 bits are used for translation) so that tampering can be detected.

KPP (Kernel Patch Protection) is a feature launched in iOS 10 that ensures the integrity of the kernel segments by performing the checks inside the Secure World in EL3. When I discuss checkra1n, this will be explained more.

AMCC enforces KPP-style integrity checks on kernel segments in hardware, using dedicated controllers on the SoC.
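
To make the PAC idea concrete, here is a minimal C sketch that packs a 16-bit code into the upper bits of a 48-bit pointer and checks it before use. Everything in it is an assumption made for illustration: the names toy_pac, pac_sign, and pac_auth are hypothetical, the mixing function is a toy and not the cipher real hardware uses, and real cores keep their keys in system registers and poison a pointer that fails authentication rather than returning 0.

```c
#include <stdint.h>

#define VA_BITS 48
#define VA_MASK ((UINT64_C(1) << VA_BITS) - 1)

/* Toy keyed mixing function standing in for the hardware cipher (assumption). */
static uint16_t toy_pac(uint64_t addr, uint64_t context, uint64_t key)
{
    uint64_t x = addr ^ key ^ (context * UINT64_C(0x9E3779B97F4A7C15));
    x ^= x >> 33;
    x *= UINT64_C(0xFF51AFD7ED558CCD);
    x ^= x >> 29;
    return (uint16_t)(x >> 48);
}

/* Place the code in bits 63..48, which a 48-bit translation leaves unused. */
uint64_t pac_sign(uint64_t ptr, uint64_t context, uint64_t key)
{
    uint64_t addr = ptr & VA_MASK;
    return addr | ((uint64_t)toy_pac(addr, context, key) << VA_BITS);
}

/* Recompute the code; return the bare address if it matches, 0 otherwise. */
uint64_t pac_auth(uint64_t signed_ptr, uint64_t context, uint64_t key)
{
    uint64_t addr = signed_ptr & VA_MASK;
    uint16_t code = (uint16_t)(signed_ptr >> VA_BITS);
    return code == toy_pac(addr, context, key) ? addr : 0;
}
```

Flipping any bit of the stored code makes pac_auth reject the pointer, which is the property an attacker who overwrites a signed pointer has to defeat.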

SecureROM

In the boot chain, the SecureROM and its subsequent stages form what is called “The Chain of Trust”. The trust begins with an original certificate whose bytes are burnt into the ROM. A root certificate would normally raise questions of integrity and trust, but because the ROM is read-only, we know it can be trusted. This is why the root certificate can be relied upon to determine the integrity of all subsequent stages, such as LLB and iBoot.

As mentioned before, the first code to execute on the device is the SecureROM. Its codebase is a subset of the iBoot source code, but because it resides in read-only memory it cannot be updated, so the iBoot version in the SecureROM is the version the device was manufactured with (iBoot-2696.0.0.1.33 for the A10, or iBoot-4479.0.0.100.4 for the A13).

Beyond identification, access to the SecureROM is limited, because it is disabled during the execution of the later stages. Nonetheless, the first public SecureROM dump was demonstrated by Ramtin Amin, who devised a method to perform a MITM attack against the system’s PCIe bus. This extremely significant achievement was overshadowed once the ROM source code leaked along with the rest of the iBoot sources.

The codebase for the SecureROM is small: dumps are only about 150,000 bytes. According to newosxbook, these are the known roles it plays in startup:

Platform startup: The code starts with a call to a board specific platform startup, which sets board specific registers.

Relocation Loop: A check to see if it is already based at its desired address in memory. If not, it relocates itself to that address using the relocation loop described earlier.

continue start: which sets up the address of the main as its return (i.e. in LR), then proceeds to low level initializations, which include setting the VBAR_EL1, exception, interrupt and bootstrap stacks, page tables, and a heap guard page, before returning into the main, which proceeds according to the following stages:

CPU initialization: Initializes the MMU, and sets the value of the System Control Register, SCTLR. The bits set are M (#0, enabling MMU translation), SA (#3, SP alignment check), C and I (#2 and #12, for data and instruction cache), and WXN (#19, for Write-eXecute-Never).

Initialization of clocks, the SRAM bus, and GPIO pins. On A12, the SecureROM sets the ARMv8.3 Pointer Authentication Code (PAC) random seeds.

Miscellaneous initializations: Consisting of clock, internal memory and GPIO pins.

System initialization: Consisting of setting up the heap and task subsystem.

Platform early initialization: An i-Device specific initialization routine, branching to other routines for power management, board and chip specific initialization, UART, etc.

Checking the force DFU pin: to determine if Device Firmware Update is forced.

Platform late initialization: Another i-Device specific routine, branching to other board component initialization routines.
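
The SCTLR bits named in the CPU initialization step can be written out as masks. This is a small sketch using only the bit positions listed above; the macro and function names are hypothetical, and the value covers just those five bits, not the reserved bits a real SCTLR write would also carry.

```c
#include <stdint.h>

/* Bit positions as listed in the CPU initialization step. */
#define SCTLR_M   (UINT64_C(1) << 0)   /* MMU translation enable   */
#define SCTLR_C   (UINT64_C(1) << 2)   /* data cache enable        */
#define SCTLR_SA  (UINT64_C(1) << 3)   /* SP alignment check       */
#define SCTLR_I   (UINT64_C(1) << 12)  /* instruction cache enable */
#define SCTLR_WXN (UINT64_C(1) << 19)  /* writable implies never executable */

/* Combine the named bits into one mask (hypothetical helper). */
uint64_t securerom_sctlr_bits(void)
{
    return SCTLR_M | SCTLR_C | SCTLR_SA | SCTLR_I | SCTLR_WXN;
}

/* Test whether a given SCTLR value has WXN set. */
int sctlr_wxn_enabled(uint64_t sctlr)
{
    return (sctlr & SCTLR_WXN) != 0;
}
```

The combined mask works out to 0x8100D. WXN is the bit of most interest to an attacker, since it prevents any writable page from being executed.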

Once loaded in memory, booting an image is performed by preparing and jumping, using a function which takes the boot image type, load address, and argument pointer. The function forces dormancy on the hardware and CPU, disables interrupts, and transfers control by calling the load address as a function. Because we are starting up the device, we will never return from this if everything goes smoothly.

iBoot

iBoot, also known as the second stage bootloader (iBootStage2), is the main stage in the boot chain. It starts with a relocation loop that moves the image to a new virtual address. The address changes between releases: in iOS 12 it is 0x1800b0000, and in iOS 13 it is 0x19c030000. The relocation loop, shown below, copies memory from the current load address to the desired address.

Figure 3. The iBoot relocation loop.
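
In C terms, the relocation loop amounts to roughly the following. This is a sketch, not the actual iBoot code: the function name relocate_image is hypothetical, the two regions are assumed not to overlap, and both addresses are assumed to be 8-byte aligned with a size that is a multiple of 16, mirroring a loop that moves one register pair per iteration.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy the image to its expected base, 16 bytes (one register pair) at a time. */
void *relocate_image(void *desired_base, const void *current_base, size_t image_size)
{
    const uint64_t *src = current_base;
    uint64_t *dst = desired_base;

    if (desired_base == current_base)
        return desired_base;        /* already running at the expected address */
    if (image_size % 16 != 0)
        return NULL;                /* sketch only handles whole pairs */

    for (size_t i = 0; i < image_size / 8; i += 2) {
        uint64_t lo = src[i];       /* like LDP: load a pair of registers */
        uint64_t hi = src[i + 1];
        dst[i] = lo;                /* like STP: store the pair */
        dst[i + 1] = hi;
    }
    return desired_base;
}
```

After the copy, the real code branches into the relocated image; the sketch simply returns the new base address.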

After the relocation, startup continues with the initialization of registers such as VBAR_EL1, the Vector Base Address Register, which points to the table of exception handlers. The LR (Link Register) is set to the generic main function, and execution returns into it. This function performs initialization similar to that found in the SecureROM, but it also sets up the main task using a thread-starting function. The main task is one of many tasks; the rest are shown in the image below.

Figure 4. iBoot tasks.

The task structure is bracketed by two magic values (‘task’ and ‘task2’) at its ends, and contains the task name, function pointer, and argument, along with a pointer to a stack of a given length, pre-initialized with the magic value “stak”. Tasks are linked through doubly linked list entries to the global task list and to the run queue, similar to the task list found in an operating system.
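
A rough C picture of such a task structure follows. The field names, field order, and sizes are assumptions made for illustration and are not taken from iBoot. In particular, the second magic is assumed here to be the four-byte value 'tsk2' so that both magics fit in 32 bits.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TASK_MAGIC_HEAD UINT32_C(0x7461736b) /* 'task' */
#define TASK_MAGIC_TAIL UINT32_C(0x74736b32) /* 'tsk2' (assumed 4-byte form) */
#define STACK_MAGIC     UINT32_C(0x7374616b) /* 'stak' */

struct list_node { struct list_node *prev, *next; };

struct task {
    uint32_t magic_head;            /* magic at the start of the structure */
    struct list_node task_list;     /* entry in the global task list */
    struct list_node run_queue;     /* entry in the run queue */
    char name[16];
    void (*routine)(void *);
    void *arg;
    uint32_t *stack;
    size_t stack_words;
    uint32_t magic_tail;            /* magic at the end of the structure */
};

void task_init(struct task *t, const char *name, void (*routine)(void *),
               void *arg, uint32_t *stack, size_t stack_words)
{
    memset(t, 0, sizeof *t);
    t->magic_head = TASK_MAGIC_HEAD;
    t->magic_tail = TASK_MAGIC_TAIL;
    strncpy(t->name, name, sizeof t->name - 1);
    t->routine = routine;
    t->arg = arg;
    t->stack = stack;
    t->stack_words = stack_words;
    for (size_t i = 0; i < stack_words; i++)
        stack[i] = STACK_MAGIC;     /* pre-fill the stack with 'stak' */
}

/* Both magics must survive for the structure to be considered intact. */
int task_is_intact(const struct task *t)
{
    return t->magic_head == TASK_MAGIC_HEAD && t->magic_tail == TASK_MAGIC_TAIL;
}

/* Count stack words, from the lowest address up, that still hold the magic. */
size_t stack_untouched_words(const struct task *t)
{
    size_t n = 0;
    while (n < t->stack_words && t->stack[n] == STACK_MAGIC)
        n++;
    return n;
}
```

The magics give a scheduler a cheap corruption check, and the pre-filled stack makes it possible to see how deep a task's stack has grown. They also mean that a forged task structure has to reproduce these values to pass as genuine.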

iBoot parses “Environment” variables, which are found in the NVRAM. Some of them are found below, however there are many more that aren’t documented or used by RELEASE builds.

boot-args: set the boot arguments for boot. These will be passed to iOS and be parsed by the Platform Expert

diags: Boot a diagnostics image. The diags image is not present on release devices, but can be found on Switchboard devices, complete with an EFI runtime environment and a rich command line full of diagnostics functions and utilities.

fsboot: The default for release builds, this finds and loads the iOS kernelcache, subject to APTicket verification.

upgrade: Start an upgrade of the system. Used during OTA installation.

recover: Fall to recovery mode, possibly using a recovery partition (unused?).

backlight-level: integer specifying display brightness

debug: debug flags

Secure Enclave

Finally, iBoot is also responsible for initializing the Secure Enclave Processor firmware. The SEP is responsible for a large number of security-critical tasks, both in the boot chain and in the operating system. The ARM TrustZone security model uses the notion of a Secure World and a Normal World, both governed by a Secure Monitor running in EL3. In theory the Secure World represents the Secure Enclave Processor’s operation; however, the SEP does not conform to the ARM specification in its entirety.

The purpose of the Secure Enclave is to handle keys and other information, such as biometrics, that is too sensitive to be handled by the Application Processor (AP). It is isolated by a hardware filter so the AP cannot access it. It shares RAM with the AP, but its portion of the RAM (known as TZ0) is encrypted. The Secure Enclave itself is a flashable 4MB AKF processor core called the Secure Enclave Processor (SEP). The technology used is similar to ARM TrustZone but contains proprietary code for Apple KF cores in general and the SEP specifically. It is also responsible for generating the UID key on A9 or newer chips that protects user data at rest. Here is a brief overview in diagram form showing the basic operation of the SEP.

Figure 5. SEP operation.

This report does not cover the SEP in much detail, because knowledge of it is not required to launch attacks on the SecureROM and DFU. However, to jailbreak iOS and tamper further with the boot chain, one must explore the depths of the SEP. The SEP is targeted mainly by organizations that want unrestricted access to the device, as the SEP is the sole gatekeeper for critical resources and data.

Threat Model of iBoot

In security, regardless of how high or low level the abstractions you are working with, a security engineer must be diligent in constructing a reliable threat model in order to know where attacks may come from. This is extremely useful for enumerating, identifying, and mitigating avenues of compromise. There are many attack surfaces in iBoot, but few are reliable, and the long time it took to produce reliable, publicly available exploits for the boot chain is attributable to this short list. Here are the ones that come to mind.

USB: All boot components have some form of USB support, with functionality increasing from the bare minimum of DFU (in the SecureROM) to full iTunes protocol support (in iBoot). The A4 “limera1n” exploit was, for almost a decade, the last effective example of a SecureROM vulnerability, and was reliably exploited to provide tethered, unpatchable jailbreaks up until the iPhone 4.

NAND: Each component of the boot chain has to locate the next stage, which involves reading either the partitions through NVMe or the filesystem on iBoot. Targeted malformations could potentially cause memory corruption at one of the boot stages, yielding code execution. Another deliberate malformation could be in the digital signature of each component (specific X.509/DER fields). A mitigating factor is that such attacks require initial low level access to the NAND in order to mount.

Digital Signatures: aside from signature malformation as a vector for memory corruption, there is also the potential for digital signature forgery, which would allow loading a compromised stage. This is highly improbable (an adversary capable of reliably breaking RSA or performing a second pre-image for SHA-256 likely has even more lucrative targets than iPhones), but cannot be entirely discounted. Another risk is that Apple’s root certificate private keys could be stolen or otherwise selectively leak.

All of the potential attack surfaces are resident in all of the boot stages. This list is similar to the list we constructed earlier about the common components of each boot stage. This provides us a good opportunity to investigate further.

DFU Protocol

DFU, or Device Firmware Update, is part of the SecureROM that allows a device to be restored from any state. If a situation arises where the boot process cannot continue past the SecureROM, then the bootchain will not be trusted. In a situation like this, where the device is unable to boot, the device falls back to DFU mode as a failsafe for recovery.

This process lets the user get a device back into working condition as a last resort, when standard recovery methods have failed, by restoring it to factory settings. While normal methods such as recovery mode automatically install the latest iOS and firmware, DFU mode lets the user choose which firmware to restore. Because iTunes does not automatically load the most current firmware, DFU skips the iBoot bootloader process. This allows not only standard recovery but also installing previous working versions of iOS, jailbreaking the device, or unlocking the SIM. DFU mode is requested through a series of button presses, which are read as GPIO (General Purpose I/O) inputs, and requires a USB connection to a host running iTunes. With the device powered off, hold the sleep and home buttons for 10 seconds, then release the sleep button while continuing to hold the home button until iTunes reports that it has detected a device in recovery mode.

This mode is a well documented USB standard, implemented over simple USB control transfers. In this standard, the SecureROM starts a USB task, which then emits a serial string identifier, unique to the device, indicating the DFU mode. The components of this serial number are shown in the figure below.

Figure 6. DFU emitted serial number string.

Once this serial number is emitted, the SecureROM awaits a DFU download request, during which it expects to receive an iBSS image. The image is verified against its signature and APTicket. Once verified, the image is booted, control is transferred to it, and DFU exits. This is not the only way DFU can end, however; it can also end through a DFU abort.

USB Control Transfer

In order to better understand how checkm8 exploits DFU mode, it is first necessary to understand the process of USB control transfer. Control transfers are typically used for command and status operations. Transfers can have up to three stages.

The Setup Stage is where the request is sent. It consists of three packets. The setup token is sent first and contains the address and endpoint number. The data packet is sent next, always with a PID type of DATA0, and carries the 8-byte setup packet that describes the request. The last packet is a handshake that acknowledges successful receipt or indicates an error.

Setup

The Data Stage consists of one or more IN or OUT transfers. The setup packet indicates the amount of data to be transferred in this stage. If that amount exceeds the maximum packet size, which depends on the speed of the bus, the data is split across multiple packets. If the host is receiving data it issues an IN token; otherwise it issues an OUT token. The next packet is a data packet carrying the payload being received or sent, followed by a handshake packet reporting the success or failure of the phase.

Data

The Status Stage reports the status of the overall request, and it varies with the direction of the transfer.

The OUT token must be sent if the host successfully received IN data.

Status Out

The IN token must be sent if the function successfully sent OUT data.

Status In

Now back to DFU.

This process starts with an 8-byte setup packet containing information on the upcoming transfer, as expected. Here is what the contents of the packet look like in more detail.

Figure 7. USB setup packet.

If wLength is non-zero, the control transfer enters the data phase, which, for DFU, is how data is transferred between host and device. During this phase, data is split into packets of 8 to 64 bytes, depending on the USB speed, and sent sequentially to the device. Upon USB initialization, a temporary buffer is allocated for iBoot’s USB stack, and data packets are copied into it as the device receives them. Once the transfer is complete, a zero-length packet is sent to indicate the end of the data phase, initiating the status phase.
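
The setup packet and the data phase arithmetic can be sketched in C as follows. The field names follow the USB specification, and the request codes (bmRequestType 0x21, DFU_DNLOAD = 1, DFU_ABORT = 6) follow the USB DFU class specification. The helper names are hypothetical, and whether the SecureROM uses exactly these values is an assumption.

```c
#include <stdint.h>

#define DFU_REQTYPE_OUT 0x21  /* host-to-device, class request, interface recipient */
#define DFU_DNLOAD      1
#define DFU_ABORT       6

/* The 8-byte setup packet that opens every control transfer. */
struct usb_setup_packet {
    uint8_t  bmRequestType;
    uint8_t  bRequest;
    uint16_t wValue;
    uint16_t wIndex;
    uint16_t wLength;         /* number of bytes in the data phase */
};

struct usb_setup_packet dfu_request(uint8_t request, uint16_t length)
{
    struct usb_setup_packet p = { DFU_REQTYPE_OUT, request, 0, 0, length };
    return p;
}

/* A data phase follows only when wLength is non-zero. */
int has_data_stage(const struct usb_setup_packet *p)
{
    return p->wLength != 0;
}

/* Number of data packets needed for a given maximum packet size. */
unsigned data_packets_needed(uint16_t length, uint16_t max_packet)
{
    if (max_packet == 0)
        return 0;
    return ((unsigned)length + max_packet - 1u) / max_packet;
}

/* Serialize in wire order; multi-byte fields are little-endian on USB. */
void setup_to_wire(const struct usb_setup_packet *p, uint8_t out[8])
{
    out[0] = p->bmRequestType;
    out[1] = p->bRequest;
    out[2] = (uint8_t)(p->wValue & 0xff);
    out[3] = (uint8_t)(p->wValue >> 8);
    out[4] = (uint8_t)(p->wIndex & 0xff);
    out[5] = (uint8_t)(p->wIndex >> 8);
    out[6] = (uint8_t)(p->wLength & 0xff);
    out[7] = (uint8_t)(p->wLength >> 8);
}
```

For example, a download request announcing 0x800 bytes with 64-byte packets needs 32 data packets, while an abort request has wLength of 0 and therefore no data phase at all.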

SecureROM Exploit

This exploit stems from the fact that DFU mode is implemented over USB control transfers, as described in the sections above. When DFU starts, the iBoot USB stack is initialized, which allocates the temporary data buffer. When a control transfer begins, a global variable is set to point at this buffer, and the USB stack uses it as the destination for incoming data. When DFU exits, the stack is shut down and the buffer is freed. However, the global variable pointing to the buffer is never nulled out. This free without nulling is the basis of the exploit: it allows the pointer to be used after the free.

The key to triggering the bug is to initiate a USB control transfer with a data phase and then interrupt it midway. Because the transfer was initiated through DFU, we can signal a DFU abort to free the temporary buffer. DFU is then re-entered and a data buffer is allocated again, but the stale global pointer still refers to the freed memory, so incoming data is copied through the freed pointer.
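
The stale-pointer pattern can be modelled in a few lines of C. This is a toy model under stated assumptions, not SecureROM code: the heap is a two-slot static arena so that the "freed" memory stays valid to touch, and every name (io_buffer, ep0_data_dest, usb_quiesce, and so on) is hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define SLOT_SIZE 16
#define NUM_SLOTS 2

static unsigned char arena[NUM_SLOTS][SLOT_SIZE]; /* stands in for the heap */
static int slot_used[NUM_SLOTS];

static unsigned char *io_buffer;      /* buffer owned by the USB stack */
static unsigned char *ep0_data_dest;  /* global destination for incoming data */

static unsigned char *toy_alloc(void)
{
    for (int i = 0; i < NUM_SLOTS; i++) {
        if (!slot_used[i]) {
            slot_used[i] = 1;
            memset(arena[i], 0, SLOT_SIZE);
            return arena[i];
        }
    }
    return NULL;
}

static void toy_free(unsigned char *p)
{
    for (int i = 0; i < NUM_SLOTS; i++)
        if (p == arena[i])
            slot_used[i] = 0;
}

void usb_init(void)               { io_buffer = toy_alloc(); }
void begin_control_transfer(void) { ep0_data_dest = io_buffer; }

/* Modelled bug: the buffer is freed but ep0_data_dest is left pointing at it. */
void usb_quiesce(void)
{
    toy_free(io_buffer);
    io_buffer = NULL;
}

unsigned char *allocate_victim(void) { return toy_alloc(); }

/* Late data is copied through the stale pointer into whatever lives there now. */
int data_packet_arrives(const void *data, size_t len)
{
    if (ep0_data_dest == NULL || len > SLOT_SIZE)
        return -1;
    memcpy(ep0_data_dest, data, len);
    return 0;
}
```

Calling usb_init, begin_control_transfer, and usb_quiesce in that order, then allocating a victim object, puts the victim in the slot the stale pointer still references, so the next data packet overwrites it. Clearing ep0_data_dest in usb_quiesce would close the hole in this model.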

For most devices, doing such a DFU abort will not result in anything. However, the A8 and A9 processors are the exception due to a bug in their DFU abort functionality that causes the leak of a task structure. This leaked structure contains the next scheduled tasks in the form of a linked list, which can be manipulated by adding in a fake task structure to achieve direct code execution.

Later processors cannot be manipulated in this way, because their ROM is deterministic enough to re-allocate the buffer in the same place each time. In the SecureROM, the heap allocator is designed to return the smallest hole that fits the requested allocation. To break this determinism, heap shaping is required to provide a perfectly sized hole for our buffer on the next allocation. The heap-shaping technique used in checkm8 and checkra1n is Heap-Feng-Shui.

Heap-Feng-Shui is performed by initializing multiple USB transfers at once, while also exploiting a state machine bug that allows allocations to persist across USB stack destruction. In this manner, by initiating transfers that stall, their associated structures will then persist, taking up space on the heap. By carefully crafting allocations in this manner, we are able to create this hole we need for pointer re-allocation. We can then use the same principle of using after freeing to obtain a controlled indirect branch after DFU exit that can be used for code execution.
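
The "smallest hole that fits" policy is what makes this shaping predictable. Below is a minimal best-fit selection sketch in C; the hole list, sizes, and names are made up for illustration and do not reflect the real SecureROM heap layout.

```c
#include <stddef.h>

/* A free region of the heap: where it starts and how large it is. */
struct hole {
    size_t offset;
    size_t size;
};

/* Return the index of the smallest hole that can hold `request` bytes, or -1. */
int best_fit(const struct hole *holes, size_t count, size_t request)
{
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (holes[i].size < request)
            continue;                      /* too small to satisfy the request */
        if (best < 0 || holes[i].size < holes[(size_t)best].size)
            best = (int)i;                 /* tighter fit than the current best */
    }
    return best;
}
```

With holes of 0x100, 0x840, and 0x800 bytes, a request for 0x800 bytes lands in the exact-fit third hole rather than the larger second one. Leaving a hole of exactly the right size is therefore enough to decide where the next buffer is placed.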

BootKits

With all of these steps in place, the result is arbitrary code execution in the BootROM. The original checkm8 project served as the motivation for a much more complex project named checkra1n, a full toolset that goes beyond exploiting DFU mode: it also tampers with the remaining phases of the boot chain, booting the device normally and compromising later stages to achieve a bootstrapped startup and, eventually, a complete “jailbreak”.

Eventually iBoot Stage 2 will have loaded the kernel into memory, and to perform a jailbreak one must patch the kernel before jumping to its entry point. This is tricky, because resilient patch finding is needed to do it effectively. The patch finders used in open source jailbreaks like unc0ver (also known as undecimus) and Electra are suitable for this task with slight modifications (most likely in shellcode form), although I am unsure whether they were actually used, as checkra1n is closed source. The required patches are the following:

task_for_pid(0)
kpp/amcc/ktrr/ppl: disabled before entry point of kernel
vm_map_protect/vm_fault_enter
AppleMobileFileIntegrity
amfi trustcache:
Sandbox
mount filesystem as rw (apfs and LwVM)

iBoot makes use of a “boot trampoline” that puts the CPU in a state similar to reset: the previous image is wiped, the MMU is disabled, and all registers are set to 0. This includes X18, which is never used normally, so to find the trampoline all one needs to do is find a MOV X18, #0 instruction. Shellcode can do this because the signature of that instruction is unique. A few complications remain: the current bootloader gets cleared (solved by remapping the SecureROM into SRAM), the shellcode needs somewhere to live (solved by using SoftDFU to load regions with arbitrary data), and iBoot clears large portions of memory by bzero()’ing them (solved by hooking the call and checking the region being cleared so the shellcode doesn’t get overwritten). With those handled, we can perform the kernel patches listed above.
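
Searching for that instruction is a simple word scan. The sketch below assumes the instruction is encoded in its MOVZ form, 0xD2800012; if the image instead zeroes X18 by copying from XZR, the encoding would differ. The function name is hypothetical, and checkra1n's actual patch finder is not public.

```c
#include <stddef.h>
#include <stdint.h>

/* MOVZ X18, #0, which disassembles as MOV X18, #0 (assumed encoding). */
#define INSN_MOV_X18_ZERO UINT32_C(0xD2800012)

/* Return the byte offset of the first MOV X18, #0 in `code`, or -1 if absent. */
long find_mov_x18_zero(const uint32_t *code, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (code[i] == INSN_MOV_X18_ZERO)
            return (long)(i * 4);   /* AArch64 instructions are 4 bytes each */
    }
    return -1;
}
```

Because AArch64 instructions are fixed-width and aligned, the scan can step one 32-bit word at a time and needs no disassembler.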

Kernel patching alone is not enough. Once the kernel boots we need to retain code execution in usermode to prepare the jailbroken state. We can embed a ramdisk in our shellcode to hijack EL0 (usermode) and patch the device tree and kernel boot arguments structure in order to map it in and have it used as the root device on top of the existing hierarchy.

Figure 8. Checkra1n!!

With the addition of a few more patches and tweaks to the iOS system that aren’t worth mentioning, we finally have a fully jailbroken device with code execution from the boot chain through EL3 and EL1 (kernel mode) down to EL0 (usermode). But we’re not done here. The checkra1n project is intended to move from being a functional jailbreak to a fully featured preboot environment that enables on-boot kernel extension loading, dual booting, and Linux/Android on the iPhone. This work in progress is called pongoOS.

PongoOS

PongoOS is an experimental preboot environment for Apple boards built on top of checkra1n. Because code execution is established within the boot chain, the environment can be set up as we please in order to load services that extend what a normal boot provides. One can think of pongoOS as a mini operating system stack built on top of the boot chain: it has a notion of tasks, a subset of libc, a driver environment, a shell environment, and so on. Here is a figure representing what a checkra1n’ed device would look like with pongoOS employed. One of the most relevant features of pongoOS is its patchfinder for the XNU kernel, which is both dynamic and reusable across iOS builds.

Figure 9. PongoOS.

Here is a cool screenshot of pongoOS being used concurrently with XNU. It’s difficult to find information on how this works as checkra1n and pongoOS are still in development, but this is just mild evidence to show how powerful these toolsets are from both a security standpoint and reverse engineering standpoint. Typically most if not all of these levels of abstractions are blind black boxes.

Figure 10. pongoOS running concurrently with XNU

Android for the iPhone

Shortly after the release of checkra1n, the company Corellium, which had also recently been in a court case with Apple over its virtualization platform for iOS, released Project Sandcastle, a fully stable Android port for the iPhone. This is the first Android port since the one for the original iPhone. Here is a short summary, paraphrased from the Sandcastle website, of how Corellium was able to develop it.

“The Android port starts with a working and stable Linux kernel, as it is extremely flexible and accommodating for target hardware such as the iPhone’s. The port was made possible by the powerful VM platform Corellium had previously developed. The early kernel port did not involve real phones. The VM had a fast debugger and a diverse toolset for logging and patching. Printing needed to work first, so the original driver was effectively a modified UART interface. The SPI controller was inspired by Samsung’s but incompatible with it, so it had to be rewritten. The I2C block was unique, as it was a PASemi product incapable of interrupt operation, so Corellium had to rewrite it. The interrupt controller was handled by what is typically used on AArch64 devices, which is the ARM GIC standard compliant controller. FIQ support had to be rewritten, as it is unsupported on AArch64 Linux. Through VM debugging, they were able to get to a ramdisk-based command prompt, with devices such as the Real Time Clock and GPIO drivers working. Because this success was achieved on a VM and not on real hardware, they experienced some hiccups when migrating to real hardware. First, the page size on Apple devices is required to be 16KB, so booting with the default configuration failed, as Linux’s default page size is 4KB. Starting the secondary CPU cores caused delays, and the solution was to unlock a boot address register in MMIO memory up until Linux boot. The PCI Express interface stack wasn’t difficult to implement, and shortly after verifying the register writes that perform low-level adjustments they were able to see the NVMe storage chip in lspci output. At this point, the VM was running with a 16KB page size in the kernel, but 4KB assumptions surfaced almost immediately elsewhere in the operating system. Eventually they were able to fix these issues through laborious work and finally load the AOSP launcher we’re used to seeing on Android.
However, Android developers force the 4KB page size into the build system, breaking AArch64 convention, so third-party apps containing binary libraries built for these small pages will not start on the Android port and need to be rebuilt. Corellium claims that rebuilding would not be hard, requiring only one or two additional compile-time options, but it cannot be done with only an APK to work with, so most apps are not rebuildable. To finish off the port, the touch controller and WiFi chip drivers were implemented. The common challenge is that both interfaces need to load firmware. The firmware is loaded directly from the phone’s flash memory, which uses the APFS file system format, for which there is no mainline Linux kernel driver. A group of Linux hackers has been working on one, so all Corellium needed to do was extend that driver with support for compressed files and concurrent mounting of subvolumes, which is a common practice as APFS uses the notion of Containers.”

Here is a figure showing what Android for the iPhone using Project Sandcastle looks like on an iPhone 7.

Figure 11. Android X on iPhone 7

KTRW

Before checkra1n was released, members of the Google Project Zero team, mainly Brandon Azad, worked on a fully featured debugger that uses only a Lightning cable over USB/USB-C to debug the kernel on the iPhone. Many researchers had already been able to acquire development-fused iPhones with debugging features such as SWD and JTAG enabled on the Application Processor, AOP (Always On Processor), and SEP (Secure Enclave Processor).

To achieve this, Brandon Azad had to find a way around the KTRR (Kernel Text Read-Only Region) hardware mitigation. KTRR enforces W^X in the MMU for all memory accesses to the regions it protects. Any invalid access results in a synchronous exception. The MMU is locked down using the Apple-defined registers KTRR_LOWER_EL1, KTRR_UPPER_EL1, and KTRR_LOCK_EL1. KTRR_LOWER_EL1 and KTRR_UPPER_EL1 define the lower and upper bounds of the locked-down region. KTRR_LOCK_EL1 is a lockdown register: writing 1 to it locks the other registers so that no further modifications can be made. The values are lost when a core is reset, which happens when it wakes from sleep.
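
The lockdown behaviour can be captured in a toy software model. This sketch is an illustration only: the struct and function names are hypothetical, the real registers are programmed with MSR instructions, and the model simplifies matters by treating the region as protected only once the lock has been set.

```c
#include <stdint.h>

/* Software stand-in for KTRR_LOWER_EL1, KTRR_UPPER_EL1 and KTRR_LOCK_EL1. */
struct ktrr_regs {
    uint64_t lower;
    uint64_t upper;
    int locked;
};

/* Program the bounds; refused once the lock bit has been written. */
int ktrr_write_bounds(struct ktrr_regs *r, uint64_t lower, uint64_t upper)
{
    if (r->locked)
        return -1;
    r->lower = lower;
    r->upper = upper;
    return 0;
}

/* Writing 1 to the lock register freezes the bounds until the next reset. */
void ktrr_lock(struct ktrr_regs *r)
{
    r->locked = 1;
}

/* A core reset (for example, waking from sleep) loses all three values. */
void core_reset(struct ktrr_regs *r)
{
    r->lower = 0;
    r->upper = 0;
    r->locked = 0;
}

/* Simplification: the region is only protected while the lock is in place. */
int ktrr_write_allowed(const struct ktrr_regs *r, uint64_t paddr)
{
    if (!r->locked)
        return 1;
    return !(paddr >= r->lower && paddr <= r->upper);
}
```

The model makes the weak point visible: between core_reset and the next ktrr_lock, nothing protects the region, so the reset path must reprogram and relock the registers before any attacker-influenced code runs.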

On its own, this is not enough to lock down these regions in physical memory and ensure kernel integrity. DMA attacks from peripherals can write to memory without consulting the MMU, so Apple also had to bring the KTRR protection to the AMCC (Apple Memory Cache Controller), with its own read-only region and lockdown regions in MMIO memory. Here is a graph depicting the protections from the Project Zero site.

Figure 12. KTRR in action

The __LAST.__pinst segment contains privileged instructions such as MSR TTBR1_EL1, which sets the page table of the kernel’s address space, and MSR SCTLR_EL1, which can turn off the MMU and thus disable the KTRR protections. In order to bypass KTRR and load executable code into the kernel, one would have to prevent the lockdown via the registers specified above from occurring. This attack surface is slim, as only small portions of code deal with cores waking from sleep, mainly in a function named LowResetVectorBase.

After a long series of steps, digging through earlier exploitation primitives and searching the ARM documentation, Brandon Azad noticed something in a colleague’s previous work on an iOS kernel debugger for 11.1.2. That debugger used the DBGCRx_EL1 registers, and experiments with them had produced a crash. The crash led him to try to understand how a CPU core gets halted, because it looked like this:

"panicString" : "Attempting to forcibly halt cpu 1\ncpu 1 failed to halt with error -5: halt not supported for this configuration\nDebugger synchronization timed out; waited 10000000 nanoseconds\npanic(cpu 0 caller 0xfffffff00b5c96bc): \"WDT timeout: CPU 1 failed to respond\"@\/BuildRoot\/...

Searching for this string led him to a function named ml_dbgwrap_halt_cpu(), which is found in <osfmk/arm64/dbgwrap.c>.

Here is what the source for ml_dbgwrap_halt_cpu() looks like.

dbgwrap_status_t
ml_dbgwrap_halt_cpu(int cpu_index, uint64_t timeout_ns)
{
    cpu_data_t *cdp = cpu_datap(cpu_index);
    if ((cdp == NULL) || (cdp->coresight_base[CORESIGHT_UTT] == 0))
        return DBGWRAP_ERR_UNSUPPORTED;
...
    volatile dbgwrap_reg_t *dbgWrapReg = (volatile dbgwrap_reg_t *)
            (cdp->coresight_base[CORESIGHT_UTT] + DBGWRAP_REG_OFFSET);

    if (ml_dbgwrap_cpu_is_halted(cpu_index))
        return DBGWRAP_WARN_ALREADY_HALTED;

    /* Clear all other writable bits besides dbgHalt; none of the
     * power-down or reset bits must be set. */
    *dbgWrapReg = DBGWRAP_DBGHALT;
...
    else
        return DBGWRAP_SUCCESS;
}

As stated in the Project Zero blog, this function retrieves a pointer to the per-core cpu_data struct, so there is one coresight_base array for each AP core. dbgWrapReg appears to be a pointer to MMIO, so reading and writing through it directly reads and writes a CoreSight register. The comments in the code show that DBGWRAP contains bits that halt, power down, and reset a core.

Scrolling down in this file, there is also a function named ml_dbgwrap_halt_cpu_with_state():

dbgwrap_status_t
ml_dbgwrap_halt_cpu_with_state(int cpu_index, uint64_t timeout_ns,
        dbgwrap_thread_state_t *state)
{
    cpu_data_t *cdp = cpu_datap(cpu_index);
    if ((cdp == NULL) || (cdp->coresight_base[CORESIGHT_ED] == 0))
        return DBGWRAP_ERR_UNSUPPORTED;

    /* Ensure memory-mapped coresight registers can be written */
    *((volatile uint32_t *)(cdp->coresight_base[CORESIGHT_ED]
                + ARM_DEBUG_OFFSET_DBGLAR)) = ARM_DBG_LOCK_ACCESS_KEY;

    dbgwrap_status_t status = ml_dbgwrap_halt_cpu(cpu_index, timeout_ns);

    /* A core that is not fully powered (e.g. idling in wfi) can still be
     * halted; the dbgwrap register and certain coresight registers such
     * EDPRSR are in the always-on domain. However, EDSCR/EDITR are not in
     * the always-on domain and will generate a parity abort on read.
     * EDPRSR can be safely read in all cases, and the OS lock defaults to
     * being set but we clear it first thing, so use that to detect the
     * offline state. */
    if (*((volatile uint32_t *)(cdp->coresight_base[CORESIGHT_ED]
                    + EDPRSR_REG_OFFSET)) & EDPRSR_OSLK) {
        bzero(state, sizeof(*state));
        return DBGWRAP_WARN_CPU_OFFLINE;
    }

    uint32_t instr;

    for (unsigned int i = 0;
            i < (sizeof(state->x) / sizeof(state->x[0])); ++i) {
        instr = (0xD51U << 20) | (2 << 19) | (3 << 16)
            | (4 << 8) | i; // msr DBGDTR0, x<i>
        ml_dbgwrap_stuff_instr(cdp, instr, timeout_ns, &status);
        state->x[i] = ml_dbgwrap_read_dtr(cdp, timeout_ns, &status);
    }

    instr = (0xD51U << 20) | (2 << 19) | (3 << 16)
        | (4 << 8) | 29; // msr DBGDTR0, fp
    ml_dbgwrap_stuff_instr(cdp, instr, timeout_ns, &status);
    state->fp = ml_dbgwrap_read_dtr(cdp, timeout_ns, &status);

    instr = (0xD51U << 20) | (2 << 19) | (3 << 16)
        | (4 << 8) | 30; // msr DBGDTR0, lr
    ml_dbgwrap_stuff_instr(cdp, instr, timeout_ns, &status);
    state->lr = ml_dbgwrap_read_dtr(cdp, timeout_ns, &status);

    /* Stack pointer (x31) can't be used as a register operand for msr;
     * register 31 is treated as xzr rather than sp when used as the
     * transfer operand there.  Instead, load sp into a GPR we've already
     * saved off and then store that register in the DTR.  I've chosen x18
     * as the temporary GPR since it's reserved by the arm64 ABI and unused
     * by xnu, so overwriting it poses the least risk of causing trouble
     * for external debuggers. */

    instr = (0x91U << 24) | (31 << 5) | 18; // mov x18, sp
    ml_dbgwrap_stuff_instr(cdp, instr, timeout_ns, &status);
    instr = (0xD51U << 20) | (2 << 19) | (3 << 16)
        | (4 << 8) | 18; // msr DBGDTR0, x18
    ml_dbgwrap_stuff_instr(cdp, instr, timeout_ns, &status);
    state->sp = ml_dbgwrap_read_dtr(cdp, timeout_ns, &status);

    /* reading PC (e.g. through adr) is undefined in debug state.  Instead
     * use DLR_EL0, which contains PC at time of entry into debug state.*/

    instr = (0xD53U << 20) | (1 << 19) | (3 << 16) | (4 << 12)
        | (5 << 8) | (1 << 5) | 18; // mrs    x18, DLR_EL0
    ml_dbgwrap_stuff_instr(cdp, instr, timeout_ns, &status);
    instr = (0xD51U << 20) | (2 << 19) | (3 << 16)
        | (4 << 8) | 18; // msr DBGDTR0, x18
    ml_dbgwrap_stuff_instr(cdp, instr, timeout_ns, &status);
    state->pc = ml_dbgwrap_read_dtr(cdp, timeout_ns, &status);

    /* reading CPSR is undefined in debug state.  Instead use DSPSR_EL0,
     * which contains CPSR at time of entry into debug state.*/
    instr = (0xD53U << 20) | (1 << 19) | (3 << 16) | (4 << 12)
        | (5 << 8) | 18; // mrs    x18, DSPSR_EL0
    ml_dbgwrap_stuff_instr(cdp, instr, timeout_ns, &status);
    instr = (0xD51U << 20) | (2 << 19) | (3 << 16)
        | (4 << 8) | 18; // msr DBGDTR0, x18
    ml_dbgwrap_stuff_instr(cdp, instr, timeout_ns, &status);
    state->cpsr = (uint32_t)ml_dbgwrap_read_dtr(cdp, timeout_ns, &status);

    return status;
}

This function halts the core and then retrieves its register state. It does so by forcing the halted CPU to execute instructions that write the general purpose registers to a special debug register called DBGDTR0. The instructions are supplied on the fly rather than being fetched from __LAST.__pinst or any other KTRR-protected region. This means that we might be able to make a core execute a sensitive instruction even if that instruction does not exist in KTRR-protected memory.
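The instruction words that get stuffed into the halted core are ordinary 32-bit AArch64 encodings assembled with shifts and ORs. As a small sketch, the helpers below reuse the exact expressions from the kernel source above; the function names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* msr DBGDTR0, x<rt>: copy general purpose register x<rt> to the debug
 * transfer register. Same expression as in the kernel excerpt. */
uint32_t encode_msr_dbgdtr0(unsigned rt)
{
    assert(rt < 31);  /* register number 31 would encode xzr, not sp */
    return (0xD51U << 20) | (2U << 19) | (3U << 16) | (4U << 8) | rt;
}

/* mov x<rd>, sp: needed because sp cannot be the operand of msr. */
uint32_t encode_mov_from_sp(unsigned rd)
{
    assert(rd < 31);
    return (0x91U << 24) | (31U << 5) | rd;
}

/* mrs x<rt>, DLR_EL0: read the PC saved on entry into debug state. */
uint32_t encode_mrs_dlr_el0(unsigned rt)
{
    assert(rt < 31);
    return (0xD53U << 20) | (1U << 19) | (3U << 16) | (4U << 12)
        | (5U << 8) | (1U << 5) | rt;
}

/* mrs x<rt>, DSPSR_EL0: read the CPSR saved on entry into debug state. */
uint32_t encode_mrs_dspsr_el0(unsigned rt)
{
    assert(rt < 31);
    return (0xD53U << 20) | (1U << 19) | (3U << 16) | (4U << 12)
        | (5U << 8) | rt;
}
```

For example, encode_msr_dbgdtr0(18) evaluates to 0xD5130412 and encode_mov_from_sp(18) to 0x910003F2. Nothing in these words depends on where the instruction lives in memory, which is why the stuffed instructions never have to come from a KTRR-protected page.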

The ARMv8 architecture includes a feature called External Debug, along with a debug architecture embedded within it called CoreSight.

Figure 13. External Debug and CoreSight

CoreSight allows for debugging via an external interface like JTAG or SWD. It has an extremely complex interface.

The External Debug interface provides events that cause the core to enter a debug state. Some of the debug events include:

Exception catch debug event: An external debugger can program a core to generate an exception catch debug event every time the core enters a particular exception level.
Reset catch debug event: This event can be generated every time the core resets and is about to execute the reset vector.
Breakpoint debug event: This event can be generated when a core hits a hardware breakpoint.
Watchpoint debug event: This event can be generated when a core hits a hardware watchpoint.
Halting step debug event: This event can be generated after a core completes executing an instruction as part of a single-step operation.

The CoreSight registers are configured in a function named configure_coresight_registers() found in <osfmk/arm64/cpu.c>, and the source shows that the registers are mapped at offsets from a fixed base address taken from the iOS Device Tree. The Device Tree is passed by iBoot and can be found by traversing the IORegistry.

After a series of more complicated steps that are described further in the blog post, he was finally able to bypass KTRR with the following steps:

Using DBGWRAP, configure the core to enter debug state the next time it resets.
Wait for the core to halt. Once it has halted, it is about to execute the first instruction of LowResetVectorBase. The KTRR registers will not yet be initialized.
Using the external debug registers and DBGWRAP, repeatedly single-step execution of the core. On each step, read the PC value on the debugged core to determine the instruction about to be executed.
Once the core is about to execute the KTRR initialization instructions, use the external debug registers to overwrite values in the debugged core’s general purpose registers such that it skips KTRR initialization.
Use DBGWRAP to restart the core, exiting out of debug state. KTRR will be disabled on the MMU, meaning that it is now possible to execute memory outside the KTRR readonly region.
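The control flow of these steps can be sketched as a toy simulation in C. Everything here is hypothetical: the reset path is an array of made-up opcodes, the core is a plain struct, and skipping is modeled by advancing the program counter, whereas the real bypass steers the core by overwriting its general purpose registers through the external debug registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the instructions a core runs after reset. */
typedef enum { OP_OTHER, OP_KTRR_INIT, OP_DONE } toy_op_t;

typedef struct {
    const toy_op_t *code;  /* simulated reset path, ends with OP_DONE */
    size_t pc;             /* index of the next instruction to execute */
    bool ktrr_locked;      /* set once the KTRR init instruction runs */
} toy_core_t;

/* Execute exactly one instruction, like one halting-step debug event. */
void toy_single_step(toy_core_t *c)
{
    if (c->code[c->pc] == OP_KTRR_INIT)
        c->ktrr_locked = true;
    if (c->code[c->pc] != OP_DONE)
        c->pc++;
}

/* A normal wake-up: run the whole reset path with no debugger attached. */
void toy_run(toy_core_t *c)
{
    assert(c->code != NULL);
    while (c->code[c->pc] != OP_DONE)
        toy_single_step(c);
}

/* Debugger loop: inspect the next instruction before every step and
 * steer the core past KTRR initialization instead of executing it. */
void toy_bypass(toy_core_t *c)
{
    assert(c->code != NULL);
    while (c->code[c->pc] != OP_DONE) {
        if (c->code[c->pc] == OP_KTRR_INIT)
            c->pc++;              /* skip: the lock is never applied */
        else
            toy_single_step(c);
    }
}
```

Running toy_run() over a path that contains OP_KTRR_INIT leaves ktrr_locked set, while toy_bypass() over the same path reaches the end with it still clear, which is the state the final step relies on.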

With KTRR disabled on the MMU, it is finally possible to execute instructions like MSR TTBR1_EL1 on the AP without the KTRR protections firing, so we can move to a new custom page table and remap the kernel into writable and executable pages outside of the original region ranges.

With access to the full range of physical memory, we can now debug the device using the USB to TCP proxy he wrote, which relays the debug data over TCP on port 39399. He provides a GDB stub that implements the interface needed to access the device tree, JIT memory, page tables, and register state, and to set breakpoints, step instructions, read/write memory, etc.
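For readers who have not seen what a GDB stub exchanges with the debugger, here is a minimal sketch of how the standard GDB Remote Serial Protocol frames a packet: a `$`, the payload, a `#`, and a two-digit hex checksum that is the sum of the payload bytes modulo 256. This shows the generic protocol only, not KTRW's own code. The function name is hypothetical, and escaping of special characters in the payload is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Frame `payload` as $<payload>#<checksum> into `out`.
 * Returns the packet length, or -1 if `out` is too small.
 * Assumption: the payload has no characters that need escaping. */
int rsp_frame(const char *payload, char *out, size_t out_size)
{
    assert(payload != NULL && out != NULL);

    size_t len = strlen(payload);
    if (out_size < len + 5)  /* '$' + payload + '#' + 2 digits + NUL */
        return -1;

    /* The checksum is the 8-bit sum of all payload bytes. */
    unsigned char sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (unsigned char)(sum + (unsigned char)payload[i]);

    return snprintf(out, out_size, "$%s#%02x", payload, sum);
}
```

Framing the register-read request `g` this way yields `$g#67`, since the ASCII code of `g` is 0x67.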

Debugging with LLDB

As an example, here is bazad debugging an iPhone 8 on iOS 12.1.2 using lldb.

$ lldb kernelcache.iPhone10,1.16C101
(lldb) target create "kernelcache.iPhone10,1.16C101"
Current executable set to 'kernelcache.iPhone10,1.16C101' (arm64).
(lldb) settings set plugin.dynamic-loader.darwin-kernel.load-kexts false
(lldb) gdb-remote 39399
Kernel UUID: 94463A80-7B38-3176-8872-0B8E344C7138
Load Address: 0xfffffff027e04000
Kernel slid 0x20e00000 in memory.
Loaded kernel file kernelcache.iPhone10,1.16C101
Process 2 stopped
Target 0: (kernelcache.iPhone10,1.16C101) stopped.
(lldb)

You can use thread list to list the code running on each physical CPU core. (Note that one core is reserved for the debugger itself, so it will not show up in the list.)

(lldb) th l
Process 2 stopped
* thread #1: tid = 0x0002, 0xfffffff027ffda18 kernelcache.iPhone10,1.16C101`___lldb_unnamed_symbol1734$$kernelcache.iPhone10,1.16C101 + 272
  thread #2: tid = 0x0003, 0xfffffff027ffda18 kernelcache.iPhone10,1.16C101`___lldb_unnamed_symbol1734$$kernelcache.iPhone10,1.16C101 + 272
  thread #3: tid = 0x0004, 0xfffffff027ffda18 kernelcache.iPhone10,1.16C101`___lldb_unnamed_symbol1734$$kernelcache.iPhone10,1.16C101 + 272
  thread #4: tid = 0x0005, 0xfffffff027ffda18 kernelcache.iPhone10,1.16C101`___lldb_unnamed_symbol1734$$kernelcache.iPhone10,1.16C101 + 272
  thread #5: tid = 0x0006, 0xfffffff027ffda18 kernelcache.iPhone10,1.16C101`___lldb_unnamed_symbol1734$$kernelcache.iPhone10,1.16C101 + 272
(lldb)

Because KTRR has been disabled in the MMU and the kernel has been remapped as read/write, it is possible to patch kernel memory:

(lldb) x/12wx 0xfffffff027e04000
0xfffffff027e04000: 0xfeedfacf 0x0100000c 0x00000000 0x00000002
0xfffffff027e04010: 0x00000016 0x00001068 0x00200001 0x00000000
0xfffffff027e04020: 0x00000019 0x00000188 0x45545f5f 0x00005458
(lldb) mem wr -s 4 0xfffffff027e04000 0x11223344 0x55667788
(lldb) x/12wx 0xfffffff027e04000
0xfffffff027e04000: 0x11223344 0x55667788 0x00000000 0x00000002
0xfffffff027e04010: 0x00000016 0x00001068 0x00200001 0x00000000
0xfffffff027e04020: 0x00000019 0x00000188 0x45545f5f 0x00005458

Resume executing the kernel with continue. You can interrupt it at any time with ^C:

(lldb) c
Process 2 resuming
(lldb) ^C
Process 2 stopped
Target 0: (kernelcache.iPhone10,1.16C101) stopped.
(lldb)

You can set breakpoints as usual. KTRW currently only supports hardware breakpoints, but LLDB will automatically detect this and set the appropriate breakpoint type:

(lldb) b 0xfffffff0282753b4
Breakpoint 1: where = kernelcache.iPhone10,1.16C101`___lldb_unnamed_symbol4960$$kernelcache.iPhone10,1.16C101, address = 0xfffffff0282753b4
(lldb) c
Process 2 resuming
Process 2 stopped
* thread #4, stop reason = breakpoint 1.1
    frame #0: 0xfffffff0282753b4 kernelcache.iPhone10,1.16C101`___lldb_unnamed_symbol4960$$kernelcache.iPhone10,1.16C101
kernelcache.iPhone10,1.16C101`___lldb_unnamed_symbol4960$$kernelcache.iPhone10,1.16C101:
->  0xfffffff0282753b4 <+0>:  sub    sp, sp, #0x80             ; =0x80
    0xfffffff0282753b8 <+4>:  stp    x28, x27, [sp, #0x20]
    0xfffffff0282753bc <+8>:  stp    x26, x25, [sp, #0x30]
    0xfffffff0282753c0 <+12>: stp    x24, x23, [sp, #0x40]
Target 0: (kernelcache.iPhone10,1.16C101) stopped.
(lldb)

Single-stepping works as expected:

(lldb) si
Process 2 stopped
* thread #4, stop reason = instruction step into
    frame #0: 0xfffffff0282753b8 kernelcache.iPhone10,1.16C101`___lldb_unnamed_symbol4960$$kernelcache.iPhone10,1.16C101 + 4
kernelcache.iPhone10,1.16C101`___lldb_unnamed_symbol4960$$kernelcache.iPhone10,1.16C101:
->  0xfffffff0282753b8 <+4>:  stp    x28, x27, [sp, #0x20]
    0xfffffff0282753bc <+8>:  stp    x26, x25, [sp, #0x30]
    0xfffffff0282753c0 <+12>: stp    x24, x23, [sp, #0x40]
    0xfffffff0282753c4 <+16>: stp    x22, x21, [sp, #0x50]
Target 0: (kernelcache.iPhone10,1.16C101) stopped.
(lldb) si
Process 2 stopped
* thread #4, stop reason = instruction step into
    frame #0: 0xfffffff0282753bc kernelcache.iPhone10,1.16C101`___lldb_unnamed_symbol4960$$kernelcache.iPhone10,1.16C101 + 8
kernelcache.iPhone10,1.16C101`___lldb_unnamed_symbol4960$$kernelcache.iPhone10,1.16C101:
->  0xfffffff0282753bc <+8>:  stp    x26, x25, [sp, #0x30]
    0xfffffff0282753c0 <+12>: stp    x24, x23, [sp, #0x40]
    0xfffffff0282753c4 <+16>: stp    x22, x21, [sp, #0x50]
    0xfffffff0282753c8 <+20>: stp    x20, x19, [sp, #0x60]
Target 0: (kernelcache.iPhone10,1.16C101) stopped.
(lldb)

KTRW and checkra1n

checkra1n is now being extended to fully support the capabilities that KTRW provides. checkra1n exploits the BootROM and eventually compromises iBoot, which handles the preparation of KTRR, so the bypass described above isn’t required. Apple does not consider that bypass a security vulnerability because it requires kernel read/write, which already means the phone is compromised to the fullest extent. I’m not sure I agree with this, as there are further protections that can continue to protect the device, but maybe this can be fixed in the future.

This kind of feat would be a breakthrough for security research on the iOS platform: not only is the whole boot chain opened up for exploration, but you can now debug the kernel, test exploits, and debug crashes much more easily.

Resources

a1exdandy. “Technical Analysis of the checkm8 Exploit.” Digital Security Corporate Blog, Habr, 24 Oct. 2019, habr.com/en/company/dsec/blog/472762/.

axi0mX. “axi0mX/Ipwndfu.” GitHub, 1 Oct. 2019, github.com/axi0mX/ipwndfu.

Coolstar. “Coolstar/Electra.” GitHub, github.com/coolstar/electra.

“Endpoint Types.” USB in a NutShell - Chapter 4 - Endpoint Types, www.beyondlogic.org/usbnutshell/usb4.shtml.

Googleprojectzero. “Googleprojectzero/Ktrw.” GitHub, Google, 3 Mar. 2020, github.com/googleprojectzero/ktrw.

Levin, Jonathan. Mac OS X and IOS Internals: to the Apples Core. Wiley, 2013.

pwn20wndstuff. “pwn20wndstuff/Undecimus.” GitHub, 12 July 2019, github.com/pwn20wndstuff/Undecimus.

qwertyoruiop. “The One Weird Trick SecureROM Hates.” Iokit.racing, iokit.racing/oneweirdtrick.pdf.

“Secure Enclave Overview.” Apple Platform Security, Apple, support.apple.com/guide/security/secure-enclave-overview-sec59b0b31ff/web.

www.blackhat.com/docs/us-16/materials/us-16-Mandt-Demystifying-The-Secure-Enclave-Processor.pdf.

Siguza.github.io, siguza.github.io/APRR/.

Siguza.github.io, siguza.github.io/KTRR/.

Siguza.github.io, siguza.github.io/PAN/.

“The Making Of.” Project Sandcastle, Corellium, projectsandcastle.org/history.

Tim. “Project Zero.” Search Results for SecureROM, Google, 1 Jan. 1970, googleprojectzero.blogspot.com/search?q=secureROM.

YungRaj. “YungRaj/expose_kernel_task.” GitHub, github.com/YungRaj/expose_kernel_task.

Written on May 12, 2020