rælize

Eleventh LangSec Workshop at IEEE Security & Privacy, May 15, 2025

# The Art of Fault Injection: Weird Machines all the way down

Cristofaro Mune cristofaro@raelize.com @pulsoid

Niek Timmers niek@raelize.com @tieknimmers

## Building up.

#### Security boundaries



#### Notes from Micro-architectural attacks [2017]

- Security models aren't just a Software (SW) thing
- Most of the Hardware (HW) has no idea of security boundaries:
  - unless factored in during design
- HW resources shared across security boundaries can be problematic
- It's painful to recover

#### Are we STILL missing something?



#### Walking on thin ice...

- The whole computing model assumes that:
  - the right logical values
  - are correctly represented
  - at the rising edge
  - of each clock cycle.
  - Everywhere
- That's why we have constraints on operating conditions (e.g. temperature range)
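The assumption above can be written as the usual setup-time condition for a register-to-register path. The sketch below is only an illustrative model: the function name, the nanosecond figures in the usage note, and the simplification that a perturbation merely lengthens the propagation delay are assumptions, not measurements from the talk.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when the destination flip-flop samples the intended value:
 * the data must be stable at least t_setup before the next rising clock edge.
 * All parameters are hypothetical, simplified quantities in nanoseconds. */
bool samples_correct_value(double t_clk_ns, double t_prop_ns, double t_setup_ns)
{
    assert(t_clk_ns > 0.0 && t_prop_ns >= 0.0 && t_setup_ns >= 0.0);
    return t_prop_ns + t_setup_ns <= t_clk_ns;
}
```

With a 10 ns clock and 1 ns setup time, a 7 ns path is sampled correctly, while a path stretched to 9.5 ns is not.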



#### Fig. 1. Internal architecture of digital ICs.

Zussa et al., "Analysis of the fault injection mechanism related to negative and positive power supply glitches using an on-chip voltmeter" [ZDRC2014]

All the computing in the world relies upon...

# Sampling the correct data

**Everywhere** (in billions of gates)

A few billions of times per second

**Every time.** Every single time.

## What can go wrong?

#### Natural Phenomena





- Ziegler, Lanford, "Effects of cosmic rays on computer memories" (1979)
- May, Woods, "Alpha-particle-induced soft errors in dynamic memories" (1979)

#### Known (attack) techniques



#### Voltage



#### Electro-magnetic field



#### Laser ("Nexus-6" kitten)



#### Temperature



#### Interestingly...



### Most of them involve transfer of energy

#### Fault Injection Reference Model (FIRM)



# It's all our fault(s)!

#### A fault propagation model



#### Notes

- Geared towards faults in software execution:
  - Not everything is instructions
- Attacks against non-CPU subsystems do not easily fit:
  - JTAG
  - OTP
  - RNGs
  - ...
- Example:
  - Hardwear.io USA 2022, "Breaking SoC Security by Glitching OTP Data Transfers" [Raelize]

#### Let's extend it



## Modeling faults.

### The observer's challenge



#### Notes

- Describing all the faults actually introduced in a system is possibly infeasible:
  - We need to observe them to identify them
- Still, it may be possible to formally describe fault models geared toward specific attacks

#### Guess how FI affects code execution...





- "Software countermeasures against the multiple instructions skip fault model", Microelectronics Reliability, Volume 155, April 2024, 115370
- "Experimental analysis of the electromagnetic instruction skip fault model and consequences for software countermeasures", Microelectronics Reliability, Volume 121, June 2021, 114133
- "Formal verification of a software countermeasure against instruction skip attacks", Nicolas Moro, Karine Heydemann, Emmanuelle Encrenaz, and Bruno Robisson (LIP6, Sorbonne Universités, UPMC Univ Paris 06; CEA-Tech PACA, LSAS), February 24, 2014

#### Instruction skipping

- The most common description of FI effects (fault model) on CPU execution:
  - It has been with us for at least three decades 🙂
- First attacks mostly targeted security-relevant decisions:
  - Smart card PIN authentication
  - Signature checks
  - ...

"It is as if...we skipped that instruction"

#### Typical attacks

- Targets:
  - Conditionals:
    - To "skip" the compare instruction
  - Function calls:
    - To "skip" the execution of a security relevant function
  - Infinite loops:
    - To "skip" the current instruction and fall into the next one
- This requires precise targeting of specific instructions:
  - Strong timing requirements
  - Potential targets are easy to predict

### Example



#### Notes

- "Instruction skipping" models faults at the instruction execution level
- The original program continues to be executed:
  - We just take an unintended branch in a decision
- Hard to jump to arbitrary locations

#### Attack execution

    int load_exec_next_boot_stage() {
        // Destination addresses in SRAM
        uint32_t img_addr = 0xd0000000;
        uint32_t sig_addr = 0xd1000000;

        // Copy next stage image from Flash to SRAM
        load_next_stage_img(img_addr);

        // Copy signature from Flash to SRAM
        load_next_stage_signature(sig_addr);

        if (verify_signature(img_addr, sig_addr)) {
            // Wrong signature. Reset system
            reset_SOC();
        }

        // Signature valid. Exec next stage code
        exec_stage(img_addr);
    }

- "Instruction skipping" requires accurate timing
- Can be executed blindly:
  - i.e. no assumption on the type of fault
  - "Glitch 'n pray"
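The instruction-skipping view of this attack can be captured in a toy model: the boot flow is a sequence of steps, and a glitch makes exactly one step disappear. The sketch below is an illustration under that assumption only; `model_boot`, the step names, and the outcome names are hypothetical and do not come from the slides.

```c
#include <assert.h>
#include <stdbool.h>

enum boot_outcome { BOOT_RESET, BOOT_EXECUTED };

/* Steps of the modelled boot flow, in execution order. */
enum boot_step { STEP_LOAD_IMG, STEP_LOAD_SIG, STEP_VERIFY, STEP_EXEC, STEP_COUNT };

/* skipped_step: index of the step a glitch "skips", or -1 for no glitch. */
enum boot_outcome model_boot(bool signature_valid, int skipped_step)
{
    assert(skipped_step >= -1 && skipped_step < STEP_COUNT);
    for (int step = 0; step < STEP_COUNT; step++) {
        if (step == skipped_step)
            continue;               /* "it is as if we skipped that instruction" */
        if (step == STEP_VERIFY && !signature_valid)
            return BOOT_RESET;      /* the check ran and rejected the image */
        if (step == STEP_EXEC)
            return BOOT_EXECUTED;   /* the next stage runs */
    }
    return BOOT_RESET;              /* the exec step itself was skipped */
}
```

In this model an invalid image only boots when the skipped step is `STEP_VERIFY`, which is why the attack needs accurate timing.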

#### SW countermeasures: Multiple checks

    int load_exec_next_boot_stage() {
        // Destination addresses in SRAM
        uint32_t img_addr = 0xd0000000;
        uint32_t sig_addr = 0xd1000000;

        // Copy next stage image from Flash to SRAM
        load_next_stage_img(img_addr);

        // Copy signature from Flash to SRAM
        load_next_stage_signature(sig_addr);

        if (verify_signature(img_addr, sig_addr)) {
            reset_SOC();
        }

        if (verify_signature(img_addr, sig_addr)) {
            reset_SOC();
        }

        if (verify_signature(img_addr, sig_addr)) {
            reset_SOC();
        }

        // Signature valid. Exec next stage code
        exec_stage(img_addr);
    }

- Attack assumption:
  - A glitch is required for every check
  - One instruction, one glitch
- Mitigation: perform multiple checks
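Under the "one instruction, one glitch" assumption, the benefit of repeated checks can be estimated with a simple product. The sketch below assumes that each glitch succeeds independently with the same probability, which is a modelling assumption and not a result from the talk; `bypass_probability` is a hypothetical helper.

```c
#include <assert.h>

/* Probability that all n_checks are bypassed, assuming each glitch succeeds
 * independently with probability p_single (a simplifying assumption). */
double bypass_probability(double p_single, unsigned n_checks)
{
    assert(p_single >= 0.0 && p_single <= 1.0);
    double p = 1.0;
    for (unsigned i = 0; i < n_checks; i++)
        p *= p_single;
    return p;
}
```

For example, a glitch that works half of the time would bypass three independent checks one time in eight.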

#### SW countermeasures: Making synchronization harder

    int load_exec_next_boot_stage() {
        // Destination addresses in SRAM
        uint32_t img_addr = 0xd0000000;
        uint32_t sig_addr = 0xd1000000;

        // Copy next stage image from Flash to SRAM
        load_next_stage_img(img_addr);

        // Copy signature from Flash to SRAM
        load_next_stage_signature(sig_addr);

        random_delay();

        if (verify_signature(img_addr, sig_addr)) {
            reset_SOC();
        }

        random_delay();

        if (verify_signature(img_addr, sig_addr)) {
            reset_SOC();
        }

        random_delay();

        if (verify_signature(img_addr, sig_addr)) {
            reset_SOC();
        }

        random_delay();

        // Signature valid. Exec next stage code
        exec_stage(img_addr);
    }

- Attack assumption:
  - A glitch must "hit" that instruction at a specific point in time

- Mitigation:
  - Random delays are introduced around critical checks
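The slides do not show how `random_delay()` is implemented. One possible shape is a busy loop of random length, sketched below under loud assumptions: `random_delay_sketch` and its `max_iterations` parameter are hypothetical, and `rand()` only stands in for whatever entropy source a real target would provide.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Spins for a pseudo-random number of iterations in [0, max_iterations).
 * Returns the iteration count so that callers and tests can observe it.
 * rand() is a placeholder for a target-specific entropy source. */
uint32_t random_delay_sketch(uint32_t max_iterations)
{
    assert(max_iterations > 0);
    uint32_t n = (uint32_t)rand() % max_iterations;
    volatile uint32_t sink = 0;   /* volatile keeps the loop from being optimised away */
    for (uint32_t i = 0; i < n; i++)
        sink += i;
    return n;
}
```

The wider the range of possible delays, the less likely a glitch fired at a fixed offset is to land on the targeted instruction.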

#### Observations

- SW-based countermeasures are widely used in industry and academia:
  - Multiple checks and random delays are two prominent examples
  - Additional countermeasures are available
- Commonly advised and implemented in FI-resistant targets
- They reduce the attack success rate:
  - Multiple glitches are required
  - Attack timing is more difficult

#### A few common beliefs

- "Software is vulnerable to FI":
  - Wrong. Hardware is.

- Source code reviews for fault injections are considered a proper tool for spotting "FI vulnerabilities":
  - We will understand why that is not the case, shortly

#### Untold assumption

#### Instruction skipping is the relevant fault model



#### Test code: Counter (unrolled loop)



## Data analysis (1)

| Amount | Color | Delay min | Delay max | Length min | Length max | Response                     | Note                 |
|--------|-------|-----------|-----------|------------|------------|------------------------------|----------------------|
| 11     | R     | 1090      | 1850      | 2815       | 4331       | XXXX000003ffYYYY000003ffZZZZ |                      |
| 5      | R     | 1191      | 1233      | 2931       | 4218       | XXXX3ffe417aYYYY3ffe417aZZZZ |                      |
| 4      | R     | 1735      | 1790      | 3098       | 3853       | XXXX3ffe414eYYYY3ffe414eZZZZ |                      |
| 4      | R     | 1012      | 1391      | 2972       | 3811       | XXXX000003feYYYY000003feZZZZ | Instruction skipping |
| 3      | R     | 1435      | 1844      | 2975       | 4077       | XXXX00000401YYYY00000401ZZZZ |                      |
| 3      | R     | 1471      | 1475      | 3946       | 4211       | XXXX00000407YYYY00000407ZZZZ |                      |
| 2      | R     | 1461      | 1472      | 3392       | 3817       | XXXX00000408YYYY00000408ZZZZ |                      |
| 2      | R     | 1065      | 1092      | 3170       | 3559       | XXXX800812edYYYY800812edZZZZ |                      |
### Something weird...

How do we explain these results with instruction skipping?

#### ...and weirder...

What are the values in these responses?

#### Some hints

#### Table 1-2. Embedded Memory Address Mapping

| Bus Type | Low Address | High Address | Size   | Target          | Comment      |
|----------|-------------|--------------|--------|-----------------|--------------|
| Data     | 0x3FF8_0000 | 0x3FF8_1FFF  | 8 KB   | RTC FAST Memory | PRO_CPU Only |
|          | 0x3FF8_2000 | 0x3FF8_FFFF  | 56 KB  | Reserved        | -            |
| Data     | 0x3FF9_0000 | 0x3FF9_FFFF  | 64 KB  | Internal ROM 1  | -            |
|          | 0x3FFA_0000 | 0x3FFA_DFFF  | 56 KB  | Reserved        | -            |
| Data     | 0x3FFA_E000 | 0x3FFD_FFFF  | 200 KB | Internal SRAM 2 | DMA          |
| Data     | 0x3FFE_0000 | 0x3FFF_FFFF  | 128 KB | Internal SRAM 1 | DMA          |

A memory address? How?

#### Our instruction (+ encoding)



What could be happening?

#### Occam's razor

- Glitches are most likely corrupting instructions
- "Instruction corruption" explains all the responses we see:
  - Responses slightly above 0x400 → immediate corruption
  - Responses containing a memory address → source register corruption
  - Responses below 0x400 (i.e. "instruction skipping"):
    - The instruction is mutated into one without side effects, e.g. addi.n a8, a8, 0
- All the exceptions can also be explained!
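The mapping from observed counter values to corruption types can be sketched as a small classifier. Everything below is an illustration: the fault-free value is assumed to be 0x400 from the thresholds in the list above, the "slightly above" slack of 0x0f is an arbitrary assumption, and the address window is taken from the data-bus rows of Table 1-2. The names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EXPECTED_COUNT 0x400u       /* assumed fault-free counter value */
#define IMM_SLACK      0x0fu        /* assumed bound for "slightly above" */
#define DATA_ADDR_LOW  0x3FF80000u  /* lowest data-bus address in Table 1-2 */
#define DATA_ADDR_HIGH 0x3FFFFFFFu  /* highest data-bus address in Table 1-2 */

enum fault_class {
    NO_FAULT,
    SKIP_LIKE,              /* below the expected value */
    IMMEDIATE_CORRUPTION,   /* slightly above the expected value */
    SOURCE_REG_CORRUPTION,  /* looks like a data memory address */
    UNCLASSIFIED
};

enum fault_class classify_response(uint32_t counter)
{
    /* The two "above" ranges must not overlap for the order below to be sound. */
    assert(EXPECTED_COUNT + IMM_SLACK < DATA_ADDR_LOW);
    if (counter == EXPECTED_COUNT)
        return NO_FAULT;
    if (counter < EXPECTED_COUNT)
        return SKIP_LIKE;
    if (counter <= EXPECTED_COUNT + IMM_SLACK)
        return IMMEDIATE_CORRUPTION;
    if (counter >= DATA_ADDR_LOW && counter <= DATA_ADDR_HIGH)
        return SOURCE_REG_CORRUPTION;
    return UNCLASSIFIED;
}
```

Applied to the responses in the table, 0x3ff and 0x3fe are skip-like, 0x401, 0x407 and 0x408 look like immediate corruption, and 0x3ffe417a and 0x3ffe414e fall inside the Internal SRAM 1 range. A value such as 0x800812ed stays unclassified in this simple sketch.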

#### Instruction skipping...does NOT exist

- Well, it MAY still exist:
  - But we now have a better explanation for it

- The instruction is likely corrupted to become a NOP-equivalent instruction:
  - i.e. an instruction with no relevant side-effects
  - Examples:
    - orr r0, r0, r0
    - add r0, r0, #0
    - ...

## Weird machines... out of data transfers.

#### Instruction corruption

Glitches may corrupt instructions (examples use AArch64 encodings)

- Single-bit corruptions:
  - add x0, x1, x3 = 10001011000000110000000000100000
  - add x0, x1, x2 = 10001011000000100000000000100000
- Multi-bit corruptions:
  - an ldr instruction becomes an str instruction
- Most chips are affected by this fault model:
  - Which bits can be controlled, and how, depends on the target, ...
- Because the executed software is modified, any software security model breaks
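The single-bit example above can be reproduced with a few lines of C. This is a minimal sketch that covers only the A64 `ADD Xd, Xn, Xm` form with no shift; `encode_add_x` and `flip_bit` are hypothetical helper names, and a real glitch is of course not limited to one chosen bit.

```c
#include <assert.h>
#include <stdint.h>

/* A64 "ADD Xd, Xn, Xm" (shifted register, LSL #0):
 * Rm in bits 20:16, Rn in bits 9:5, Rd in bits 4:0. */
uint32_t encode_add_x(unsigned rd, unsigned rn, unsigned rm)
{
    assert(rd < 32 && rn < 32 && rm < 32);
    return 0x8B000000u | (rm << 16) | (rn << 5) | rd;
}

/* Models a glitch that flips a single bit of a fetched instruction word. */
uint32_t flip_bit(uint32_t insn, unsigned bit)
{
    assert(bit < 32);
    return insn ^ (UINT32_C(1) << bit);
}
```

Flipping bit 16 of `add x0, x1, x3` (0x8b030020) yields 0x8b020020, which decodes as `add x0, x1, x2`: a different, fully valid instruction.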

#### Data transfers are a great target

- All devices transfer data:
  - From memory to memory
  - Using external interfaces



## Transferred data may be under attacker's control



- It's everywhere.
- SW security: parameters are typically checked (dest, src and n)
- The transferred content itself is not considered security critical

## Let's use it as a Fault Injection target...

## PC control with Instruction corruption (ARM32).

#### Example: USB data transfer (ARM32)



PC set to attacker data. Control flow directly hijacked

#### We regularly use this technique...

- Escalating privileges from user to kernel in Linux
  - ROOting the Unexploitable using Hardware Fault Injection @ BlueHat v17

- Bypassing encrypted secure boot
  - Hardening Secure Boot on Embedded Devices @ BlueHat IL 2019

- Taking control of an AUTOSAR based ECU
  - Attacking AUTOSAR using Software and Hardware Attacks @ escar USA 2019

#### A peculiar attack

- The attack uses data... but it does not target ANY parser
- It targets instruction decoding
- It leverages the addressable PC on ARM32:
  - i.e. the PC is itself a general-purpose register and can be explicitly assigned
- Execution flows outside the original program

## Extension to multiple architectures...

## Our research

- We identified multiple variants and techniques
- Yield arbitrary code execution:
  - from controlled data only
  - By corrupting instruction destination registers
- Sufficiently generic to work across multiple architectures
- Examples:
  - Corrupting stored PC (in regs) or SP
  - Hijacking jump/call (through registers)
  - Corrupting callee saved regs (across function calls)

## More details <u>here</u>

#### Example: ARMv8 RET instruction

- Used for returning from a function call:
  - The return address is stored in a register (default x30)
- It has the following encoding:



- The **RET** instruction can encode any register (x0 to x30)
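Because the register number occupies a 5-bit field of the A64 `RET` encoding (bits 9:5), turning the default `ret` into a return through another register takes only a few bit flips. The sketch below computes that distance; `encode_ret` and `hamming_distance` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* A64 "RET Xn": fixed opcode bits with the register number in bits 9:5. */
uint32_t encode_ret(unsigned rn)
{
    assert(rn <= 30);
    return 0xD65F0000u | (rn << 5);
}

/* Number of bit positions in which two instruction words differ. */
unsigned hamming_distance(uint32_t a, uint32_t b)
{
    uint32_t diff = a ^ b;
    unsigned count = 0;
    while (diff) {
        diff &= diff - 1;   /* clear the lowest set bit */
        count++;
    }
    return count;
}
```

`encode_ret(30)` gives 0xd65f03c0, the plain `ret`. It is two bit flips away from `ret x6` and three away from `ret x7`.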

#### Real world example

- Google Bionic's (libc) memcpy
- Copying 16 bytes executes the following code:
  - Source data resides in x6 and x7
  - Source data is not wiped before RET
- Glitch the RET instruction into RET x6 or RET x7
- Equivalently, glitch ldr x6, ... into ldr x30, ...

Disassembly of the relevant path:

    memcpy:
       0: 8b020024    add   x4, x1, x2
       4: 8b020005    add   x5, x0, x2
       8: f100405f    cmp   x2, #0x10
       c: 54000229    b.ls  50 <memcpy+0x50>
      ...
      50: f100205f    cmp   x2, #0x8
      54: 540000e3    b.cc  70 <memcpy+0x70>
      58: f9400026    ldr   x6, [x1]
      5c: f85f8087    ldur  x7, [x4, #-8]
      60: f9000006    str   x6, [x0]
      64: f81f80a7    stur  x7, [x5, #-8]
      68: d65f03c0    ret
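The effect of corrupting the final `ret` can be illustrated with a toy register-file model of the 16-byte path. This is a sketch, not an emulator: `cpu_model`, `memcpy16_model` and `pc_after_copy` are hypothetical names, and the glitch is reduced to a changed register index.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct cpu_model {
    uint64_t x[31];   /* general-purpose registers x0..x30 */
    uint64_t pc;
};

/* Models the 16-byte copy: x6/x7 carry the copied data, then
 * "ret x<ret_reg>" loads the PC. ret_reg is 30 unless a glitch corrupted it. */
void memcpy16_model(struct cpu_model *cpu, uint8_t *dst, const uint8_t *src,
                    unsigned ret_reg)
{
    assert(ret_reg <= 30);
    memcpy(&cpu->x[6], src, 8);        /* ldr  x6, [x1]      */
    memcpy(&cpu->x[7], src + 8, 8);    /* ldur x7, [x4, #-8] */
    memcpy(dst, &cpu->x[6], 8);        /* str  x6, [x0]      */
    memcpy(dst + 8, &cpu->x[7], 8);    /* stur x7, [x5, #-8] */
    cpu->pc = cpu->x[ret_reg];         /* ret (x30 by default) */
}

/* PC after copying 16 bytes made of attacker_word repeated twice. */
uint64_t pc_after_copy(uint64_t attacker_word, uint64_t return_address,
                       unsigned ret_reg)
{
    struct cpu_model cpu = { .x = {0}, .pc = 0 };
    uint8_t src[16], dst[16];
    memcpy(src, &attacker_word, 8);
    memcpy(src + 8, &attacker_word, 8);
    cpu.x[30] = return_address;        /* legitimate link register value */
    memcpy16_model(&cpu, dst, src, ret_reg);
    return cpu.pc;
}
```

With `ret_reg` left at 30 the model returns to the caller. With `ret_reg` set to 6 or 7 the PC takes its value directly from the copied data.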

## PC hijacked from controlled data.

#### Data scope

- Attacker data may linger in registers, across function boundaries
- Even if out of scope, the data is still available
- An attack may still be possible at a later point in the execution flow

# Attack example.

#### "Instruction corruption": Recipe for success

- Identify data transfers you control
- Send sled of pointers
  - E.g. Point to your shellcode location
- Glitch during ANY memcpy
- PC control

## A stack overflow...without SW vulns 🙂

#### Attacking Secure Boot



#### SW-based countermeasures bypass



## Key points

- SW-based countermeasures are completely ineffective:
  - The countermeasure code is not executed

- The attack:
  - Does NOT target checks and is unrelated to the location of the checks (weak locality)
  - Can target ANY data transfer before the SW checks

Very hard to protect against. Applicable to FI-resistant targets.

#### Weird machines ingredients

- Memory not executable?
  - We can always start a ROP chain (we have PC control!)

- We can combine multiple glitches
  - if they are sufficiently separated in time

- E.g. we could do ROP and...
  - jump to any infinite loop in the code, at a convenient location
  - jump to our memcpy multiple times and transfer our payload multiple times
  - ...

## Some open questions.

- Are "instruction skipping" faults a proper subset of "instruction corruption" faults?
  - i.e. can "instruction skipping" always be explained by instruction corruptions?
- Can experiments be designed to actually prove or disprove the above?
- For the first time, the actual data being transferred becomes security relevant:
  - memcpy() is agnostic w.r.t. the transferred data
- Could we distinguish good data from attacker data... without any knowledge of the data structure?
  - If so, how?
- Can we prove that, on all architectures, PC control can be obtained by corrupting a limited number of instructions?
- Can we limit attack opportunities by sanitizing data that goes out of scope?
  - E.g. should we clean temporary registers when exiting a function? What would be the impact?
- How many other relevant fault models are we missing?
- Can we somehow classify/identify them?
  - E.g. given a single function and attacker-controlled data, can we formally describe all the ways to achieve PC control?
- Can we design an ISA such that instruction corruption becomes much less relevant or much harder?
  - E.g. Hamming distance in instruction encoding?
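To make the Hamming distance question concrete, the sketch below compares a plain register index field with a hypothetical variant that appends an even-parity bit. It is a thought experiment only, not a proposed ISA design: the field layout and every function name are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static unsigned popcount32(uint32_t v)
{
    unsigned n = 0;
    for (; v; v &= v - 1)   /* clear the lowest set bit each round */
        n++;
    return n;
}

/* Appends an even-parity bit to a register index (hypothetical field layout). */
uint32_t with_parity(uint32_t reg_index)
{
    return (reg_index << 1) | (popcount32(reg_index) & 1u);
}

/* Smallest Hamming distance between any two distinct codewords. */
unsigned min_distance(const uint32_t *codes, size_t n)
{
    assert(codes != NULL && n >= 2);
    unsigned best = 32;
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++) {
            unsigned d = popcount32(codes[i] ^ codes[j]);
            if (d < best)
                best = d;
        }
    return best;
}

/* Minimum distance over register indices 0..n_regs-1, with or without parity. */
unsigned register_field_distance(unsigned n_regs, int use_parity)
{
    uint32_t codes[32];
    assert(n_regs >= 2 && n_regs <= 32);
    for (unsigned r = 0; r < n_regs; r++)
        codes[r] = use_parity ? with_parity(r) : r;
    return min_distance(codes, n_regs);
}
```

With a plain index, one bit flip can turn one valid register into another (minimum distance 1). With the parity bit, at least two flips are needed (minimum distance 2), at the cost of one extra encoding bit per register field.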

## Conclusion.

## Final considerations

- Perturbations at the physical level can lead to unusual attacks
- Abstracting hardware/physics away is not always a good idea
- FI makes data-based attacks possible on software, without any parser being involved
- Languages and systems are typically not designed to withstand recent fault injection attacks
- We may benefit from a holistic view of systems security



# Thank you! Any questions!?

Cristofaro Mune cristofaro@raelize.com @pulsoid Niek Timmers niek@raelize.com @tieknimmers