Apple Microarchitecture Research by Dougall Johnson

ADRP

Test 1: uops

Code:

  .word 0x90000020

(no loop instructions)
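
For reference, the `.word` above can be decoded by hand. The sketch below assumes the standard AArch64 ADRP field layout (Rd in bits 4:0, immhi in bits 23:5, immlo in bits 30:29). `decode_adrp` is a hypothetical helper written for illustration, not part of the original test harness.

```python
def decode_adrp(word: int) -> tuple[int, int]:
    """Return (Rd, byte offset from the current 4 KiB page) for an ADRP word.

    Assumes the standard AArch64 ADRP layout; hypothetical helper.
    """
    # ADRP has bit 31 set and bits 28:24 equal to 0b10000.
    if (word & 0x9F000000) != 0x90000000:
        raise ValueError(f"not an ADRP encoding: {word:#010x}")
    rd = word & 0x1F                 # destination register, bits 4:0
    immlo = (word >> 29) & 0x3       # low 2 bits of the page count
    immhi = (word >> 5) & 0x7FFFF    # high 19 bits of the page count
    imm = (immhi << 2) | immlo       # 21-bit signed page count
    if imm & (1 << 20):              # sign-extend from 21 bits
        imm -= 1 << 21
    return rd, imm << 12             # pages are 4 KiB


for w in (0x90000020, 0x90000027):
    rd, off = decode_adrp(w)
    print(f"{w:#010x}: adrp x{rd}, page {off:+#x}")
```

Under that layout, 0x90000020 decodes to an ADRP writing x0 with a page offset of +0x4000, and the words 0x90000020 to 0x90000027 differ only in the destination register (x0 to x7).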

1000 unrolls and 1 iteration

Retires: 1.000

Issues: 0.500

Integer unit issues: 0.501

Load/store unit issues: 0.000

SIMD/FP unit issues: 0.000

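retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | dispatch int uop (56) | int uops in schedulers (59) | dispatch uop (78) | map int uop (7c) | ? int output thing (e9) | ? int retires (ef)
1004 | 530 | 500 | 500 | 499 | 1497 | 499 | 1000 | 500 | 1000
1004 | 280 | 501 | 501 | 500 | 1500 | 500 | 1000 | 501 | 1000
1004 | 280 | 501 | 501 | 500 | 1500 | 500 | 1000 | 501 | 1000
1004 | 280 | 501 | 501 | 500 | 1500 | 500 | 1000 | 501 | 1000
1004 | 280 | 501 | 501 | 500 | 1500 | 500 | 1000 | 501 | 1000
1004 | 280 | 501 | 501 | 500 | 1500 | 500 | 1000 | 501 | 1000
1004 | 280 | 501 | 501 | 500 | 1500 | 500 | 1000 | 501 | 1000
1004 | 280 | 501 | 501 | 500 | 1500 | 500 | 1000 | 501 | 1000
1004 | 280 | 501 | 501 | 500 | 1500 | 500 | 1000 | 501 | 1000
1004 | 280 | 501 | 501 | 500 | 1500 | 500 | 1000 | 501 | 1000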
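
The summary figures for Test 1 look like per-instruction counter values. As a rough check, the sketch below divides one steady-state row (any row after the first) by the 1000 unrolled instructions. The dictionary keys are the counter names from the table header, and treating the divisor as exactly 1000 is an assumption. The raw ratios land close to, but not exactly on, the reported figures (retires come out as 1.004 rather than 1.000), so the harness presumably corrects for measurement overhead in a way not shown here.

```python
UNROLLS = 1000  # "1000 unrolls and 1 iteration"

# One steady-state row of the Test 1 counter table.
steady_row = {
    "retire uop (01)": 1004,
    "cycle (02)": 280,
    "schedule uop (52)": 501,
    "schedule int uop (53)": 501,
    "dispatch int uop (56)": 500,
    "int uops in schedulers (59)": 1500,
    "dispatch uop (78)": 500,
    "map int uop (7c)": 1000,
    "? int output thing (e9)": 501,
    "? int retires (ef)": 1000,
}

# Raw per-instruction ratios, with no overhead correction applied.
per_instruction = {name: value / UNROLLS for name, value in steady_row.items()}

for name, ratio in per_instruction.items():
    print(f"{name:30s} {ratio:.3f}")
```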

Test 2: throughput

Count: 8

Code:

  .word 0x90000020
  .word 0x90000021
  .word 0x90000022
  .word 0x90000023
  .word 0x90000024
  .word 0x90000025
  .word 0x90000026
  .word 0x90000027

(fused SUBS/B.cc loop)

100 unrolls and 100 iterations

Result (median cycles for code divided by count): 0.2511

retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | dispatch int uop (56) | int uops in schedulers (59) | dispatch uop (78) | map int uop (7c) | map int uop inputs (7f) | ? int output thing (e9) | ? int retires (ef)
80204 | 20268 | 40010 | 40010 | 40013 | 120039 | 40013 | 80226 | 200 | 39910 | 80100
80204 | 20107 | 40010 | 40010 | 40013 | 120036 | 40012 | 80224 | 200 | 39909 | 80100
80204 | 20086 | 40009 | 40009 | 40012 | 120036 | 40012 | 80224 | 200 | 39909 | 80100
80204 | 20086 | 40009 | 40009 | 40012 | 120036 | 40012 | 80224 | 200 | 39909 | 80100
80204 | 20086 | 40009 | 40009 | 40012 | 120036 | 40012 | 80224 | 200 | 39909 | 80100
80204 | 20086 | 40009 | 40009 | 40012 | 120036 | 40012 | 80224 | 200 | 39909 | 80100
80204 | 20086 | 40009 | 40009 | 40012 | 120036 | 40012 | 80224 | 200 | 39909 | 80100
80204 | 20086 | 40009 | 40009 | 40012 | 120036 | 40012 | 80224 | 200 | 39909 | 80100
80204 | 20086 | 40009 | 40009 | 40012 | 120036 | 40012 | 80224 | 200 | 39909 | 80100
80204 | 20086 | 40009 | 40009 | 40012 | 120135 | 40047 | 80295 | 200 | 39909 | 80100

1000 unrolls and 10 iterations

Result (median cycles for code divided by count): 0.2507

retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | dispatch int uop (56) | int uops in schedulers (59) | dispatch uop (78) | map int uop (7c) | map int uop inputs (7f) | ? int output thing (e9) | ? int retires (ef)
80024 | 21999 | 40022 | 40022 | 40025 | 120052 | 40011 | 80020 | 20 | 40001 | 80010
80024 | 21469 | 40022 | 40022 | 40025 | 120094 | 40025 | 80050 | 20 | 40012 | 80010
80024 | 20060 | 40020 | 40020 | 40023 | 120049 | 40010 | 80020 | 20 | 40001 | 80010
80024 | 20054 | 40011 | 40011 | 40010 | 120049 | 40010 | 80020 | 20 | 40001 | 80010
80024 | 20054 | 40011 | 40011 | 40010 | 120049 | 40010 | 80020 | 20 | 40001 | 80010
80024 | 20054 | 40011 | 40011 | 40010 | 120049 | 40010 | 80020 | 20 | 40001 | 80010
80024 | 20054 | 40011 | 40011 | 40010 | 120049 | 40010 | 80020 | 20 | 40001 | 80010
80024 | 20054 | 40011 | 40011 | 40010 | 120049 | 40010 | 80020 | 20 | 40001 | 80010
80024 | 20054 | 40011 | 40011 | 40010 | 120049 | 40010 | 80020 | 20 | 40001 | 80010
80024 | 20054 | 40011 | 40011 | 40010 | 120049 | 40010 | 80020 | 20 | 40001 | 80010
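
The two "Result" figures in Test 2 can be reproduced from the cycle (02) column of each table. The sketch below is one reading that matches the reported numbers: take the median cycle count over the ten runs, divide by the number of copies of the code executed (unrolls times iterations), then divide by the count of 8 instructions. The function name and this interpretation of "cycles for code" are assumptions, not taken from the original harness.

```python
from statistics import median


def cycles_per_instruction(cycles, unrolls, iterations, count):
    """Median cycles per copy of the code, divided by instructions per copy."""
    return median(cycles) / (unrolls * iterations) / count


# cycle (02) column, 100 unrolls and 100 iterations
run_100x100 = [20268, 20107] + [20086] * 8
# cycle (02) column, 1000 unrolls and 10 iterations
run_1000x10 = [21999, 21469, 20060] + [20054] * 7

for label, cycles, unrolls, iterations in (
    ("100 x 100", run_100x100, 100, 100),
    ("1000 x 10", run_1000x10, 1000, 10),
):
    cpi = cycles_per_instruction(cycles, unrolls, iterations, count=8)
    print(f"{label}: {cpi:.4f} cycles per ADRP, about {1 / cpi:.2f} per cycle")
```

Both runs come out at roughly a quarter of a cycle per ADRP, which corresponds to a sustained rate of about four per cycle in this test.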