Apple Microarchitecture Research by Dougall Johnson


ADR

Test 1: uops

Code:

  adr x0, .+4

(no loop instructions)

1000 unrolls and 1 iteration

Retires: 1.000

Issues: 0.500

Integer unit issues: 0.501

Load/store unit issues: 0.000

SIMD/FP unit issues: 0.000

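| retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | dispatch int uop (56) | int uops in schedulers (59) | dispatch uop (78) | map int uop (7c) | ? int output thing (e9) | ? int retires (ef) |
|---|---|---|---|---|---|---|---|---|---|
| 1004 | 532 | 500 | 500 | 499 | 1497 | 499 | 1000 | 500 | 1000 |
| 1004 | 291 | 500 | 500 | 499 | 1500 | 500 | 1000 | 501 | 1000 |
| 1004 | 280 | 501 | 501 | 500 | 1500 | 500 | 1000 | 501 | 1000 |
| 1004 | 280 | 501 | 501 | 500 | 1500 | 500 | 1000 | 501 | 1000 |
| 1004 | 280 | 501 | 501 | 500 | 1500 | 500 | 1000 | 501 | 1000 |
| 1004 | 289 | 500 | 500 | 499 | 1500 | 500 | 1000 | 501 | 1000 |
| 1004 | 280 | 501 | 501 | 500 | 1500 | 500 | 1000 | 501 | 1000 |
| 1004 | 280 | 501 | 501 | 500 | 1500 | 500 | 1000 | 501 | 1000 |
| 1004 | 280 | 501 | 501 | 500 | 1500 | 500 | 1000 | 501 | 1000 |
| 1004 | 280 | 501 | 501 | 500 | 1500 | 500 | 1000 | 501 | 1000 |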

Test 2: throughput

Count: 8

Code:

  adr x0, .+4
  adr x1, .+4
  adr x2, .+4
  adr x3, .+4
  adr x4, .+4
  adr x5, .+4
  adr x6, .+4
  adr x7, .+4

(fused SUBS/B.cc loop)

100 unrolls and 100 iterations

Result (median cycles for code divided by count): 0.2511

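| retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | ? int output thing (e9) | ? int retires (ef) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 80204 | 20301 | 40010 | 40010 | 40013 | 0 | 120039 | 0 | 40013 | 80226 | 0 | 200 | 39910 | 80100 |
| 80204 | 20096 | 40010 | 40010 | 40013 | 0 | 120036 | 0 | 40012 | 80224 | 0 | 200 | 39909 | 80100 |
| 80204 | 20086 | 40009 | 40009 | 40012 | 0 | 120036 | 0 | 40012 | 80224 | 0 | 200 | 39909 | 80100 |
| 80204 | 20098 | 40010 | 40010 | 40013 | 0 | 120036 | 0 | 40012 | 80224 | 0 | 200 | 39909 | 80100 |
| 80204 | 20086 | 40009 | 40009 | 40012 | 0 | 120036 | 0 | 40012 | 80224 | 0 | 200 | 39909 | 80100 |
| 80204 | 20086 | 40009 | 40009 | 40012 | 0 | 120036 | 0 | 40012 | 80224 | 0 | 200 | 39909 | 80100 |
| 80204 | 20086 | 40009 | 40009 | 40012 | 0 | 120036 | 0 | 40012 | 80224 | 0 | 200 | 39909 | 80100 |
| 80204 | 20086 | 40009 | 40009 | 40012 | 0 | 120036 | 0 | 40012 | 80224 | 0 | 200 | 39909 | 80100 |
| 80204 | 20086 | 40009 | 40009 | 40012 | 0 | 120036 | 0 | 40012 | 80224 | 0 | 200 | 39909 | 80100 |
| 80204 | 20086 | 40009 | 40009 | 40012 | 0 | 120036 | 0 | 40012 | 80224 | 0 | 200 | 39909 | 80100 |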
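The reported result for this run can be reproduced from the cycle (02) counts of the ten samples. The sketch below assumes that "divided by count" means dividing the median cycle count by count × unrolls × iterations; the source does not spell this out, but the arithmetic matches the reported figure. The variable and function names are illustrative, not taken from the original test harness.

```python
from statistics import median

# Cycle counts (event 02) from the ten samples of the
# 100-unroll, 100-iteration run, in the order listed above.
cycles = [20301, 20096, 20086, 20098, 20086,
          20086, 20086, 20086, 20086, 20086]

COUNT = 8          # ADR instructions in the measured block
UNROLLS = 100      # copies of the block per loop body
ITERATIONS = 100   # loop iterations

def cycles_per_instruction(samples, count, unrolls, iterations):
    """Median cycle count divided by the number of measured instructions.

    Assumption: the normalisation is count * unrolls * iterations.
    """
    return median(samples) / (count * unrolls * iterations)

result = cycles_per_instruction(cycles, COUNT, UNROLLS, ITERATIONS)
print(f"{result:.4f}")  # 0.2511
```

Using the median rather than the mean keeps the slower first sample (20301 cycles) from skewing the figure.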

1000 unrolls and 10 iterations

Result (median cycles for code divided by count): 0.2507

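| retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | dispatch int uop (56) | int uops in schedulers (59) | dispatch uop (78) | map int uop (7c) | map int uop inputs (7f) | ? int output thing (e9) | ? int retires (ef) |
|---|---|---|---|---|---|---|---|---|---|---|
| 80024 | 21998 | 40022 | 40022 | 40025 | 120097 | 40025 | 80050 | 20 | 40001 | 80010 |
| 80024 | 20143 | 40011 | 40011 | 40010 | 120045 | 40010 | 80020 | 20 | 40001 | 80010 |
| 80024 | 20080 | 40011 | 40011 | 40010 | 120049 | 40010 | 80020 | 20 | 40001 | 80010 |
| 80024 | 20058 | 40011 | 40011 | 40010 | 120049 | 40010 | 80020 | 20 | 40001 | 80010 |
| 80024 | 20058 | 40011 | 40011 | 40010 | 120049 | 40010 | 80020 | 20 | 40001 | 80010 |
| 80024 | 20058 | 40011 | 40011 | 40010 | 120049 | 40010 | 80020 | 20 | 40038 | 80010 |
| 80024 | 20107 | 40011 | 40011 | 40010 | 120045 | 40010 | 80020 | 20 | 40001 | 80010 |
| 80024 | 20058 | 40011 | 40011 | 40010 | 120049 | 40010 | 80020 | 20 | 40001 | 80010 |
| 80024 | 20059 | 40011 | 40011 | 40010 | 120049 | 40010 | 80020 | 20 | 40001 | 80010 |
| 80024 | 20058 | 40011 | 40011 | 40010 | 120049 | 40010 | 80020 | 20 | 40001 | 80010 |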
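To relate the raw counters of this run to per-instruction figures, each counter can be divided by the number of measured ADR instructions (8 × 1000 × 10). The sketch below does this for the most common sample row. It is only a rough normalisation: loop overhead is not subtracted, and the dictionary layout and variable names are illustrative rather than part of the original tooling.

```python
# Most common sample row from the 1000-unroll, 10-iteration run
# (a subset of the counters listed above).
row = {
    "retire uop (01)": 80024,
    "cycle (02)": 20058,
    "schedule uop (52)": 40011,
    "map int uop (7c)": 80020,
}

# Measured ADR instructions: count * unrolls * iterations.
INSTRUCTIONS = 8 * 1000 * 10

# Normalise every counter to a per-instruction value.
per_instruction = {name: value / INSTRUCTIONS for name, value in row.items()}

for name, value in per_instruction.items():
    print(f"{name}: {value:.3f}")

# Sustained throughput in ADR instructions per cycle.
instructions_per_cycle = INSTRUCTIONS / row["cycle (02)"]
print(f"instructions per cycle: {instructions_per_cycle:.2f}")
```

Under these assumptions the run retires about one uop per ADR, schedules about half a uop per ADR (in line with the "Issues: 0.500" figure from Test 1), and sustains close to four ADR instructions per cycle.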