Apple Microarchitecture Research by Dougall Johnson
Code:
str q0, [x6], #0x10
(no loop instructions)
Retires: 1.000
Issues: 2.000
Integer unit issues: 1.001
Load/store unit issues: 1.000
SIMD/FP unit issues: 0.000
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map ldst uop (7d) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) |
1005 | 1410 | 2059 | 1041 | 1018 | 1040 | 1000 | 4801 | 18205 | 2000 | 1000 | 2000 | 1001 | 1000 |
1004 | 1127 | 2001 | 1001 | 1000 | 1000 | 1000 | 4661 | 18451 | 2000 | 1000 | 2000 | 1001 | 1000 |
1004 | 1123 | 2001 | 1001 | 1000 | 1000 | 1000 | 4813 | 18505 | 2000 | 1000 | 2000 | 1001 | 1000 |
1004 | 1186 | 2001 | 1001 | 1000 | 1000 | 1000 | 4817 | 19441 | 2000 | 1000 | 2000 | 1001 | 1000 |
1004 | 1120 | 2001 | 1001 | 1000 | 1000 | 1000 | 4813 | 19315 | 2000 | 1000 | 2000 | 1001 | 1000 |
1004 | 1118 | 2001 | 1001 | 1000 | 1000 | 1000 | 4813 | 18379 | 2000 | 1000 | 2000 | 1001 | 1000 |
1004 | 1150 | 2001 | 1001 | 1000 | 1000 | 1000 | 4817 | 19477 | 2000 | 1000 | 2000 | 1001 | 1000 |
1004 | 1133 | 2001 | 1001 | 1000 | 1000 | 1000 | 4813 | 18595 | 2000 | 1000 | 2000 | 1001 | 1000 |
1004 | 1142 | 2001 | 1001 | 1000 | 1000 | 1000 | 4817 | 19261 | 2000 | 1000 | 2000 | 1001 | 1000 |
1004 | 1118 | 2001 | 1001 | 1000 | 1000 | 1000 | 4817 | 18721 | 2000 | 1000 | 2000 | 1001 | 1000 |
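The summary figures listed before this table (Retires, Issues, and the per-unit issue counts) appear to be the raw counters scaled down to a single instruction. As a rough sketch only, the Python below parses the table, takes the median of each column across the ten runs, and divides by `ASSUMED_COPIES = 1000`. That divisor is an assumption inferred from the counter magnitudes, not a value stated on this page, and the helper names `parse_counter_table` and `per_instruction` are hypothetical. No harness overhead is subtracted, so the sketch gives 1.004 for retires and 2.001 for scheduled uops where the summary reports 1.000 and 2.000.

```python
from statistics import median

# Counter table copied from above: header row, then one row per run.
TABLE = """\
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map ldst uop (7d) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) |
1005 | 1410 | 2059 | 1041 | 1018 | 1040 | 1000 | 4801 | 18205 | 2000 | 1000 | 2000 | 1001 | 1000 |
1004 | 1127 | 2001 | 1001 | 1000 | 1000 | 1000 | 4661 | 18451 | 2000 | 1000 | 2000 | 1001 | 1000 |
1004 | 1123 | 2001 | 1001 | 1000 | 1000 | 1000 | 4813 | 18505 | 2000 | 1000 | 2000 | 1001 | 1000 |
1004 | 1186 | 2001 | 1001 | 1000 | 1000 | 1000 | 4817 | 19441 | 2000 | 1000 | 2000 | 1001 | 1000 |
1004 | 1120 | 2001 | 1001 | 1000 | 1000 | 1000 | 4813 | 19315 | 2000 | 1000 | 2000 | 1001 | 1000 |
1004 | 1118 | 2001 | 1001 | 1000 | 1000 | 1000 | 4813 | 18379 | 2000 | 1000 | 2000 | 1001 | 1000 |
1004 | 1150 | 2001 | 1001 | 1000 | 1000 | 1000 | 4817 | 19477 | 2000 | 1000 | 2000 | 1001 | 1000 |
1004 | 1133 | 2001 | 1001 | 1000 | 1000 | 1000 | 4813 | 18595 | 2000 | 1000 | 2000 | 1001 | 1000 |
1004 | 1142 | 2001 | 1001 | 1000 | 1000 | 1000 | 4817 | 19261 | 2000 | 1000 | 2000 | 1001 | 1000 |
1004 | 1118 | 2001 | 1001 | 1000 | 1000 | 1000 | 4817 | 18721 | 2000 | 1000 | 2000 | 1001 | 1000 |
"""

# Assumption: the measured block holds 1000 copies of the instruction.
ASSUMED_COPIES = 1000


def parse_counter_table(text):
    """Return {column name: [value per run]} for a pipe-delimited counter table."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    # Each row ends with a trailing "|", so drop the empty cell it produces.
    names = [cell.strip() for cell in lines[0].split("|") if cell.strip()]
    columns = {name: [] for name in names}
    for ln in lines[1:]:
        cells = [cell.strip() for cell in ln.split("|") if cell.strip()]
        for name, cell in zip(names, cells):
            columns[name].append(int(cell))
    return columns


def per_instruction(columns, copies):
    """Median of each counter across runs, divided by the number of copies."""
    return {name: median(values) / copies for name, values in columns.items()}


rates = per_instruction(parse_counter_table(TABLE), ASSUMED_COPIES)
for name, rate in rates.items():
    print(f"{name}: {rate:.3f}")
```

Taking the median rather than the mean keeps the first run, which is visibly slower in every table here, from skewing the result.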
Code:
str q0, [x6], #0x10
(fused SUBS/B.cc loop)
Result (median cycles for code): 1.1427
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
10214 | 15316 | 20693 | 10513 | 10180 | 10512 | 10002 | 44331 | 195130 | 20108 | 200 | 10010 | 200 | 20020 | 10003 | 10000 | 100 |
10204 | 11423 | 20104 | 10104 | 10000 | 10104 | 10001 | 43471 | 194402 | 20105 | 200 | 10008 | 200 | 20016 | 10004 | 10000 | 100 |
10204 | 11430 | 20104 | 10104 | 10000 | 10104 | 10001 | 43491 | 194780 | 20105 | 200 | 10008 | 200 | 20016 | 10004 | 10000 | 100 |
10204 | 11441 | 20104 | 10104 | 10000 | 10104 | 10001 | 43493 | 195014 | 20105 | 200 | 10008 | 200 | 20016 | 10004 | 10000 | 100 |
10204 | 11425 | 20104 | 10104 | 10000 | 10104 | 10001 | 43492 | 194294 | 20105 | 200 | 10008 | 200 | 20016 | 10004 | 10000 | 100 |
10204 | 11423 | 20104 | 10104 | 10000 | 10104 | 10001 | 43487 | 194582 | 20105 | 200 | 10008 | 200 | 20016 | 10004 | 10000 | 100 |
10204 | 11433 | 20104 | 10104 | 10000 | 10104 | 10001 | 43472 | 194456 | 20105 | 200 | 10008 | 200 | 20016 | 10004 | 10000 | 100 |
10204 | 11440 | 20104 | 10104 | 10000 | 10104 | 10001 | 43497 | 194798 | 20105 | 200 | 10008 | 200 | 20016 | 10004 | 10000 | 100 |
10204 | 11434 | 20104 | 10104 | 10000 | 10104 | 10001 | 43491 | 194978 | 20105 | 200 | 10008 | 200 | 20016 | 10004 | 10000 | 100 |
10204 | 11413 | 20104 | 10104 | 10000 | 10104 | 10001 | 43489 | 194798 | 20105 | 200 | 10008 | 200 | 20016 | 10004 | 10000 | 100 |
Result (median cycles for code): 1.1378
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
10034 | 15404 | 20611 | 10431 | 10180 | 10432 | 10002 | 44653 | 194050 | 20016 | 20 | 10008 | 20 | 20000 | 10001 | 10000 | 10 |
10024 | 11400 | 20011 | 10011 | 10000 | 10010 | 10000 | 43125 | 193989 | 20010 | 20 | 10000 | 20 | 20000 | 10001 | 10000 | 10 |
10024 | 11380 | 20011 | 10011 | 10000 | 10010 | 10000 | 43116 | 194563 | 20010 | 20 | 10000 | 20 | 20000 | 10001 | 10000 | 10 |
10024 | 11368 | 20011 | 10011 | 10000 | 10010 | 10000 | 43123 | 194023 | 20010 | 20 | 10000 | 20 | 20000 | 10001 | 10000 | 10 |
10024 | 11373 | 20011 | 10011 | 10000 | 10010 | 10000 | 43124 | 193789 | 20010 | 20 | 10000 | 20 | 20000 | 10001 | 10000 | 10 |
10024 | 11369 | 20011 | 10011 | 10000 | 10010 | 10000 | 43124 | 193879 | 20010 | 20 | 10000 | 20 | 20000 | 10001 | 10000 | 10 |
10024 | 11371 | 20011 | 10011 | 10000 | 10010 | 10000 | 43124 | 193645 | 20010 | 20 | 10000 | 20 | 20000 | 10001 | 10000 | 10 |
10024 | 11362 | 20011 | 10011 | 10000 | 10010 | 10000 | 43122 | 194563 | 20010 | 20 | 10000 | 20 | 20000 | 10001 | 10000 | 10 |
10024 | 11372 | 20011 | 10011 | 10000 | 10010 | 10000 | 43124 | 194095 | 20010 | 20 | 10000 | 20 | 20000 | 10001 | 10000 | 10 |
10024 | 11381 | 20011 | 10011 | 10000 | 10010 | 10000 | 43124 | 193699 | 20010 | 20 | 10000 | 20 | 20000 | 10001 | 10000 | 10 |
Count: 8
Code:
str q0, [x6], #0x10
str q0, [x7], #0x10
str q0, [x8], #0x10
str q0, [x9], #0x10
str q0, [x10], #0x10
str q0, [x11], #0x10
str q0, [x12], #0x10
str q0, [x13], #0x10

mov x7, x6
mov x8, x6
mov x9, x6
mov x10, x6
mov x11, x6
mov x12, x6
mov x13, x6
(fused SUBS/B.cc loop)
Result (median cycles for code divided by count): 1.0011
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
80214 | 82451 | 160692 | 80512 | 80180 | 80511 | 80002 | 240312 | 1360162 | 160106 | 200 | 80008 | 200 | 160016 | 80005 | 80000 | 100 |
80204 | 80056 | 160105 | 80105 | 80000 | 80104 | 80002 | 240312 | 1360051 | 160106 | 200 | 80008 | 200 | 160016 | 80005 | 80000 | 100 |
80204 | 80048 | 160105 | 80105 | 80000 | 80104 | 80002 | 240312 | 1360051 | 160106 | 200 | 80008 | 200 | 160016 | 80005 | 80000 | 100 |
80204 | 80048 | 160105 | 80105 | 80000 | 80104 | 80002 | 240312 | 1360051 | 160106 | 200 | 80008 | 200 | 160096 | 80037 | 80000 | 100 |
80204 | 80063 | 160105 | 80105 | 80000 | 80104 | 80002 | 240312 | 1359979 | 160106 | 200 | 80008 | 200 | 160016 | 80005 | 80000 | 100 |
80204 | 80048 | 160105 | 80105 | 80000 | 80104 | 80002 | 240312 | 1360051 | 160106 | 200 | 80008 | 200 | 160016 | 80005 | 80000 | 100 |
80205 | 80103 | 160154 | 80137 | 80017 | 80143 | 80002 | 240312 | 1360051 | 160106 | 200 | 80008 | 200 | 160096 | 80037 | 80000 | 100 |
80204 | 80056 | 160105 | 80105 | 80000 | 80104 | 80002 | 240312 | 1362879 | 160106 | 200 | 80008 | 200 | 160016 | 80005 | 80000 | 100 |
80204 | 80048 | 160105 | 80105 | 80000 | 80104 | 80002 | 240312 | 1360002 | 160106 | 200 | 80008 | 200 | 160016 | 80005 | 80000 | 100 |
80204 | 80048 | 160105 | 80105 | 80000 | 80104 | 80002 | 240312 | 1360051 | 160106 | 200 | 80008 | 200 | 160016 | 80005 | 80000 | 100 |
Result (median cycles for code divided by count): 1.0012
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
80034 | 82262 | 160612 | 80432 | 80180 | 80432 | 80002 | 240042 | 1360147 | 160016 | 20 | 80008 | 20 | 160000 | 80001 | 80000 | 10 |
80024 | 80056 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 1360205 | 160010 | 20 | 80000 | 20 | 160016 | 80005 | 80000 | 10 |
80024 | 80056 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 1360205 | 160010 | 20 | 80000 | 20 | 160000 | 80001 | 80000 | 10 |
80024 | 80056 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 1360205 | 160010 | 20 | 80000 | 20 | 160000 | 80001 | 80000 | 10 |
80024 | 80056 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 1360205 | 160010 | 20 | 80000 | 20 | 160000 | 80001 | 80000 | 10 |
80024 | 80056 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 1360205 | 160010 | 20 | 80000 | 20 | 160000 | 80001 | 80000 | 10 |
80024 | 80056 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 1360205 | 160010 | 20 | 80000 | 20 | 160000 | 80001 | 80000 | 10 |
80024 | 80056 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 1360187 | 160010 | 20 | 80000 | 20 | 160000 | 80001 | 80000 | 10 |
80024 | 80056 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 1360205 | 160010 | 20 | 80000 | 20 | 160000 | 80001 | 80000 | 10 |
80024 | 80056 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 1360205 | 160010 | 20 | 80000 | 20 | 160000 | 80001 | 80000 | 10 |
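To put the single-register and eight-register loops side by side, the sketch below recomputes cycles per store from the "cycle (02)" columns of the four fused-loop tables. It assumes the number of stores executed equals the "? ldst retires (ed)" value in each table (10000 and 80000); that reading of the counter is an assumption, and the dictionary labels are made up for illustration. The values it produces land close to the reported results (1.1427, 1.1378, 1.0011, 1.0012) without matching them exactly, so the page's own figures presumably come from a different set of runs or a slightly different calculation.

```python
from statistics import median

# "cycle (02)" columns copied from the four fused-loop tables above.
CYCLES = {
    "1 register, first table": [15316, 11423, 11430, 11441, 11425,
                                11423, 11433, 11440, 11434, 11413],
    "1 register, second table": [15404, 11400, 11380, 11368, 11373,
                                 11369, 11371, 11362, 11372, 11381],
    "8 registers, first table": [82451, 80056, 80048, 80048, 80063,
                                 80048, 80103, 80056, 80048, 80048],
    "8 registers, second table": [82262, 80056, 80056, 80056, 80056,
                                  80056, 80056, 80056, 80056, 80056],
}

# Assumption: stores executed per run, read off the "? ldst retires (ed)" column.
STORES = {
    "1 register, first table": 10000,
    "1 register, second table": 10000,
    "8 registers, first table": 80000,
    "8 registers, second table": 80000,
}

cycles_per_store = {
    label: median(samples) / STORES[label] for label, samples in CYCLES.items()
}

for label, value in cycles_per_store.items():
    print(f"{label}: {value:.4f} cycles per store")
```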