Apple Microarchitecture Research by Dougall Johnson
Code:
stadd x0, [x6]
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
mov x0, 0
(no loop instructions)
Retires (minus 70 nops): 3.000
Issues: 3.002
Integer unit issues: 1.003
Load/store unit issues: 2.000
SIMD/FP unit issues: 0.000
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
73005 | 34634 | 3018 | 1014 | 2004 | 1002 | 2000 | 7769 | 10520 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34260 | 3003 | 1003 | 2000 | 1000 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34278 | 3003 | 1003 | 2000 | 1000 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34233 | 3003 | 1003 | 2000 | 1000 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2002 | 4004 | 1003 | 2000 | 1000 |
73004 | 34622 | 3003 | 1003 | 2000 | 1000 | 2000 | 7791 | 10544 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34204 | 3003 | 1003 | 2000 | 1000 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34206 | 3003 | 1003 | 2000 | 1000 | 2000 | 7782 | 10533 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34181 | 3003 | 1003 | 2000 | 1000 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34219 | 3003 | 1003 | 2000 | 1000 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2000 | 4000 | 1003 | 2000 | 1000 |
73004 | 34324 | 3003 | 1003 | 2000 | 1000 | 2000 | 7770 | 10521 | 3000 | 1000 | 2000 | 2000 | 4000 | 1004 | 2000 | 1000 |
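The per-unit issue figures above appear to follow from the counter table by dividing a typical sample by the number of times the code ran. The sketch below illustrates that arithmetic for two columns. The repetition count of 1000 is an assumption inferred from the counter magnitudes (for example, `dispatch uop (78)` reads 3000), and using the lower median is also an assumption; the page does not state its exact method.

```python
from statistics import median_low

# Assumption: the code ran 1000 times per sample (inferred, not stated on the page).
ITERATIONS = 1000

# "schedule int uop (53)" and "schedule ldst uop (55)" columns, copied from the table.
schedule_int = [1014, 1003, 1003, 1003, 1003, 1003, 1003, 1003, 1003, 1003]
schedule_ldst = [2004, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000]


def per_iteration(samples, iterations=ITERATIONS):
    """Return the lower-median sample divided by the assumed iteration count."""
    return median_low(samples) / iterations


int_issues = per_iteration(schedule_int)
ldst_issues = per_iteration(schedule_ldst)

print(f"Integer unit issues:    {int_issues:.3f}")
print(f"Load/store unit issues: {ldst_issues:.3f}")
```

Under these assumptions the two values come out as 1.003 and 2.000, matching the summary lines. The "Retires" and "Issues" figures seem to involve an additional baseline correction that is not reproduced here.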
Code:
stadd x0, [x6]
add x6, x6, 8
(fused SUBS/B.cc loop)
Result (median cycles for code): 3.0056
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
40209 | 30867 | 40499 | 20400 | 20099 | 20261 | 20007 | 116608 | 107574 | 40114 | 20207 | 20007 | 30206 | 40008 | 20008 | 20000 | 20100 |
40204 | 30063 | 40115 | 20110 | 20005 | 20107 | 20004 | 116585 | 107421 | 40108 | 20204 | 20004 | 30206 | 40008 | 20005 | 20000 | 20100 |
40204 | 30056 | 40107 | 20105 | 20002 | 20104 | 20004 | 116587 | 107425 | 40108 | 20204 | 20004 | 30206 | 40008 | 20005 | 20000 | 20100 |
40204 | 30056 | 40107 | 20105 | 20002 | 20104 | 20004 | 116594 | 107449 | 40108 | 20204 | 20004 | 30206 | 40008 | 20005 | 20000 | 20100 |
40204 | 30056 | 40107 | 20105 | 20002 | 20104 | 20004 | 116593 | 107439 | 40108 | 20204 | 20004 | 30206 | 40008 | 20006 | 20000 | 20100 |
40204 | 30056 | 40107 | 20105 | 20002 | 20104 | 20004 | 116575 | 107405 | 40108 | 20204 | 20004 | 30206 | 40008 | 20005 | 20000 | 20100 |
40204 | 30056 | 40107 | 20105 | 20002 | 20104 | 20004 | 116583 | 107411 | 40108 | 20204 | 20004 | 30206 | 40008 | 20008 | 20000 | 20100 |
40204 | 30058 | 40112 | 20110 | 20002 | 20104 | 20004 | 116597 | 107441 | 40108 | 20204 | 20004 | 30206 | 40008 | 20005 | 20000 | 20100 |
40204 | 30056 | 40107 | 20105 | 20002 | 20104 | 20004 | 116582 | 107419 | 40108 | 20204 | 20004 | 30206 | 40008 | 20005 | 20000 | 20100 |
40204 | 30056 | 40107 | 20105 | 20002 | 20104 | 20004 | 116592 | 107429 | 40108 | 20204 | 20004 | 30206 | 40008 | 20005 | 20000 | 20100 |
Result (median cycles for code): 3.0063
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
40029 | 30903 | 40419 | 20320 | 20099 | 20164 | 20004 | 116486 | 107495 | 40018 | 20024 | 20004 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30063 | 40017 | 20017 | 20000 | 20010 | 20000 | 116477 | 107491 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30063 | 40017 | 20017 | 20000 | 20010 | 20000 | 116497 | 107531 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30063 | 40017 | 20017 | 20000 | 20010 | 20000 | 116502 | 107541 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30063 | 40017 | 20017 | 20000 | 20010 | 20000 | 116483 | 107503 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30063 | 40017 | 20017 | 20000 | 20010 | 20000 | 116488 | 107513 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30063 | 40017 | 20017 | 20000 | 20010 | 20000 | 116472 | 107481 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30063 | 40017 | 20017 | 20000 | 20010 | 20000 | 116477 | 107491 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30063 | 40017 | 20017 | 20000 | 20010 | 20036 | 113689 | 111919 | 40082 | 20056 | 20036 | 30020 | 40000 | 20007 | 20000 | 20010 |
40024 | 30063 | 40017 | 20017 | 20000 | 20010 | 20000 | 116480 | 107497 | 40010 | 20020 | 20000 | 30020 | 40000 | 20007 | 20000 | 20010 |
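The two "median cycles" results for this code can be recovered from the `cycle (02)` columns of the two tables. The following is a minimal sketch of that calculation, assuming each sample executes 10000 copies of the code (an inference from the counter magnitudes, not something the page states) and that the lower median is used.

```python
from statistics import median_low

# Assumption: each sample covers 10000 copies of the two-instruction code.
COPIES = 10_000

# "cycle (02)" column from the first and second tables for this code.
cycles_run_a = [30867, 30063, 30056, 30056, 30056, 30056, 30056, 30058, 30056, 30056]
cycles_run_b = [30903, 30063, 30063, 30063, 30063, 30063, 30063, 30063, 30063, 30063]


def cycles_per_copy(samples, copies=COPIES):
    """Lower-median cycle count divided by the assumed number of code copies."""
    return median_low(samples) / copies


result_a = cycles_per_copy(cycles_run_a)
result_b = cycles_per_copy(cycles_run_b)

print(f"run A: {result_a:.4f}")
print(f"run B: {result_b:.4f}")
```

With these inputs the sketch yields 3.0056 and 3.0063, the same figures as the two "Result" lines. The first sample in each run is noticeably higher, which is why a median rather than a mean is the natural summary here.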
Code:
stadd x0, [x6]
mov x7, 8
(fused SUBS/B.cc loop)
Result (median cycles for code): 12.6469
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
30205 | 123725 | 40879 | 20858 | 20021 | 10117 | 20000 | 2450048 | 2311933 | 30100 | 10200 | 20000 | 20200 | 40000 | 21270 | 20000 | 10100 |
30204 | 128490 | 41307 | 21295 | 20012 | 10115 | 20000 | 2382433 | 2252308 | 30100 | 10200 | 20000 | 20200 | 40000 | 21101 | 20000 | 10100 |
30204 | 126469 | 41201 | 21201 | 20000 | 10100 | 20000 | 2382436 | 2252308 | 30100 | 10200 | 20000 | 20200 | 40000 | 21101 | 20000 | 10100 |
30204 | 126469 | 41201 | 21201 | 20000 | 10100 | 20000 | 2382436 | 2252308 | 30100 | 10200 | 20000 | 20200 | 40000 | 21101 | 20000 | 10100 |
30204 | 126469 | 41201 | 21201 | 20000 | 10100 | 20000 | 2382436 | 2252308 | 30100 | 10200 | 20000 | 20200 | 40000 | 21101 | 20000 | 10100 |
30204 | 126469 | 41201 | 21201 | 20000 | 10100 | 20000 | 2382436 | 2252308 | 30100 | 10200 | 20000 | 20200 | 40000 | 21101 | 20000 | 10100 |
30204 | 126418 | 41196 | 21196 | 20000 | 10100 | 20000 | 2382436 | 2252308 | 30100 | 10200 | 20000 | 20200 | 40000 | 21101 | 20000 | 10100 |
30204 | 126469 | 41201 | 21201 | 20000 | 10100 | 20000 | 2382436 | 2252308 | 30100 | 10200 | 20000 | 20200 | 40000 | 21101 | 20000 | 10100 |
30204 | 126469 | 41201 | 21201 | 20000 | 10100 | 20000 | 2382436 | 2252308 | 30100 | 10200 | 20000 | 20200 | 40000 | 21101 | 20000 | 10100 |
30204 | 126469 | 41201 | 21201 | 20000 | 10100 | 20000 | 2382436 | 2252308 | 30100 | 10200 | 20000 | 20200 | 40000 | 21101 | 20000 | 10100 |
Result (median cycles for code): 12.6476
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
30025 | 138094 | 41791 | 21756 | 20035 | 10039 | 20000 | 2456245 | 2313083 | 30010 | 10020 | 20000 | 20020 | 40000 | 21267 | 20000 | 10010 |
30024 | 129761 | 41283 | 21283 | 20000 | 10010 | 20000 | 2456143 | 2312957 | 30010 | 10020 | 20000 | 20020 | 40000 | 21272 | 20000 | 10010 |
30024 | 129761 | 41283 | 21283 | 20000 | 10010 | 20000 | 2456143 | 2312957 | 30010 | 10020 | 20000 | 20020 | 40000 | 21272 | 20000 | 10010 |
30024 | 129811 | 41277 | 21277 | 20000 | 10010 | 20000 | 2456143 | 2312957 | 30010 | 10020 | 20000 | 20020 | 40000 | 21272 | 20000 | 10010 |
30025 | 129853 | 41338 | 21295 | 20043 | 10038 | 20000 | 2456250 | 2312621 | 30010 | 10020 | 20000 | 20020 | 40000 | 20366 | 20000 | 10010 |
30024 | 126467 | 41111 | 21111 | 20000 | 10010 | 20000 | 2388716 | 2253508 | 30010 | 10020 | 20000 | 20020 | 40000 | 21101 | 20000 | 10010 |
30024 | 126476 | 41112 | 21112 | 20000 | 10010 | 20000 | 2388716 | 2253508 | 30010 | 10020 | 20000 | 20020 | 40000 | 21101 | 20000 | 10010 |
30024 | 126476 | 41112 | 21112 | 20000 | 10010 | 20000 | 2388716 | 2253508 | 30010 | 10020 | 20000 | 20020 | 40000 | 21101 | 20000 | 10010 |
30024 | 126476 | 41112 | 21112 | 20000 | 10010 | 20000 | 2388716 | 2253508 | 30010 | 10020 | 20000 | 20020 | 40000 | 21101 | 20000 | 10010 |
30025 | 125220 | 40893 | 20847 | 20046 | 10037 | 20000 | 2388716 | 2253508 | 30010 | 10020 | 20000 | 20020 | 40000 | 21101 | 20000 | 10010 |
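For comparison, the sketch below recomputes the two results for the variant that leaves `x6` unchanged and relates them to the 3.0056-cycle figure reported for the variant that advances `x6` by 8. As before, 10000 copies per sample is an assumption, and so is the choice of the lower median, which happens to reproduce the reported values. The ordinary median of the second run would land between two clusters of samples.

```python
from statistics import median, median_low

# Assumption: each sample covers 10000 copies of the code.
COPIES = 10_000

# "cycle (02)" column from the two tables for the fixed-address variant.
cycles_run_a = [123725, 128490, 126469, 126469, 126469,
                126469, 126418, 126469, 126469, 126469]
cycles_run_b = [138094, 129761, 129761, 129811, 129853,
                126467, 126476, 126476, 126476, 125220]

same_address_a = median_low(cycles_run_a) / COPIES
same_address_b = median_low(cycles_run_b) / COPIES

# The plain median averages the two middle samples, which straddle two clusters.
plain_median_b = median(cycles_run_b)

# Reported result for the variant that increments x6 between STADDs.
advancing_address = 3.0056
ratio = same_address_a / advancing_address

print(f"same address, run A: {same_address_a:.4f}")
print(f"same address, run B: {same_address_b:.4f}")
print(f"plain median of run B cycles: {plain_median_b}")
print(f"slowdown vs advancing address: {ratio:.2f}x")
```

On these numbers, repeating `stadd` to one address costs roughly 4.2 times as many cycles per copy as stepping the address by 8 each time.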