Apple Microarchitecture Research by Dougall Johnson
Code:
swpl x0, x1, [x6] nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
mov x0, 0
(no loop instructions)
Retires (minus 70 nops): 2.000
Issues: 2.000
Integer unit issues: 0.001
Load/store unit issues: 2.000
SIMD/FP unit issues: 0.000
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch ldst uop (58) | simd uops in schedulers (5a) | dispatch uop (78) | map ldst uop (7d) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) |
72005 | 34781 | 2005 | 1 | 2004 | 2000 | 11767 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34850 | 2001 | 1 | 2000 | 2000 | 11767 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34247 | 2001 | 1 | 2000 | 2000 | 11767 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34420 | 2001 | 1 | 2000 | 2000 | 11767 | 2000 | 2000 | 4000 | 1 | 2000 |
72005 | 34678 | 2003 | 1 | 2002 | 2000 | 11767 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34548 | 2001 | 1 | 2000 | 2000 | 11767 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34673 | 2001 | 1 | 2000 | 2000 | 11767 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34319 | 2001 | 1 | 2000 | 2000 | 11767 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34582 | 2001 | 1 | 2000 | 2000 | 11767 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 35593 | 2001 | 1 | 2000 | 2000 | 11767 | 2000 | 2000 | 4000 | 1 | 2000 |
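The summary figures (retires, issues, per-unit issues) can be approximately recovered from the raw counter rows. The following is a minimal sketch, not the tool that produced this page. It assumes each sample row covers 1000 copies of the test code, which is inferred from the retire counts (about 72 uops per copy) rather than stated anywhere above. The names `COPIES`, `NOPS_PER_COPY` and `per_copy` are hypothetical.

```python
from statistics import median

# Assumption: each sample row covers 1000 copies of the test code,
# inferred from "retire uop" being close to 72 x 1000 in every row.
COPIES = 1000
NOPS_PER_COPY = 70  # from "Retires (minus 70 nops)"

# (retire uop, schedule uop, schedule int uop, schedule ldst uop)
# copied from the ten counter rows above.
samples = [
    (72005, 2005, 1, 2004),
    (72004, 2001, 1, 2000),
    (72004, 2001, 1, 2000),
    (72004, 2001, 1, 2000),
    (72005, 2003, 1, 2002),
    (72004, 2001, 1, 2000),
    (72004, 2001, 1, 2000),
    (72004, 2001, 1, 2000),
    (72004, 2001, 1, 2000),
    (72004, 2001, 1, 2000),
]

def per_copy(rows, copies=COPIES, nops=NOPS_PER_COPY):
    """Median of each counter, scaled to one copy of the test code."""
    retire, sched, sched_int, sched_ldst = (median(col) for col in zip(*rows))
    return {
        "retires_minus_nops": retire / copies - nops,
        "issues": sched / copies,
        "int_issues": sched_int / copies,
        "ldst_issues": sched_ldst / copies,
    }

summary = per_copy(samples)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Under that assumption the load/store and integer figures match the summary exactly (2.000 and 0.001), while retires and total issues come out a few thousandths above 2.000. That gap suggests the published summary also subtracts a small fixed measurement overhead, though this is a guess.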
Code:
swpl x0, x1, [x6]
add x6, x6, 8
(fused SUBS/B.cc loop)
Result (median cycles for code): 6.0057
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
30209 | 60706 | 30258 | 10154 | 20104 | 10155 | 20004 | 32906 | 133778 | 30106 | 10202 | 20004 | 10202 | 40008 | 10001 | 20000 | 10100 |
30205 | 60100 | 30162 | 10122 | 20040 | 10122 | 20002 | 32870 | 133659 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60054 | 30101 | 10101 | 20000 | 10101 | 20002 | 32868 | 133809 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60054 | 30101 | 10101 | 20000 | 10101 | 20002 | 32871 | 133543 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60054 | 30101 | 10101 | 20000 | 10101 | 20002 | 32871 | 133631 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60054 | 30101 | 10101 | 20000 | 10101 | 20002 | 32869 | 133773 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60054 | 30101 | 10101 | 20000 | 10101 | 20002 | 32871 | 133742 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60054 | 30101 | 10101 | 20000 | 10101 | 20002 | 32870 | 133710 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60054 | 30101 | 10101 | 20000 | 10101 | 20002 | 32871 | 133677 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60054 | 30101 | 10101 | 20000 | 10101 | 20002 | 32871 | 133741 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
Result (median cycles for code): 6.0064
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
30029 | 60568 | 30151 | 10061 | 20090 | 10061 | 20002 | 32633 | 134047 | 30013 | 10021 | 20003 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60061 | 30011 | 10011 | 20000 | 10010 | 20000 | 32630 | 134050 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60061 | 30011 | 10011 | 20000 | 10010 | 20000 | 32630 | 134060 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60061 | 30011 | 10011 | 20000 | 10010 | 20000 | 32630 | 134036 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60061 | 30011 | 10011 | 20000 | 10010 | 20000 | 32630 | 134062 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60061 | 30011 | 10011 | 20000 | 10010 | 20000 | 32630 | 134046 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60086 | 30013 | 10011 | 20002 | 10012 | 20000 | 32626 | 134232 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60064 | 30011 | 10011 | 20000 | 10010 | 20000 | 32626 | 134089 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60064 | 30011 | 10011 | 20000 | 10010 | 20000 | 32626 | 134103 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60064 | 30011 | 10011 | 20000 | 10010 | 20000 | 32626 | 134067 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
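As a cross-check on the two reported results (6.0057 and 6.0064), the cycle column of each table can be reduced to cycles per copy of the `swpl` / `add` pair. This is a sketch under one assumption: each sample covers 10000 copies of the code, inferred from `retire uop` being close to 3 × 10000 in every row. `COPIES` and `cycles_per_copy` are hypothetical names.

```python
from statistics import median

# Assumption: each sample covers 10000 copies of the two-instruction body,
# inferred from "retire uop" being close to 3 x 10000 in every row.
COPIES = 10_000

# "cycle (02)" column of the first and second counter tables for this code.
cycles_first_table = [60706, 60100, 60054, 60054, 60054,
                      60054, 60054, 60054, 60054, 60054]
cycles_second_table = [60568, 60061, 60061, 60061, 60061,
                       60061, 60086, 60064, 60064, 60064]

def cycles_per_copy(cycle_samples, copies=COPIES):
    """Median cycle count across samples, divided by the number of copies."""
    return median(cycle_samples) / copies

first = cycles_per_copy(cycles_first_table)
second = cycles_per_copy(cycles_second_table)
print(f"first table:  {first:.4f} cycles per copy")
print(f"second table: {second:.4f} cycles per copy")
```

Both values land within a few ten-thousandths of the reported medians without reproducing them exactly, so the reported figures may come from a separate timing run rather than from these counter samples.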
Code:
swpl x0, x1, [x6]
mov x7, 8
(fused SUBS/B.cc loop)
Result (median cycles for code): 9.8242
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
20205 | 100706 | 21128 | 102 | 21026 | 101 | 21184 | 504 | 1768022 | 21285 | 214 | 23744 | 206 | 49396 | 1 | 20000 | 100 |
20204 | 99398 | 20716 | 101 | 20615 | 100 | 20317 | 500 | 1741853 | 20417 | 200 | 21090 | 292 | 49948 | 1 | 20000 | 100 |
20204 | 100368 | 21386 | 138 | 21248 | 137 | 21211 | 691 | 1767167 | 21351 | 312 | 23862 | 334 | 48020 | 1 | 20000 | 100 |
20204 | 99745 | 20717 | 117 | 20600 | 116 | 20150 | 505 | 1731137 | 20251 | 204 | 20370 | 298 | 42736 | 1 | 20000 | 100 |
20204 | 99571 | 20573 | 121 | 20452 | 120 | 20046 | 500 | 1740424 | 20146 | 200 | 20130 | 202 | 41544 | 1 | 20000 | 100 |
20205 | 99090 | 20573 | 101 | 20472 | 100 | 20267 | 416 | 1734901 | 20367 | 200 | 20920 | 200 | 41872 | 1 | 20000 | 100 |
20204 | 98648 | 20397 | 101 | 20296 | 100 | 20612 | 500 | 1770022 | 20712 | 200 | 22020 | 200 | 41344 | 1 | 20000 | 100 |
20204 | 98596 | 20256 | 101 | 20155 | 100 | 20037 | 500 | 1731736 | 20137 | 200 | 20114 | 294 | 42024 | 1 | 20000 | 100 |
20204 | 98153 | 21120 | 101 | 21019 | 100 | 20797 | 448 | 1728838 | 20897 | 200 | 22402 | 200 | 40300 | 1 | 20000 | 100 |
20204 | 96266 | 20102 | 101 | 20001 | 100 | 20000 | 450 | 1719855 | 20100 | 200 | 20004 | 202 | 40184 | 1 | 20000 | 100 |
Result (median cycles for code): 10.0303
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
20025 | 100256 | 21510 | 12 | 21498 | 11 | 22026 | 51 | 1776196 | 22037 | 22 | 26114 | 22 | 49756 | 1 | 20000 | 10 |
20024 | 100190 | 20901 | 14 | 20887 | 13 | 21740 | 52 | 1774706 | 21751 | 22 | 25274 | 22 | 49628 | 1 | 20000 | 10 |
20024 | 100239 | 21152 | 13 | 21139 | 12 | 21547 | 49 | 1777843 | 21558 | 24 | 24952 | 22 | 48680 | 1 | 20000 | 10 |
20024 | 100064 | 20667 | 12 | 20655 | 11 | 21512 | 53 | 1774724 | 21523 | 22 | 24688 | 24 | 51820 | 1 | 20000 | 10 |
20024 | 100358 | 21270 | 12 | 21258 | 11 | 21602 | 56 | 1775812 | 21615 | 28 | 24898 | 22 | 47668 | 1 | 20000 | 10 |
20024 | 100213 | 21148 | 11 | 21137 | 10 | 21265 | 52 | 1781128 | 21276 | 22 | 23836 | 28 | 52000 | 1 | 20000 | 10 |
20024 | 101023 | 21673 | 17 | 21656 | 16 | 21843 | 51 | 1783948 | 21856 | 26 | 25708 | 30 | 49456 | 1 | 20000 | 10 |
20024 | 100735 | 21695 | 13 | 21682 | 12 | 21558 | 55 | 1780053 | 21571 | 30 | 24834 | 30 | 50140 | 1 | 20000 | 10 |
20024 | 100306 | 21333 | 12 | 21321 | 11 | 21639 | 77 | 1777942 | 21658 | 40 | 25136 | 40 | 50440 | 1 | 20000 | 10 |
20024 | 100570 | 21544 | 11 | 21533 | 10 | 21803 | 55 | 1776451 | 21816 | 30 | 25544 | 34 | 51392 | 1 | 20000 | 10 |
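The counter tables on this page share one layout: a header row of `name (code)` cells separated by `|`, followed by rows of integers. The following is a minimal sketch of a parser for that layout, for readers who want to analyse the numbers themselves. `parse_table` and `split_cells` are hypothetical helper names, and the two-character codes are kept as strings because their meaning is not specified here. The example input is the first three columns of the first two rows of the last table, truncated for brevity.

```python
import re

# A header cell looks like "schedule ldst uop (55)" or "? int retires (ef)".
CELL = re.compile(r"^(?P<name>.+?)\s*\((?P<code>[0-9a-f]{2})\)$")

def split_cells(line):
    """Split a pipe-delimited line, dropping the empty cell after the trailing '|'."""
    return [cell.strip() for cell in line.split("|") if cell.strip()]

def parse_table(header, rows):
    """Return (columns, records): columns as (name, code) pairs, records as dicts."""
    columns = []
    for cell in split_cells(header):
        match = CELL.match(cell)
        if match is None:
            raise ValueError(f"unrecognised header cell: {cell!r}")
        columns.append((match.group("name"), match.group("code")))
    records = []
    for row in rows:
        values = [int(v) for v in split_cells(row)]
        if len(values) != len(columns):
            raise ValueError("row width does not match header")
        records.append({name: v for (name, _), v in zip(columns, values)})
    return columns, records

# First three columns only, taken from the last table above.
header = "retire uop (01) | cycle (02) | schedule uop (52) |"
rows = [
    "20025 | 100256 | 21510 |",
    "20024 | 100190 | 20901 |",
]
columns, records = parse_table(header, rows)
print(columns)
print(records[0]["cycle"], records[1]["schedule uop"])
```

Keying each record by counter name rather than column position keeps the analysis readable when tables carry different counter sets, as the first table here does compared with the later ones.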