Apple Microarchitecture Research by Dougall Johnson
Code:
swp x0, x1, [x6]
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
mov x0, 0
(no loop instructions)
Retires (minus 70 nops): 2.000
Issues: 2.000
Integer unit issues: 0.001
Load/store unit issues: 2.000
SIMD/FP unit issues: 0.000
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch ldst uop (58) | simd uops in schedulers (5a) | dispatch uop (78) | map ldst uop (7d) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) |
72006 | 34529 | 2013 | 1 | 2012 | 2000 | 11770 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34155 | 2001 | 1 | 2000 | 2000 | 11770 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 35089 | 2001 | 1 | 2000 | 2000 | 11772 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34268 | 2001 | 1 | 2000 | 2000 | 11770 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34108 | 2001 | 1 | 2000 | 2000 | 11770 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34117 | 2001 | 1 | 2000 | 2000 | 11770 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34128 | 2001 | 1 | 2000 | 2000 | 11770 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34119 | 2001 | 1 | 2000 | 2000 | 11770 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34122 | 2001 | 1 | 2000 | 2000 | 11770 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34108 | 2001 | 1 | 2000 | 2000 | 11770 | 2000 | 2000 | 4000 | 1 | 2000 |
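The per-instruction figures listed before the table (retires, issues, integer and load/store unit issues) can be approximated from the raw counter columns. The following is a minimal sketch, not the author's tooling. It assumes the snippet was executed 1000 times per sample (inferred from roughly 72000 retired uops against 70 nops plus 2 uops per copy, not stated on the page), and the mapping from counters to summary lines is likewise an assumption. The names COPIES, NOPS_PER_COPY, and per_copy are hypothetical.

```python
from statistics import median

# Assumption: 1000 copies of the snippet per sample (inferred, not stated).
COPIES = 1000
NOPS_PER_COPY = 70

# Selected columns copied from the ten samples in the table above:
# (retire uop (01), schedule int uop (53), dispatch ldst uop (58), dispatch uop (78))
samples = [(72006, 1, 2000, 2000)] + [(72004, 1, 2000, 2000)] * 9

def per_copy(column):
    """Median counter value across samples, divided by the assumed copy count."""
    return median(column) / COPIES

retire, sched_int, disp_ldst, disp = zip(*samples)

summary = {
    # Assumed mapping of counters to the summary lines.
    "retires_minus_nops": per_copy(retire) - NOPS_PER_COPY,
    "issues": per_copy(disp),
    "int_issues": per_copy(sched_int),
    "ldst_issues": per_copy(disp_ldst),
}

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the issue counts come out at 2.000, 0.001, and 2.000, matching the listed figures. Retires come out at 2.004 rather than 2.000; the extra four uops per sample are presumably measurement overhead that the published figure subtracts.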
Code:
swp x0, x1, [x6]
add x6, x6, 8
(fused SUBS/B.cc loop)
Result (median cycles for code): 3.0062
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
30209 | 30910 | 30333 | 10188 | 20145 | 10188 | 20006 | 32918 | 128152 | 30109 | 10203 | 20007 | 10202 | 40012 | 10002 | 20000 | 10100 |
30204 | 30062 | 30105 | 10102 | 20003 | 10102 | 20005 | 32911 | 128240 | 30107 | 10202 | 20006 | 10202 | 40012 | 10002 | 20000 | 10100 |
30204 | 30062 | 30105 | 10102 | 20003 | 10102 | 20005 | 32911 | 128298 | 30107 | 10202 | 20006 | 10202 | 40012 | 10002 | 20000 | 10100 |
30204 | 30064 | 30105 | 10102 | 20003 | 10102 | 20005 | 32911 | 128252 | 30107 | 10202 | 20006 | 10202 | 40012 | 10002 | 20000 | 10100 |
30204 | 30062 | 30105 | 10102 | 20003 | 10102 | 20005 | 32911 | 128188 | 30107 | 10202 | 20006 | 10202 | 40012 | 10002 | 20000 | 10100 |
30204 | 30062 | 30105 | 10102 | 20003 | 10102 | 20005 | 32911 | 128290 | 30107 | 10202 | 20006 | 10202 | 40012 | 10002 | 20000 | 10100 |
30204 | 30062 | 30105 | 10102 | 20003 | 10102 | 20005 | 32911 | 128290 | 30107 | 10202 | 20006 | 10202 | 40012 | 10002 | 20000 | 10100 |
30204 | 30062 | 30105 | 10102 | 20003 | 10102 | 20005 | 32908 | 127990 | 30107 | 10202 | 20006 | 10202 | 40012 | 10002 | 20000 | 10100 |
30204 | 30062 | 30104 | 10102 | 20002 | 10102 | 20005 | 32911 | 128152 | 30107 | 10202 | 20006 | 10202 | 40012 | 10002 | 20000 | 10100 |
30204 | 30062 | 30105 | 10102 | 20003 | 10102 | 20005 | 32911 | 128224 | 30107 | 10202 | 20006 | 10202 | 40012 | 10002 | 20000 | 10100 |
Result (median cycles for code): 3.0055
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
30029 | 30715 | 30219 | 10089 | 20130 | 10089 | 20005 | 32642 | 129151 | 30017 | 10022 | 20006 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 30062 | 30011 | 10011 | 20000 | 10010 | 20000 | 32602 | 129152 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 30056 | 30011 | 10011 | 20000 | 10010 | 20000 | 32602 | 129042 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 30055 | 30011 | 10011 | 20000 | 10010 | 20000 | 32602 | 129098 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 30055 | 30011 | 10011 | 20000 | 10010 | 20000 | 32602 | 129018 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 30056 | 30011 | 10011 | 20000 | 10010 | 20000 | 32602 | 129143 | 30010 | 10020 | 20000 | 10023 | 40013 | 10003 | 20000 | 10010 |
30024 | 30058 | 30014 | 10012 | 20002 | 10012 | 20000 | 32602 | 129153 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 30058 | 30011 | 10011 | 20000 | 10010 | 20000 | 32602 | 129177 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 30058 | 30011 | 10011 | 20000 | 10010 | 20000 | 32602 | 129052 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 30058 | 30011 | 10011 | 20000 | 10010 | 20000 | 32602 | 129257 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
Code:
swp x0, x1, [x6]
mov x7, 8
(fused SUBS/B.cc loop)
Result (median cycles for code): 10.4969
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
20206 | 101466 | 20173 | 101 | 20072 | 100 | 20010 | 499 | 1803853 | 20110 | 200 | 20022 | 200 | 40040 | 1 | 20000 | 100 |
20204 | 103764 | 20140 | 101 | 20039 | 100 | 20024 | 500 | 1867084 | 20124 | 200 | 20072 | 200 | 40064 | 1 | 20000 | 100 |
20205 | 104841 | 20168 | 104 | 20064 | 104 | 20041 | 500 | 1855249 | 20141 | 200 | 20112 | 200 | 41184 | 1 | 20000 | 100 |
20204 | 108009 | 20150 | 101 | 20049 | 100 | 20056 | 500 | 1848888 | 20156 | 200 | 20168 | 200 | 40352 | 1 | 20000 | 100 |
20204 | 112640 | 20128 | 101 | 20027 | 100 | 20030 | 500 | 1939091 | 20130 | 200 | 20088 | 200 | 40220 | 1 | 20000 | 100 |
20204 | 104155 | 20113 | 101 | 20012 | 100 | 20032 | 501 | 1856492 | 20132 | 200 | 20096 | 200 | 40752 | 1 | 20000 | 100 |
20204 | 104581 | 20143 | 101 | 20042 | 100 | 20019 | 500 | 1867192 | 20119 | 200 | 20060 | 200 | 40008 | 1 | 20000 | 100 |
20204 | 103634 | 20101 | 101 | 20000 | 100 | 20065 | 500 | 1852952 | 20165 | 200 | 20172 | 202 | 40844 | 2 | 20000 | 100 |
20204 | 104435 | 20104 | 101 | 20003 | 100 | 20038 | 410 | 1851827 | 20140 | 204 | 20098 | 200 | 40104 | 1 | 20000 | 100 |
20204 | 105243 | 20115 | 101 | 20014 | 100 | 20044 | 433 | 1878740 | 20147 | 206 | 20114 | 200 | 40144 | 1 | 20000 | 100 |
Result (median cycles for code): 10.0384
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? simd retires (ee) | ? int retires (ef) |
20025 | 100700 | 20062 | 11 | 20051 | 10 | 20000 | 50 | 1779129 | 20010 | 20 | 20004 | 20 | 40008 | 1 | 20000 | 0 | 10 |
20024 | 100080 | 20011 | 11 | 20000 | 10 | 20000 | 50 | 1779003 | 20010 | 20 | 20000 | 20 | 40264 | 1 | 20000 | 0 | 10 |
20024 | 100417 | 20017 | 11 | 20006 | 10 | 20004 | 50 | 1785766 | 20014 | 20 | 20012 | 20 | 40132 | 1 | 20000 | 0 | 10 |
20024 | 100429 | 20027 | 11 | 20016 | 10 | 20028 | 50 | 1784019 | 20038 | 20 | 20064 | 20 | 40168 | 1 | 20000 | 0 | 10 |
20024 | 100446 | 20023 | 11 | 20012 | 10 | 20000 | 50 | 1779003 | 20010 | 20 | 20000 | 20 | 40000 | 1 | 20000 | 0 | 10 |
20024 | 100073 | 20011 | 11 | 20000 | 10 | 20000 | 50 | 1779003 | 20010 | 20 | 20000 | 20 | 40000 | 1 | 20000 | 0 | 10 |
20024 | 100073 | 20011 | 11 | 20000 | 10 | 20000 | 50 | 1779003 | 20010 | 20 | 20000 | 20 | 40000 | 1 | 20000 | 0 | 10 |
20024 | 100073 | 20011 | 11 | 20000 | 10 | 20074 | 50 | 1781278 | 20084 | 20 | 20120 | 20 | 40072 | 1 | 20000 | 0 | 10 |
20024 | 100246 | 20020 | 11 | 20009 | 10 | 20016 | 50 | 1784145 | 20026 | 20 | 20042 | 20 | 40104 | 1 | 20000 | 0 | 10 |
20024 | 100155 | 20018 | 11 | 20007 | 10 | 20009 | 50 | 1778625 | 20019 | 20 | 20016 | 20 | 40032 | 1 | 20000 | 0 | 10 |