Apple Microarchitecture Research by Dougall Johnson
M1/A14 P-core (Firestorm): Overview | Base Instructions | SIMD and FP Instructions
M1/A14 E-core (Icestorm): Overview | Base Instructions | SIMD and FP Instructions
Code:
swpl w0, w1, [x6] nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
mov x0, 0
(no loop instructions)
Retires (minus 70 nops): 2.000
Issues: 2.000
Integer unit issues: 0.001
Load/store unit issues: 2.000
SIMD/FP unit issues: 0.000
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch ldst uop (58) | simd uops in schedulers (5a) | dispatch uop (78) | map ldst uop (7d) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) |
72006 | 34454 | 2011 | 1 | 2010 | 2000 | 11770 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34210 | 2001 | 1 | 2000 | 2000 | 11780 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34944 | 2001 | 1 | 2000 | 2000 | 11769 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34256 | 2001 | 1 | 2000 | 2000 | 11770 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34114 | 2001 | 1 | 2000 | 2000 | 11770 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34109 | 2001 | 1 | 2000 | 2000 | 11770 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34142 | 2001 | 1 | 2000 | 2000 | 11770 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34140 | 2001 | 1 | 2000 | 2000 | 11770 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34115 | 2001 | 1 | 2000 | 2000 | 11770 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34109 | 2001 | 1 | 2000 | 2000 | 11770 | 2000 | 2000 | 4000 | 1 | 2000 |
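The summary figures above (Retires, Issues, and the per-unit issue counts) look like the raw counters divided by the number of unrolled copies of the code, with the nops subtracted from the retire count. The following is a minimal sketch of that arithmetic, not the author's actual script. Two things are assumptions: the value of 1000 unrolled copies (inferred from 72004 being close to 72 × 1000) and the choice of which counter feeds each summary line.

```python
from statistics import median

# Columns copied from the table above (ten runs each).
retire_uop = [72006] + [72004] * 9
schedule_uop = [2011] + [2001] * 9
schedule_int_uop = [1] * 10
schedule_ldst_uop = [2010] + [2000] * 9

UNROLLS = 1000       # assumption: inferred from the counts, not stated in the text
NOPS_PER_COPY = 70   # from "Retires (minus 70 nops)"


def per_copy(samples, subtract=0):
    """Median counter value per unrolled copy, minus a fixed per-copy amount."""
    return median(samples) / UNROLLS - subtract


summary = {
    "retires_minus_nops": per_copy(retire_uop, NOPS_PER_COPY),
    "issues": per_copy(schedule_uop),
    "int_issues": per_copy(schedule_int_uop),
    "ldst_issues": per_copy(schedule_ldst_uop),
}

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the sketch gives roughly 2.004, 2.001, 0.001 and 2.000. The small excess over the listed 2.000 would be consistent with a handful of uops of fixed measurement overhead, though the text does not say how that overhead is handled.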
Code:
swpl w0, w1, [x6]
add x6, x6, 4
(fused SUBS/B.cc loop)
Result (median cycles for code): 6.0061
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
30207 | 60362 | 30190 | 10134 | 20056 | 10135 | 20004 | 32891 | 132904 | 30106 | 10202 | 20004 | 10202 | 40008 | 10001 | 20000 | 10100 |
30204 | 60061 | 30101 | 10101 | 20000 | 10101 | 20002 | 32856 | 132862 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60061 | 30101 | 10101 | 20000 | 10101 | 20002 | 32887 | 132966 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60061 | 30101 | 10101 | 20000 | 10101 | 20002 | 32887 | 132967 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60061 | 30101 | 10101 | 20000 | 10101 | 20002 | 32887 | 132953 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60061 | 30101 | 10101 | 20000 | 10101 | 20002 | 32887 | 132958 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60061 | 30101 | 10101 | 20000 | 10101 | 20045 | 32968 | 133419 | 30166 | 10221 | 20045 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60054 | 30101 | 10101 | 20000 | 10101 | 20002 | 32887 | 132935 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60054 | 30101 | 10101 | 20000 | 10101 | 20002 | 32887 | 132943 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60061 | 30101 | 10101 | 20000 | 10101 | 20002 | 32856 | 132779 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
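The "median cycles for code" figure can be reproduced from the "cycle (02)" column of the table above. This is a sketch that assumes the loop body runs 10000 times per measurement; that count is inferred from the counters sitting close to multiples of 10000 and is not stated in the text.

```python
from statistics import median

# "cycle (02)" column from the table above (ten runs).
cycles = [60362, 60061, 60061, 60061, 60061, 60061, 60061, 60054, 60054, 60061]

# Assumption: the loop body executes 10000 times per run (inferred, not stated).
BODY_EXECUTIONS = 10_000

cycles_per_body = median(cycles) / BODY_EXECUTIONS
print(f"{cycles_per_body:.4f}")  # prints 6.0061
```

Taking the median rather than the mean keeps the slower first run (60362 cycles) from skewing the result.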
Result (median cycles for code): 6.0061
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
30027 | 60366 | 30101 | 10044 | 20057 | 10044 | 20002 | 32624 | 133284 | 30013 | 10021 | 20003 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60064 | 30011 | 10011 | 20000 | 10010 | 20000 | 32595 | 133156 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60057 | 30011 | 10011 | 20000 | 10010 | 20000 | 32595 | 133185 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60057 | 30011 | 10011 | 20000 | 10010 | 20000 | 32596 | 133197 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60057 | 30011 | 10011 | 20000 | 10010 | 20000 | 32597 | 133156 | 30010 | 10020 | 20000 | 10021 | 40005 | 10001 | 20000 | 10010 |
30024 | 60057 | 30013 | 10011 | 20002 | 10012 | 20000 | 32641 | 133517 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60057 | 30011 | 10011 | 20000 | 10010 | 20000 | 32643 | 133502 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60057 | 30011 | 10011 | 20000 | 10010 | 20000 | 32599 | 133234 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60057 | 30011 | 10011 | 20000 | 10010 | 20000 | 32621 | 133268 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60064 | 30011 | 10011 | 20000 | 10010 | 20000 | 32599 | 133083 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
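The two tables for this test differ mainly by small offsets in the counters (for example 30204 versus 30024 retired uops). One reading, which the text does not state, is that they use different splits between unrolled copies and loop iterations. The sketch below checks that reading against the retire counts. The (unrolls, iterations) pairs, the 3 uops per body, and the overhead terms are all hypothetical values chosen to fit the data.

```python
# Median "retire uop (01)" values from the two tables above.
# The (unrolls, iterations) keys are assumptions, not values given in the text.
observed = {
    (100, 100): 30204,
    (1000, 10): 30024,
}

BODY_UOPS = 3  # assumption: 2 uops for swpl plus 1 for add


def predicted_retires(unrolls, iterations, body_uops=BODY_UOPS,
                      per_iteration=2, fixed=4):
    """Hypothetical model: body uops + per-iteration loop overhead + fixed overhead."""
    return body_uops * unrolls * iterations + per_iteration * iterations + fixed


for (unrolls, iterations), retires in observed.items():
    model = predicted_retires(unrolls, iterations)
    print(f"{unrolls} x {iterations}: observed {retires}, model {model}, "
          f"difference {retires - model}")
```

Both differences come out as zero under this model, which supports the reading that swpl contributes two uops per execution. The fit is empirical and should not be taken as a description of the measurement harness.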
Code:
swpl w0, w1, [x6]
mov x7, 8
(fused SUBS/B.cc loop)
Result (median cycles for code): 9.7685
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
20205 | 100678 | 20910 | 103 | 20807 | 102 | 21389 | 511 | 1773322 | 21492 | 210 | 24386 | 260 | 49184 | 1 | 20000 | 100 |
20204 | 98835 | 20737 | 103 | 20634 | 102 | 20288 | 499 | 1751270 | 20388 | 200 | 20862 | 200 | 41636 | 1 | 20000 | 100 |
20204 | 97457 | 20110 | 101 | 20009 | 100 | 20317 | 509 | 1726866 | 20417 | 200 | 20750 | 200 | 40656 | 1 | 20000 | 100 |
20204 | 99197 | 20656 | 102 | 20554 | 101 | 20869 | 500 | 1729767 | 20969 | 200 | 22780 | 202 | 43424 | 2 | 20000 | 100 |
20204 | 98015 | 20144 | 101 | 20043 | 100 | 20310 | 500 | 1742832 | 20410 | 200 | 20912 | 214 | 40908 | 1 | 20000 | 100 |
20204 | 96398 | 20201 | 101 | 20100 | 100 | 20040 | 500 | 1714335 | 20140 | 200 | 20108 | 200 | 40504 | 1 | 20000 | 100 |
20205 | 97181 | 20205 | 103 | 20102 | 102 | 20297 | 500 | 1764579 | 20397 | 200 | 20996 | 200 | 41092 | 1 | 20000 | 100 |
20204 | 97624 | 20288 | 101 | 20187 | 100 | 20065 | 418 | 1718954 | 20166 | 210 | 20196 | 200 | 41724 | 1 | 20000 | 100 |
20204 | 97283 | 20334 | 101 | 20233 | 100 | 20148 | 406 | 1737006 | 20248 | 200 | 20454 | 200 | 40088 | 1 | 20000 | 100 |
20204 | 99062 | 20894 | 101 | 20793 | 100 | 20722 | 360 | 1761442 | 20822 | 200 | 22320 | 200 | 48804 | 1 | 20000 | 100 |
Result (median cycles for code): 10.0461
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
20025 | 100175 | 20207 | 12 | 20195 | 11 | 21787 | 49 | 1774480 | 21797 | 20 | 25174 | 24 | 49720 | 1 | 20000 | 10 |
20025 | 100768 | 21357 | 12 | 21345 | 11 | 21887 | 51 | 1781135 | 21898 | 24 | 25658 | 20 | 50648 | 1 | 20000 | 10 |
20024 | 100461 | 21252 | 12 | 21240 | 11 | 20958 | 52 | 1773318 | 20969 | 22 | 23042 | 26 | 48668 | 1 | 20000 | 10 |
20024 | 100490 | 21407 | 13 | 21394 | 12 | 21345 | 53 | 1773621 | 21357 | 24 | 23976 | 20 | 51060 | 1 | 20000 | 10 |
20024 | 100575 | 21412 | 14 | 21398 | 13 | 21599 | 50 | 1775651 | 21609 | 20 | 24754 | 24 | 50088 | 1 | 20000 | 10 |
20024 | 100433 | 21367 | 13 | 21354 | 12 | 21608 | 52 | 1776805 | 21619 | 22 | 24816 | 24 | 49184 | 1 | 20000 | 10 |
20024 | 100480 | 21418 | 12 | 21406 | 11 | 21573 | 49 | 1775950 | 21583 | 20 | 24878 | 26 | 51212 | 1 | 20000 | 10 |
20024 | 100557 | 21330 | 13 | 21317 | 12 | 21791 | 47 | 1784384 | 21801 | 20 | 25664 | 24 | 50084 | 1 | 20000 | 10 |
20024 | 100465 | 21438 | 15 | 21423 | 14 | 21850 | 50 | 1782189 | 21861 | 22 | 25674 | 24 | 49728 | 1 | 20000 | 10 |
20024 | 100282 | 20857 | 11 | 20846 | 10 | 20976 | 50 | 1773723 | 20986 | 20 | 23066 | 30 | 51140 | 1 | 20000 | 10 |