Apple Microarchitecture Research by Dougall Johnson
Code:
swplb w0, w1, [x6] nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop ; nop
mov x0, 0
(no loop instructions)
Retires (minus 70 nops): 2.000
Issues: 2.000
Integer unit issues: 0.001
Load/store unit issues: 2.000
SIMD/FP unit issues: 0.000
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch ldst uop (58) | simd uops in schedulers (5a) | dispatch uop (78) | map ldst uop (7d) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) |
72005 | 34488 | 2005 | 1 | 2004 | 2000 | 11770 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34190 | 2001 | 1 | 2000 | 2000 | 11762 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34200 | 2001 | 1 | 2000 | 2000 | 11762 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34187 | 2001 | 1 | 2000 | 2000 | 11762 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34216 | 2001 | 1 | 2000 | 2000 | 11762 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34200 | 2001 | 1 | 2000 | 2000 | 11762 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34517 | 2001 | 1 | 2000 | 2000 | 11770 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34587 | 2001 | 1 | 2000 | 2000 | 11760 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34544 | 2001 | 1 | 2000 | 2000 | 11760 | 2000 | 2000 | 4000 | 1 | 2000 |
72004 | 34729 | 2001 | 1 | 2000 | 2000 | 11760 | 2000 | 2000 | 4000 | 1 | 2000 |
Code:
swplb w0, w1, [x6]
add x6, x6, 2
(fused SUBS/B.cc loop)
Result (median cycles for code): 6.0057
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
30206 | 60630 | 30181 | 10128 | 20053 | 10129 | 20004 | 32894 | 132726 | 30106 | 10202 | 20004 | 10202 | 40008 | 10001 | 20000 | 10100 |
30205 | 60110 | 30163 | 10123 | 20040 | 10123 | 20002 | 32891 | 132731 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60057 | 30101 | 10101 | 20000 | 10101 | 20002 | 32891 | 132700 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60057 | 30101 | 10101 | 20000 | 10101 | 20002 | 32891 | 132697 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60057 | 30101 | 10101 | 20000 | 10101 | 20002 | 32891 | 132703 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60057 | 30101 | 10101 | 20000 | 10101 | 20002 | 32891 | 132704 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60057 | 30101 | 10101 | 20000 | 10101 | 20002 | 32891 | 132686 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60057 | 30101 | 10101 | 20000 | 10101 | 20002 | 32891 | 132700 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
30204 | 60057 | 30101 | 10101 | 20000 | 10101 | 20002 | 32900 | 133510 | 30103 | 10201 | 20003 | 10202 | 40008 | 10001 | 20000 | 10100 |
30204 | 60057 | 30103 | 10101 | 20002 | 10102 | 20002 | 32891 | 132697 | 30103 | 10201 | 20003 | 10201 | 40005 | 10001 | 20000 | 10100 |
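As a sanity check on the table above, the following minimal sketch takes the median of the cycle (02) column and divides it by an assumed number of executions of the code sequence per run. The figure of 10000 executions is an assumption, not something the page states; it is inferred from the "? ldst retires (ed)" column reading 20000 while the earlier uop test reports 2 load/store issues per `swplb`. The function name `median_cycles_per_execution` is hypothetical. Under that assumption the result agrees with the 6.0057 reported for this test; a reported figure may equally come from a separate set of runs, so the agreement should not be taken as the page's actual method.

```python
from statistics import median

# Cycle counts (counter 02) copied from the ten runs in the table above.
cycles = [60630, 60110, 60057, 60057, 60057, 60057, 60057, 60057, 60057, 60057]

# Assumption: the code sequence executes 10000 times per run.
# The page does not state this; it is inferred from the counter values.
EXECUTIONS_PER_RUN = 10_000


def median_cycles_per_execution(samples, executions):
    """Return the median of the samples divided by the execution count."""
    return median(samples) / executions


per_execution = median_cycles_per_execution(cycles, EXECUTIONS_PER_RUN)
print(f"{per_execution:.4f}")  # prints 6.0057
```

Using the median rather than the mean keeps the two slower warm-up runs (60630 and 60110 cycles) from skewing the estimate.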
Result (median cycles for code): 6.0061
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
30024 | 60061 | 30011 | 10011 | 20000 | 10011 | 20000 | 32612 | 133005 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60061 | 30011 | 10011 | 20000 | 10010 | 20000 | 32581 | 132843 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60054 | 30011 | 10011 | 20000 | 10010 | 20000 | 32581 | 132843 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60054 | 30011 | 10011 | 20000 | 10010 | 20000 | 32581 | 132864 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60054 | 30011 | 10011 | 20000 | 10010 | 20000 | 32581 | 132849 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60054 | 30011 | 10011 | 20000 | 10010 | 20000 | 32581 | 132856 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60054 | 30011 | 10011 | 20000 | 10010 | 20000 | 32581 | 132829 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60054 | 30011 | 10011 | 20000 | 10010 | 20000 | 32581 | 132868 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60054 | 30011 | 10011 | 20000 | 10010 | 20000 | 32581 | 132869 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
30024 | 60054 | 30011 | 10011 | 20000 | 10010 | 20000 | 32581 | 132845 | 30010 | 10020 | 20000 | 10020 | 40000 | 10001 | 20000 | 10010 |
Code:
swplb w0, w1, [x6]
mov x7, 8
(fused SUBS/B.cc loop)
Result (median cycles for code): 9.8142
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule simd uop (54) | schedule ldst uop (55) | dispatch int uop (56) | dispatch simd uop (57) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
20205 | 100171 | 20970 | 112 | 0 | 20858 | 111 | 0 | 20526 | 522 | 1755618 | 20627 | 202 | 21640 | 202 | 43536 | 1 | 20000 | 100 |
20204 | 98186 | 20114 | 101 | 0 | 20013 | 100 | 0 | 21136 | 500 | 1759220 | 21236 | 200 | 23456 | 200 | 40640 | 1 | 20000 | 100 |
20204 | 98381 | 20599 | 101 | 0 | 20498 | 100 | 0 | 20704 | 500 | 1754426 | 20804 | 200 | 21976 | 200 | 41056 | 1 | 20000 | 100 |
20204 | 99154 | 20724 | 127 | 0 | 20597 | 126 | 0 | 20761 | 602 | 1725167 | 20900 | 278 | 22382 | 200 | 40872 | 1 | 20000 | 100 |
20204 | 96954 | 20101 | 101 | 0 | 20000 | 100 | 0 | 20589 | 500 | 1741212 | 20689 | 200 | 21682 | 200 | 41144 | 1 | 20000 | 100 |
20204 | 97194 | 20193 | 101 | 0 | 20092 | 100 | 0 | 20042 | 532 | 1739911 | 20143 | 202 | 20140 | 200 | 41424 | 1 | 20000 | 100 |
20204 | 97296 | 20750 | 148 | 0 | 20602 | 147 | 0 | 20123 | 503 | 1724288 | 20224 | 202 | 20354 | 200 | 40732 | 1 | 20000 | 100 |
20204 | 95944 | 20437 | 108 | 0 | 20329 | 107 | 0 | 20047 | 513 | 1733813 | 20147 | 202 | 20154 | 200 | 41216 | 1 | 20000 | 100 |
20204 | 98878 | 20543 | 101 | 0 | 20442 | 100 | 0 | 20305 | 500 | 1766896 | 20405 | 200 | 20980 | 200 | 45776 | 1 | 20000 | 100 |
20204 | 100050 | 21448 | 101 | 0 | 21347 | 100 | 0 | 21382 | 500 | 1785376 | 21482 | 200 | 24448 | 200 | 44140 | 1 | 20000 | 100 |
Result (median cycles for code): 10.0363
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
20026 | 100741 | 21108 | 13 | 21095 | 12 | 21832 | 56 | 1774538 | 21845 | 28 | 25364 | 24 | 48060 | 1 | 20000 | 10 |
20024 | 100171 | 21126 | 12 | 21114 | 11 | 21674 | 55 | 1780101 | 21686 | 24 | 24986 | 22 | 50960 | 1 | 20000 | 10 |
20024 | 100776 | 21464 | 12 | 21452 | 11 | 21282 | 45 | 1776644 | 21292 | 24 | 24042 | 26 | 49480 | 1 | 20000 | 10 |
20024 | 100577 | 21442 | 13 | 21429 | 12 | 21751 | 65 | 1782947 | 21766 | 30 | 25406 | 30 | 50256 | 1 | 20000 | 10 |
20024 | 100584 | 21574 | 15 | 21559 | 14 | 21566 | 49 | 1778436 | 21577 | 24 | 24898 | 22 | 49600 | 1 | 20000 | 10 |
20024 | 100485 | 21523 | 12 | 21511 | 11 | 21454 | 57 | 1775796 | 21466 | 24 | 24292 | 22 | 44324 | 1 | 20000 | 10 |
20024 | 100527 | 21431 | 14 | 21417 | 13 | 22032 | 57 | 1780364 | 22044 | 24 | 26108 | 22 | 48832 | 1 | 20000 | 10 |
20024 | 100476 | 21418 | 12 | 21406 | 11 | 21777 | 55 | 1784720 | 21789 | 28 | 25496 | 20 | 47252 | 1 | 20000 | 10 |
20024 | 100805 | 21548 | 11 | 21537 | 10 | 21753 | 48 | 1777992 | 21763 | 22 | 25256 | 26 | 48380 | 1 | 20000 | 10 |
20024 | 100334 | 21248 | 13 | 21235 | 12 | 21691 | 48 | 1779211 | 21701 | 22 | 25306 | 22 | 49552 | 1 | 20000 | 10 |