Apple Microarchitecture Research by Dougall Johnson
Code:
ldr s0, [x6], #8

mov x0, 1
mov x1, 2
mov x8, 0
(no loop instructions)
Retires: 1.000
Issues: 2.000
Integer unit issues: 1.001
Load/store unit issues: 1.000
SIMD/FP unit issues: 0.000
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map ldst uop (7d) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) |
1005 | 1220 | 2076 | 1046 | 1030 | 1052 | 1000 | 18209 | 16938 | 2000 | 1000 | 1000 | 1001 | 1000 |
1004 | 1041 | 2001 | 1001 | 1000 | 1000 | 1000 | 18641 | 16994 | 2000 | 1000 | 1000 | 1001 | 1000 |
1004 | 1057 | 2001 | 1001 | 1000 | 1000 | 1000 | 18621 | 16884 | 2000 | 1000 | 1000 | 1001 | 1000 |
1004 | 1044 | 2001 | 1001 | 1000 | 1000 | 1000 | 18461 | 16804 | 2000 | 1000 | 1000 | 1001 | 1000 |
1004 | 1045 | 2001 | 1001 | 1000 | 1000 | 1000 | 18053 | 16945 | 2000 | 1000 | 1000 | 1001 | 1000 |
1004 | 1040 | 2001 | 1001 | 1000 | 1000 | 1000 | 18329 | 17048 | 2000 | 1000 | 1000 | 1001 | 1000 |
1004 | 1046 | 2001 | 1001 | 1000 | 1000 | 1000 | 18581 | 16870 | 2000 | 1000 | 1000 | 1001 | 1000 |
1004 | 1042 | 2001 | 1001 | 1000 | 1000 | 1000 | 18537 | 17050 | 2000 | 1000 | 1000 | 1001 | 1000 |
1004 | 1056 | 2001 | 1001 | 1000 | 1000 | 1000 | 18537 | 16923 | 2000 | 1000 | 1000 | 1001 | 1000 |
1004 | 1063 | 2001 | 1001 | 1000 | 1000 | 1000 | 18145 | 17336 | 2000 | 1000 | 1000 | 1001 | 1000 |
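The summary figures at the top of this test appear to line up with the raw counters divided by the number of instructions per run. A minimal Python sketch of that arithmetic follows; the run length of 1000 instructions is an assumption inferred from the counter magnitudes, and `per_instruction` is a hypothetical helper, not part of the original harness.

```python
# Counter values copied from one steady-state row of the table above.
row = {
    "retire uop (01)": 1004,
    "schedule uop (52)": 2001,
    "schedule int uop (53)": 1001,
    "schedule ldst uop (55)": 1000,
    "dispatch uop (78)": 2000,
}

# Assumption: each run executes 1000 copies of the instruction under test.
INSTRUCTIONS_PER_RUN = 1000


def per_instruction(count, n=INSTRUCTIONS_PER_RUN):
    """Return events per instruction for a raw counter value."""
    return count / n


rates = {name: per_instruction(value) for name, value in row.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.3f}")
```

The small excess over the summary figures (for example 1004 retired uops against a reported 1.000 retires per instruction) presumably reflects fixed harness overhead, which this sketch does not subtract.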
Chain cycles: 3
Code:
ldr s0, [x6], #8
fmov x0, d0
eor x8, x8, x0
eor x8, x8, x0
add x6, x6, x8

mov x0, 1
mov x1, 2
mov x8, 0
(fused SUBS/B.cc loop)
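The latency loop above appears to rely on the two `eor` instructions cancelling each other, so that `x8` keeps its initial value of 0 and `add x6, x6, x8` leaves the pointer unchanged while still making the next load address depend on the loaded value. The Python sketch below models only that integer arithmetic, not the hardware or the loop counter; `chain_step` is a hypothetical name introduced for illustration.

```python
MASK64 = (1 << 64) - 1


def chain_step(x6, x8, loaded):
    """Model the register updates of one iteration of the latency loop."""
    x6 = (x6 + 8) & MASK64      # post-index writeback of ldr s0, [x6], #8
    x0 = loaded & MASK64        # fmov x0, d0 (moves the loaded bits to x0)
    x8 ^= x0                    # eor x8, x8, x0
    x8 ^= x0                    # eor x8, x8, x0 (undoes the previous eor)
    x6 = (x6 + x8) & MASK64     # add x6, x6, x8 (adds 0 when x8 started at 0)
    return x6, x8


# Whatever value is loaded, x8 returns to 0 and x6 advances by exactly 8.
print(chain_step(0x1000, 0, 0xDEADBEEF))
```

Under this reading, the 3 chain cycles would correspond to the two `eor` instructions and the `add`, but that attribution is an inference from the listing rather than something the report states.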
Result (median cycles for code, minus 3 chain cycles): 7.0077
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule simd uop (54) | schedule ldst uop (55) | dispatch int uop (56) | dispatch simd uop (57) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | ldst uops in schedulers (5b) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map simd uop (7e) | map int uop inputs (7f) | map ldst uop inputs (80) | map simd uop inputs (81) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
50209 | 101385 | 70175 | 50117 | 10051 | 10007 | 40280 | 10055 | 10002 | 2661235 | 767249 | 794779 | 60108 | 30208 | 10003 | 10003 | 60218 | 10004 | 10003 | 50001 | 10000 | 40100 |
50204 | 100084 | 70103 | 50101 | 10002 | 10000 | 40104 | 10003 | 10003 | 2660582 | 767108 | 794637 | 60110 | 30209 | 10004 | 10003 | 60278 | 10012 | 10013 | 50011 | 10000 | 40100 |
50204 | 100090 | 70103 | 50101 | 10002 | 10000 | 40104 | 10003 | 10003 | 2660339 | 767031 | 794561 | 60110 | 30209 | 10004 | 10003 | 60218 | 10004 | 10003 | 50001 | 10000 | 40100 |
50204 | 100086 | 70103 | 50101 | 10002 | 10000 | 40104 | 10003 | 10003 | 2660771 | 767169 | 794700 | 60110 | 30209 | 10004 | 10003 | 60218 | 10004 | 10003 | 50001 | 10000 | 40100 |
50204 | 100096 | 70103 | 50101 | 10002 | 10000 | 40104 | 10003 | 10003 | 2660844 | 767197 | 794720 | 60110 | 30209 | 10004 | 10003 | 60218 | 10004 | 10003 | 50001 | 10000 | 40100 |
50204 | 100090 | 70103 | 50101 | 10002 | 10000 | 40104 | 10003 | 10003 | 2660447 | 767067 | 794600 | 60110 | 30209 | 10004 | 10003 | 60218 | 10004 | 10003 | 50001 | 10000 | 40100 |
50204 | 100077 | 70103 | 50101 | 10002 | 10000 | 40104 | 10003 | 10003 | 2660366 | 767040 | 794571 | 60110 | 30209 | 10004 | 10003 | 60218 | 10004 | 10003 | 50001 | 10000 | 40100 |
50204 | 100081 | 70103 | 50101 | 10002 | 10000 | 40104 | 10003 | 10003 | 2660555 | 767103 | 794629 | 60110 | 30209 | 10004 | 10003 | 60218 | 10004 | 10003 | 50001 | 10000 | 40100 |
50204 | 100076 | 70103 | 50101 | 10002 | 10000 | 40104 | 10003 | 10003 | 2660393 | 767049 | 794576 | 60110 | 30209 | 10004 | 10003 | 60218 | 10004 | 10003 | 50001 | 10000 | 40100 |
50204 | 100077 | 70103 | 50101 | 10002 | 10000 | 40104 | 10003 | 10003 | 2663984 | 768184 | 795683 | 60110 | 30209 | 10004 | 10003 | 60216 | 10003 | 10003 | 50001 | 10000 | 40100 |
Result (median cycles for code, minus 3 chain cycles): 7.0106
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule simd uop (54) | schedule ldst uop (55) | dispatch int uop (56) | dispatch simd uop (57) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | ldst uops in schedulers (5b) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map simd uop (7e) | map int uop inputs (7f) | map ldst uop inputs (80) | map simd uop inputs (81) | ? int output thing (e9) | ? ldst retires (ed) | ? simd retires (ee) | ? int retires (ef) |
50029 | 101386 | 70069 | 50016 | 10048 | 10005 | 40154 | 10043 | 10003 | 2660780 | 767628 | 795163 | 60020 | 30029 | 10004 | 10003 | 60198 | 10027 | 10030 | 50031 | 10000 | 0 | 40010 |
50024 | 100080 | 70014 | 50011 | 10003 | 10000 | 40010 | 10000 | 10000 | 2660160 | 767415 | 794942 | 60010 | 30020 | 10000 | 10000 | 60020 | 10000 | 10000 | 50001 | 10000 | 0 | 40010 |
50024 | 100518 | 70043 | 50031 | 10008 | 10004 | 40080 | 10021 | 10000 | 2660808 | 767628 | 795145 | 60010 | 30020 | 10000 | 10000 | 52796 | 12430 | 7612 | 40859 | 9555 | 7 | 33285 |
50024 | 100563 | 70044 | 50031 | 10009 | 10004 | 40084 | 10024 | 10000 | 2659897 | 767307 | 794834 | 60010 | 30020 | 10000 | 10000 | 60020 | 10000 | 10000 | 50001 | 10000 | 0 | 40010 |
50024 | 100418 | 70028 | 50021 | 10005 | 10002 | 40045 | 10011 | 10000 | 2660187 | 767432 | 794956 | 60010 | 30020 | 10000 | 10000 | 60020 | 10000 | 10000 | 50001 | 10000 | 0 | 40010 |
50024 | 100064 | 70013 | 50011 | 10002 | 10000 | 40010 | 10000 | 10000 | 2659917 | 767342 | 794869 | 60010 | 30020 | 10000 | 10000 | 60098 | 10012 | 10013 | 50008 | 10000 | 0 | 40010 |
50024 | 100063 | 70013 | 50011 | 10002 | 10000 | 40010 | 10000 | 10000 | 2659944 | 767351 | 794878 | 60010 | 30020 | 10000 | 10000 | 60020 | 10000 | 10000 | 50001 | 10000 | 0 | 40010 |
50024 | 100066 | 70013 | 50011 | 10002 | 10000 | 40010 | 10000 | 10009 | 2661556 | 767815 | 795366 | 60064 | 30049 | 10009 | 10010 | 60196 | 10028 | 10031 | 50031 | 10000 | 0 | 40010 |
50024 | 100889 | 70104 | 50071 | 10021 | 10012 | 40220 | 10061 | 10011 | 2663012 | 768270 | 795855 | 60068 | 30058 | 10013 | 10013 | 60020 | 10000 | 10000 | 50001 | 10000 | 0 | 40010 |
50024 | 100503 | 70043 | 50031 | 10008 | 10004 | 40080 | 10021 | 10010 | 2660566 | 767532 | 795081 | 60065 | 30050 | 10010 | 10010 | 60020 | 10000 | 10000 | 50001 | 10000 | 0 | 40010 |
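The "median cycles for code, minus 3 chain cycles" figure can be approximated from the cycle column of the tables above. The sketch below uses the ten `cycle (02)` samples from the first latency table; the iteration count of 10000 is an assumption inferred from the roughly 10000 load/store uops per run, and `latency_estimate` is a hypothetical helper.

```python
from statistics import median

# "cycle (02)" column of the first latency table (ten sample runs).
cycles = [101385, 100084, 100090, 100086, 100096,
          100090, 100077, 100081, 100076, 100077]

ITERATIONS = 10_000   # assumption: one load per iteration, ~10000 ldst uops per run
CHAIN_CYCLES = 3      # from "Chain cycles: 3"


def latency_estimate(samples, iterations=ITERATIONS, chain=CHAIN_CYCLES):
    """Median cycles per loop iteration, minus the chain cycles."""
    return median(samples) / iterations - chain


print(f"{latency_estimate(cycles):.4f}")
```

With these ten samples the estimate lands within about 0.001 of the reported 7.0077; the reported figure is presumably computed from a different or larger set of runs than the rows shown.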
Count: 8
Code:
ldr s0, [x6], #8
ldr s0, [x7], #8
ldr s0, [x8], #8
ldr s0, [x9], #8
ldr s0, [x10], #8
ldr s0, [x11], #8
ldr s0, [x12], #8
ldr s0, [x13], #8

mov x7, x6
mov x8, x6
mov x9, x6
mov x10, x6
mov x11, x6
mov x12, x6
mov x13, x6
(fused SUBS/B.cc loop)
Result (median cycles for code divided by count): 0.5403
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
80209 | 44284 | 160525 | 80375 | 80150 | 80375 | 80006 | 240336 | 697260 | 160118 | 200 | 80016 | 200 | 80016 | 80009 | 80000 | 100 |
80204 | 43224 | 160109 | 80109 | 80000 | 80112 | 80010 | 240336 | 696618 | 160122 | 200 | 80016 | 200 | 80016 | 80009 | 80000 | 100 |
80204 | 43215 | 160109 | 80109 | 80000 | 80112 | 80006 | 240336 | 697096 | 160118 | 200 | 80016 | 200 | 80016 | 80009 | 80000 | 100 |
80204 | 43217 | 160109 | 80109 | 80000 | 80112 | 80006 | 240336 | 697096 | 160118 | 200 | 80016 | 200 | 80016 | 80009 | 80000 | 100 |
80204 | 43217 | 160109 | 80109 | 80000 | 80112 | 80006 | 240336 | 697096 | 160118 | 200 | 80016 | 200 | 80016 | 80009 | 80000 | 100 |
80204 | 43217 | 160109 | 80109 | 80000 | 80112 | 80006 | 240336 | 697096 | 160118 | 200 | 80016 | 200 | 80016 | 80009 | 80000 | 100 |
80204 | 43217 | 160109 | 80109 | 80000 | 80112 | 80006 | 240336 | 697096 | 160118 | 200 | 80016 | 200 | 80016 | 80009 | 80000 | 100 |
80204 | 43215 | 160109 | 80109 | 80000 | 80112 | 80006 | 240336 | 697060 | 160118 | 200 | 80016 | 200 | 80016 | 80009 | 80000 | 100 |
80204 | 43217 | 160109 | 80109 | 80000 | 80112 | 80006 | 240336 | 697096 | 160118 | 200 | 80016 | 200 | 80016 | 80009 | 80000 | 100 |
80204 | 43215 | 160109 | 80109 | 80000 | 80112 | 80006 | 240336 | 697096 | 160118 | 200 | 80016 | 200 | 80016 | 80009 | 80000 | 100 |
Result (median cycles for code divided by count): 0.5402
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
80025 | 43608 | 160101 | 80071 | 80030 | 80074 | 80058 | 240225 | 692537 | 160133 | 20 | 80070 | 20 | 80014 | 80010 | 80000 | 10 |
80024 | 43216 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 697163 | 160010 | 20 | 80000 | 20 | 80000 | 80001 | 80000 | 10 |
80024 | 43218 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 697174 | 160010 | 20 | 80000 | 20 | 80000 | 80001 | 80000 | 10 |
80024 | 43216 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 697037 | 160010 | 20 | 80000 | 20 | 80000 | 80001 | 80000 | 10 |
80024 | 43233 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 697110 | 160010 | 20 | 80000 | 20 | 80000 | 80001 | 80000 | 10 |
80024 | 43217 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 697073 | 160010 | 20 | 80000 | 20 | 80000 | 80001 | 80000 | 10 |
80024 | 43216 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 697073 | 160010 | 20 | 80000 | 20 | 80000 | 80001 | 80000 | 10 |
80024 | 43221 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 697055 | 160010 | 20 | 80000 | 20 | 80000 | 80001 | 80000 | 10 |
80024 | 43218 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 697073 | 160010 | 20 | 80000 | 20 | 80000 | 80001 | 80000 | 10 |
80024 | 43218 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 697222 | 160010 | 20 | 80000 | 20 | 80000 | 80001 | 80000 | 10 |
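The "median cycles for code divided by count" figure can be checked the same way from the cycle column. The sketch below uses the ten `cycle (02)` samples from the table directly above; the iteration count of 10000 is an assumption inferred from the 80000 load/store retires per run divided by the count of 8, and `cycles_per_instruction` is a hypothetical helper.

```python
from statistics import median

# "cycle (02)" column of the second throughput table (ten sample runs).
cycles = [43608, 43216, 43218, 43216, 43233,
          43217, 43216, 43221, 43218, 43218]

COUNT = 8             # from "Count: 8" (loads per loop iteration)
ITERATIONS = 10_000   # assumption: 80000 ldst retires / 8 loads per iteration


def cycles_per_instruction(samples, count=COUNT, iterations=ITERATIONS):
    """Median cycles per loop iteration, divided by instructions per iteration."""
    return median(samples) / iterations / count


cpi = cycles_per_instruction(cycles)
print(f"cycles per load: {cpi:.4f}")
print(f"loads per cycle: {1 / cpi:.2f}")
```

For these samples the result rounds to the reported 0.5402 cycles per load, which corresponds to a little under two of these post-indexed loads per cycle.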