Apple Microarchitecture Research by Dougall Johnson
Code:
ldr d0, [x6, #8]!

mov x0, 1
mov x1, 2
mov x8, 0
(no loop instructions)
Retires: 1.000
Issues: 2.000
Integer unit issues: 1.001
Load/store unit issues: 1.000
SIMD/FP unit issues: 0.000
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map ldst uop (7d) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) |
1005 | 1233 | 2073 | 1043 | 1030 | 1052 | 1000 | 18073 | 17048 | 2000 | 1000 | 1000 | 1001 | 1000 |
1004 | 1042 | 2001 | 1001 | 1000 | 1000 | 1000 | 18473 | 16813 | 2000 | 1000 | 1000 | 1001 | 1000 |
1004 | 1051 | 2001 | 1001 | 1000 | 1000 | 1000 | 17877 | 17227 | 2000 | 1000 | 1000 | 1001 | 1000 |
1004 | 1034 | 2001 | 1001 | 1000 | 1000 | 1000 | 17953 | 17304 | 2000 | 1000 | 1000 | 1001 | 1000 |
1004 | 1033 | 2001 | 1001 | 1000 | 1000 | 1000 | 18453 | 16972 | 2000 | 1000 | 1000 | 1001 | 1000 |
1004 | 1049 | 2001 | 1001 | 1000 | 1000 | 1000 | 18525 | 16814 | 2000 | 1000 | 1000 | 1001 | 1000 |
1004 | 1038 | 2001 | 1001 | 1000 | 1000 | 1000 | 18369 | 17246 | 2000 | 1000 | 1000 | 1001 | 1000 |
1004 | 1038 | 2001 | 1001 | 1000 | 1000 | 1000 | 18085 | 17154 | 2000 | 1000 | 1000 | 1001 | 1000 |
1004 | 1033 | 2001 | 1001 | 1000 | 1000 | 1000 | 18337 | 16944 | 2000 | 1000 | 1000 | 1001 | 1000 |
1004 | 1033 | 2001 | 1001 | 1000 | 1000 | 1000 | 18317 | 16812 | 2000 | 1000 | 1000 | 1001 | 1000 |
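The per-instruction figures listed above (Retires, Issues, and the per-unit issue counts) can be approximated from the counter table. The sketch below is an illustration only: it assumes the test instruction is repeated 1000 times per sample (inferred from the counts of about 1000, not stated on the page), and the names `COPIES`, `samples`, and `per_instruction` are hypothetical.

```python
from statistics import median

# Assumption: 1000 copies of the test instruction per sample run,
# inferred from the ~1000 counts in the table (not stated in the source).
COPIES = 1000

# Columns copied from the table above, one value per sample run.
# The first run is a warm-up outlier; the other nine are identical.
samples = {
    "retire uop (01)":        [1005] + [1004] * 9,
    "schedule uop (52)":      [2073] + [2001] * 9,
    "schedule int uop (53)":  [1043] + [1001] * 9,
    "schedule ldst uop (55)": [1030] + [1000] * 9,
}

def per_instruction(values, copies=COPIES):
    """Median counter value divided by the number of instruction copies."""
    return median(values) / copies

for name, values in samples.items():
    print(f"{name}: {per_instruction(values):.3f}")
```

With these samples the estimates are 1.004, 2.001, 1.001, and 1.000, against the listed 1.000, 2.000, 1.001, and 1.000. The differences of a few counts per thousand in the retire and schedule totals would be consistent with a small fixed measurement overhead being subtracted, but that is an assumption.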
Chain cycles: 3
Code:
ldr d0, [x6, #8]!
fmov x0, d0
eor x8, x8, x0
eor x8, x8, x0
add x6, x6, x8

mov x0, 1
mov x1, 2
mov x8, 0
(fused SUBS/B.cc loop)
Result (median cycles for code, minus 3 chain cycles): 7.0074
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule simd uop (54) | schedule ldst uop (55) | dispatch int uop (56) | dispatch simd uop (57) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | ldst uops in schedulers (5b) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map simd uop (7e) | map int uop inputs (7f) | map ldst uop inputs (80) | map simd uop inputs (81) | ? int output thing (e9) | ? ldst retires (ed) | ? simd retires (ee) | ? int retires (ef) |
50209 | 101151 | 70161 | 50107 | 10049 | 10005 | 40245 | 10045 | 10002 | 2660236 | 766940 | 794475 | 60108 | 30208 | 10003 | 10003 | 60218 | 10004 | 10003 | 50001 | 10000 | 0 | 40100 |
50204 | 100086 | 70104 | 50101 | 10003 | 10000 | 40104 | 10003 | 10003 | 2660555 | 767097 | 794638 | 60110 | 30209 | 10004 | 10003 | 60218 | 10004 | 10003 | 50001 | 10000 | 0 | 40100 |
50204 | 100088 | 70104 | 50101 | 10003 | 10000 | 40104 | 10003 | 10003 | 2660285 | 767009 | 794550 | 60110 | 30209 | 10004 | 10003 | 60218 | 10004 | 10003 | 50001 | 10000 | 0 | 40100 |
50204 | 100088 | 70104 | 50101 | 10003 | 10000 | 40104 | 10003 | 10003 | 2660636 | 767121 | 794662 | 60110 | 30209 | 10004 | 10003 | 60218 | 10004 | 10003 | 50001 | 10000 | 0 | 40100 |
50204 | 100094 | 70104 | 50101 | 10003 | 10000 | 40104 | 10003 | 10011 | 2662440 | 767647 | 795233 | 60158 | 30238 | 10013 | 10013 | 60218 | 10004 | 10003 | 50001 | 10000 | 0 | 40100 |
50204 | 100074 | 70104 | 50101 | 10003 | 10000 | 40104 | 10003 | 10003 | 2660231 | 766993 | 794534 | 60110 | 30209 | 10004 | 10003 | 60218 | 10004 | 10003 | 50001 | 10000 | 0 | 40100 |
50204 | 100074 | 70104 | 50101 | 10003 | 10000 | 40104 | 10003 | 10003 | 2660231 | 766993 | 794534 | 60110 | 30209 | 10004 | 10003 | 60218 | 10004 | 10003 | 50001 | 10000 | 0 | 40100 |
50204 | 100074 | 70104 | 50101 | 10003 | 10000 | 40104 | 10003 | 10003 | 2660231 | 766993 | 794534 | 60110 | 30209 | 10004 | 10003 | 60218 | 10004 | 10003 | 50001 | 10000 | 0 | 40100 |
50204 | 100074 | 70104 | 50101 | 10003 | 10000 | 40104 | 10003 | 10003 | 2660231 | 766993 | 794534 | 60110 | 30209 | 10004 | 10003 | 60218 | 10004 | 10003 | 50001 | 10000 | 0 | 40100 |
50204 | 100074 | 70104 | 50101 | 10003 | 10000 | 40104 | 10003 | 10003 | 2660231 | 766993 | 794534 | 60110 | 30209 | 10004 | 10003 | 60218 | 10004 | 10003 | 50001 | 10000 | 0 | 40100 |
Result (median cycles for code, minus 3 chain cycles): 7.0076
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule simd uop (54) | schedule ldst uop (55) | dispatch int uop (56) | dispatch simd uop (57) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | ldst uops in schedulers (5b) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map simd uop (7e) | map int uop inputs (7f) | map ldst uop inputs (80) | map simd uop inputs (81) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
50029 | 101204 | 70067 | 50016 | 10046 | 10005 | 40154 | 10044 | 10003 | 2662152 | 768054 | 795579 | 60020 | 30029 | 10004 | 10003 | 60020 | 10000 | 10000 | 50001 | 10000 | 40010 |
50024 | 100066 | 70014 | 50011 | 10003 | 10000 | 40010 | 10000 | 10000 | 2660079 | 767392 | 794925 | 60010 | 30020 | 10000 | 10000 | 60020 | 10000 | 10000 | 50001 | 10000 | 40010 |
50024 | 100066 | 70014 | 50011 | 10003 | 10000 | 40010 | 10000 | 10000 | 2660862 | 767634 | 795166 | 60010 | 30020 | 10000 | 10000 | 60020 | 10000 | 10000 | 50001 | 10000 | 40010 |
50024 | 100066 | 70014 | 50011 | 10003 | 10000 | 40010 | 10000 | 10000 | 2660079 | 767392 | 794925 | 60010 | 30020 | 10000 | 10000 | 60020 | 10000 | 10000 | 50001 | 10000 | 40010 |
50024 | 100066 | 70014 | 50011 | 10003 | 10000 | 40010 | 10000 | 10000 | 2660079 | 767392 | 794925 | 60010 | 30020 | 10000 | 10000 | 60020 | 10000 | 10000 | 50001 | 10000 | 40010 |
50024 | 100066 | 70014 | 50011 | 10003 | 10000 | 40010 | 10000 | 10000 | 2660079 | 767392 | 794925 | 60010 | 30020 | 10000 | 10000 | 60020 | 10000 | 10000 | 50001 | 10000 | 40010 |
50024 | 100076 | 70015 | 50011 | 10004 | 10000 | 40010 | 10000 | 10000 | 2660349 | 767472 | 795005 | 60010 | 30020 | 10000 | 10000 | 60020 | 10000 | 10000 | 50001 | 10000 | 40010 |
50024 | 100066 | 70014 | 50011 | 10003 | 10000 | 40010 | 10000 | 10000 | 2660079 | 767392 | 794925 | 60010 | 30020 | 10000 | 10000 | 60020 | 10000 | 10000 | 50001 | 10000 | 40010 |
50025 | 100146 | 70028 | 50018 | 10009 | 10001 | 40045 | 10012 | 10000 | 2660079 | 767392 | 794925 | 60010 | 30020 | 10000 | 10000 | 60020 | 10000 | 10000 | 50001 | 10000 | 40010 |
50024 | 100066 | 70014 | 50011 | 10003 | 10000 | 40010 | 10000 | 10000 | 2660079 | 767392 | 794925 | 60010 | 30020 | 10000 | 10000 | 60020 | 10000 | 10000 | 50001 | 10000 | 40010 |
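The latency results can be checked against the "cycle (02)" columns of the two tables above: take the median cycle count, divide by the number of loop iterations, and subtract the 3 chain cycles. This is a sketch under one assumption: 10000 iterations per sample, inferred from the roughly 10000 "? ldst retires (ed)" counts and not stated on the page. The function and variable names are hypothetical.

```python
from statistics import median

ITERATIONS = 10_000  # assumption: inferred from ~10000 ldst retires per sample
CHAIN_CYCLES = 3     # "Chain cycles: 3" as given on the page

# "cycle (02)" column from the first and second latency tables.
cycles_table1 = [101151, 100086, 100088, 100088, 100094,
                 100074, 100074, 100074, 100074, 100074]
cycles_table2 = [101204, 100066, 100066, 100066, 100066,
                 100066, 100076, 100066, 100146, 100066]

def latency(cycle_samples, iterations=ITERATIONS, chain=CHAIN_CYCLES):
    """Median cycles per loop iteration, minus the dependency-chain cost."""
    return median(cycle_samples) / iterations - chain

print(f"table 1: {latency(cycles_table1):.4f}")
print(f"table 2: {latency(cycles_table2):.4f}")
```

Both estimates land within about 0.001 cycles of the reported 7.0074 and 7.0076. They do not match to the last digit, which suggests the published figures are taken from a slightly different selection of samples.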
Count: 8
Code:
ldr d0, [x6, #8]!
ldr d0, [x7, #8]!
ldr d0, [x8, #8]!
ldr d0, [x9, #8]!
ldr d0, [x10, #8]!
ldr d0, [x11, #8]!
ldr d0, [x12, #8]!
ldr d0, [x13, #8]!

mov x7, x6
mov x8, x6
mov x9, x6
mov x10, x6
mov x11, x6
mov x12, x6
mov x13, x6
(fused SUBS/B.cc loop)
Result (median cycles for code divided by count): 0.5403
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | ldst uops in schedulers (5b) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map simd uop (7e) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
80209 | 44187 | 160524 | 80374 | 80150 | 80376 | 80007 | 240324 | 696474 | 0 | 160115 | 200 | 80012 | 0 | 200 | 80012 | 80009 | 80000 | 100 |
80204 | 43223 | 160109 | 80109 | 80000 | 80108 | 80006 | 240324 | 697150 | 0 | 160114 | 200 | 80012 | 0 | 200 | 80012 | 80009 | 80000 | 100 |
80204 | 43220 | 160109 | 80109 | 80000 | 80108 | 80006 | 240324 | 697150 | 0 | 160114 | 200 | 80012 | 0 | 200 | 80012 | 80009 | 80000 | 100 |
80204 | 43220 | 160109 | 80109 | 80000 | 80108 | 80006 | 240324 | 697150 | 0 | 160114 | 200 | 80012 | 0 | 200 | 80012 | 80009 | 80000 | 100 |
80205 | 43282 | 160191 | 80161 | 80030 | 80160 | 80007 | 240324 | 697384 | 0 | 160115 | 200 | 80012 | 0 | 200 | 80012 | 80009 | 80000 | 100 |
80205 | 43473 | 160189 | 80159 | 80030 | 80161 | 80006 | 240324 | 697186 | 0 | 160114 | 200 | 80012 | 0 | 200 | 80012 | 80009 | 80000 | 100 |
80204 | 43220 | 160109 | 80109 | 80000 | 80108 | 80006 | 240324 | 697294 | 0 | 160114 | 200 | 80012 | 0 | 200 | 80012 | 80009 | 80000 | 100 |
80204 | 43220 | 160109 | 80109 | 80000 | 80108 | 80006 | 240324 | 697240 | 0 | 160114 | 200 | 80012 | 0 | 202 | 80070 | 80064 | 80000 | 100 |
80204 | 43230 | 160109 | 80109 | 80000 | 80112 | 82240 | 278132 | 713302 | 311 | 164995 | 3914 | 82663 | 23 | 200 | 80012 | 80009 | 80000 | 100 |
80204 | 43230 | 160109 | 80109 | 80000 | 80108 | 80007 | 240324 | 697245 | 0 | 160115 | 200 | 80012 | 0 | 200 | 80012 | 80005 | 80000 | 100 |
Result (median cycles for code divided by count): 0.5402
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule ldst uop (55) | dispatch int uop (56) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map int uop inputs (7f) | map ldst uop inputs (80) | ? int output thing (e9) | ? ldst retires (ed) | ? int retires (ef) |
80029 | 44210 | 160425 | 80275 | 80150 | 80278 | 80007 | 240054 | 697294 | 160025 | 20 | 80012 | 20 | 80000 | 80001 | 80000 | 10 |
80024 | 43224 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 697260 | 160010 | 20 | 80000 | 20 | 80000 | 80001 | 80000 | 10 |
80024 | 43223 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 697260 | 160010 | 20 | 80000 | 20 | 80000 | 80001 | 80000 | 10 |
80024 | 43223 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 697260 | 160010 | 20 | 80000 | 20 | 80000 | 80001 | 80000 | 10 |
80024 | 43223 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 697260 | 160010 | 20 | 80000 | 20 | 80000 | 80001 | 80000 | 10 |
80024 | 43223 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 697260 | 160010 | 20 | 80000 | 20 | 80000 | 80001 | 80000 | 10 |
80024 | 43233 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 697260 | 160010 | 20 | 80000 | 20 | 80000 | 80001 | 80000 | 10 |
80024 | 43231 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 697323 | 160010 | 20 | 80000 | 20 | 80000 | 80001 | 80000 | 10 |
80024 | 43230 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 697296 | 160010 | 20 | 80000 | 20 | 80000 | 80001 | 80000 | 10 |
80024 | 43225 | 160011 | 80011 | 80000 | 80010 | 80000 | 240030 | 697260 | 160010 | 20 | 80000 | 20 | 80000 | 80001 | 80000 | 10 |
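The throughput results follow the same pattern: median cycles divided by the total number of test instructions executed. The sketch below assumes 10000 loop iterations per sample (the 80000 "? ldst retires (ed)" divided by the count of 8); that figure is inferred, not stated on the page, and the names are hypothetical.

```python
from statistics import median

COUNT = 8            # "Count: 8": independent loads per loop iteration
ITERATIONS = 10_000  # assumption: 80000 ldst retires / 8 loads per iteration

# "cycle (02)" column from the first and second throughput tables.
cycles_table1 = [44187, 43223, 43220, 43220, 43282,
                 43473, 43220, 43220, 43230, 43230]
cycles_table2 = [44210, 43224, 43223, 43223, 43223,
                 43223, 43233, 43231, 43230, 43225]

def cycles_per_instruction(cycle_samples, count=COUNT, iterations=ITERATIONS):
    """Median cycles divided by the number of test instructions executed."""
    return median(cycle_samples) / (count * iterations)

for label, data in (("table 1", cycles_table1), ("table 2", cycles_table2)):
    cpi = cycles_per_instruction(data)
    print(f"{label}: {cpi:.4f} cycles per load, {1 / cpi:.2f} loads per cycle")
```

Both tables give about 0.5403 cycles per load, in line with the reported 0.5403 and 0.5402, which corresponds to roughly 1.85 of these loads per cycle.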