Apple Microarchitecture Research by Dougall Johnson
Code:
sha512h2 q0, q1, v2.2d
movi v0.16b, 1
movi v1.16b, 2
movi v2.16b, 3
(no loop instructions)
Retires: 1.000
Issues: 1.000
Integer unit issues: 0.001
Load/store unit issues: 0.000
SIMD/FP unit issues: 1.000
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule simd uop (54) | dispatch simd uop (57) | ldst uops in schedulers (5b) | dispatch uop (78) | map simd uop (7e) | map simd uop inputs (81) | ? int output thing (e9) | ? simd retires (ee) |
1004 | 2034 | 1001 | 1 | 1000 | 1000 | 24769 | 1000 | 1000 | 3000 | 1 | 1000 |
1004 | 2034 | 1001 | 1 | 1000 | 1000 | 24769 | 1000 | 1000 | 3000 | 1 | 1000 |
1004 | 2034 | 1001 | 1 | 1000 | 1000 | 24769 | 1000 | 1000 | 3000 | 1 | 1000 |
1004 | 2034 | 1001 | 1 | 1000 | 1000 | 24769 | 1000 | 1000 | 3000 | 1 | 1000 |
1004 | 2034 | 1001 | 1 | 1000 | 1000 | 24769 | 1000 | 1000 | 3000 | 1 | 1000 |
1004 | 2034 | 1001 | 1 | 1000 | 1000 | 24769 | 1000 | 1000 | 3000 | 1 | 1000 |
1004 | 2034 | 1001 | 1 | 1000 | 1000 | 24769 | 1000 | 1000 | 3000 | 1 | 1000 |
1004 | 2034 | 1001 | 1 | 1000 | 1000 | 24769 | 1000 | 1000 | 3000 | 1 | 1000 |
1004 | 2034 | 1001 | 1 | 1000 | 1000 | 24769 | 1000 | 1000 | 3000 | 1 | 1000 |
1004 | 2034 | 1001 | 1 | 1000 | 1000 | 24769 | 1000 | 1000 | 3000 | 1 | 1000 |
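The per-instruction figures listed before the table (for example "SIMD/FP unit issues: 1.000" and "Integer unit issues: 0.001") are consistent with dividing a raw counter by the number of repetitions. The following is a minimal sketch of that normalisation, not the author's tooling. It assumes 1000 repetitions of the instruction, which is inferred from the counter values rather than stated on the page, and the names `ITERATIONS` and `per_iteration` are hypothetical.

```python
# Counter values copied from one row of the table above.
row = {
    "schedule int uop (53)": 1,
    "schedule simd uop (54)": 1000,
    "map simd uop (7e)": 1000,
    "map simd uop inputs (81)": 3000,
}

# Assumption: the instruction was repeated 1000 times (inferred, not stated).
ITERATIONS = 1000


def per_iteration(count, iterations=ITERATIONS):
    """Normalise a raw counter value to a per-instruction figure."""
    return count / iterations


simd_issues = per_iteration(row["schedule simd uop (54)"])
int_issues = per_iteration(row["schedule int uop (53)"])

# Register inputs read per mapped SIMD uop.
inputs_per_uop = row["map simd uop inputs (81)"] / row["map simd uop (7e)"]

print(f"SIMD/FP unit issues: {simd_issues:.3f}")
print(f"Integer unit issues: {int_issues:.3f}")
print(f"SIMD inputs per uop: {inputs_per_uop:.1f}")
```

Under these assumptions the ratio of three inputs per uop fits SHA512H2 reading its destination register as well as its two named sources.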
Code:
sha512h2 q0, q1, v2.2d
movi v0.16b, 1
movi v1.16b, 2
movi v2.16b, 3
(fused SUBS/B.cc loop)
Result (median cycles for code): 2.0034
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule simd uop (54) | dispatch int uop (56) | dispatch simd uop (57) | int uops in schedulers (59) | ldst uops in schedulers (5b) | dispatch uop (78) | map int uop (7c) | map simd uop (7e) | map int uop inputs (7f) | map simd uop inputs (81) | ? int output thing (e9) | ? simd retires (ee) | ? int retires (ef) |
10204 | 20034 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 249769 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 20034 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 249769 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 20034 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 249769 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 20034 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 249769 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 20034 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 249769 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 20034 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 249769 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 20034 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 249769 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 20034 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 249769 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 20034 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 249769 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 20034 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 249769 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
Result (median cycles for code): 2.0034
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule simd uop (54) | schedule ldst uop (55) | dispatch int uop (56) | dispatch simd uop (57) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | ldst uops in schedulers (5b) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map simd uop (7e) | map int uop inputs (7f) | map ldst uop inputs (80) | map simd uop inputs (81) | ? int output thing (e9) | ? ldst retires (ed) | ? simd retires (ee) | ? int retires (ef) |
10024 | 20034 | 10021 | 21 | 10000 | 0 | 20 | 10000 | 0 | 70 | 0 | 249769 | 10020 | 20 | 0 | 10004 | 20 | 0 | 30000 | 11 | 0 | 10000 | 10 |
10024 | 20034 | 10021 | 21 | 10000 | 0 | 20 | 10000 | 0 | 70 | 0 | 249769 | 10020 | 20 | 0 | 10000 | 20 | 0 | 30000 | 11 | 0 | 10000 | 10 |
10024 | 20034 | 10021 | 21 | 10000 | 0 | 20 | 10000 | 0 | 70 | 0 | 249769 | 10020 | 20 | 0 | 10000 | 20 | 0 | 30000 | 11 | 0 | 10000 | 10 |
10024 | 20034 | 10021 | 21 | 10000 | 0 | 20 | 10000 | 0 | 70 | 0 | 249769 | 10020 | 20 | 0 | 10000 | 20 | 0 | 30000 | 11 | 0 | 10000 | 10 |
10024 | 20034 | 10021 | 21 | 10000 | 0 | 20 | 10000 | 0 | 70 | 0 | 249769 | 10020 | 20 | 0 | 10000 | 20 | 0 | 30000 | 11 | 0 | 10000 | 10 |
10024 | 20034 | 10021 | 21 | 10000 | 0 | 20 | 10000 | 0 | 70 | 0 | 249769 | 10020 | 20 | 0 | 10000 | 20 | 0 | 30000 | 11 | 0 | 10000 | 10 |
10024 | 20034 | 10021 | 21 | 10000 | 0 | 20 | 10000 | 0 | 70 | 0 | 249769 | 10020 | 20 | 0 | 10000 | 20 | 0 | 30000 | 11 | 0 | 10000 | 10 |
10024 | 20034 | 10021 | 21 | 10000 | 0 | 20 | 10000 | 0 | 70 | 0 | 249769 | 10020 | 20 | 0 | 10000 | 20 | 0 | 30000 | 11 | 0 | 10000 | 10 |
10024 | 20034 | 10021 | 21 | 10000 | 0 | 20 | 10000 | 0 | 70 | 0 | 249769 | 10020 | 20 | 0 | 10000 | 20 | 0 | 30000 | 11 | 0 | 10000 | 10 |
10024 | 20034 | 10021 | 21 | 10000 | 0 | 20 | 10000 | 0 | 70 | 0 | 249769 | 10020 | 20 | 0 | 10000 | 20 | 0 | 30000 | 11 | 0 | 10000 | 10 |
Code:
sha512h2 q0, q0, v1.2d
movi v0.16b, 1
movi v1.16b, 2
(fused SUBS/B.cc loop)
Result (median cycles for code): 3.0033
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule simd uop (54) | dispatch int uop (56) | dispatch simd uop (57) | int uops in schedulers (59) | ldst uops in schedulers (5b) | dispatch uop (78) | map int uop (7c) | map simd uop (7e) | map int uop inputs (7f) | map simd uop inputs (81) | ? int output thing (e9) | ? simd retires (ee) | ? int retires (ef) |
10204 | 30033 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 379694 | 10100 | 200 | 10006 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 30033 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 379694 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 30033 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 379694 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 30033 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 379694 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 30033 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 379694 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 30033 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 379694 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 30033 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 379694 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 30033 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 379694 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 30033 | 10101 | 101 | 10000 | 100 | 10000 | 307 | 379850 | 10120 | 202 | 10034 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 30033 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 379694 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
Result (median cycles for code): 3.0033
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule simd uop (54) | dispatch int uop (56) | dispatch simd uop (57) | int uops in schedulers (59) | ldst uops in schedulers (5b) | dispatch uop (78) | map int uop (7c) | map simd uop (7e) | map int uop inputs (7f) | map simd uop inputs (81) | ? int output thing (e9) | ? simd retires (ee) | ? int retires (ef) |
10024 | 30033 | 10021 | 21 | 10000 | 20 | 10000 | 70 | 379694 | 10020 | 20 | 10004 | 20 | 30018 | 11 | 10000 | 10 |
10024 | 30033 | 10021 | 21 | 10000 | 20 | 10000 | 70 | 379694 | 10020 | 20 | 10000 | 20 | 30000 | 11 | 10000 | 10 |
10024 | 30033 | 10021 | 21 | 10000 | 20 | 10000 | 70 | 379694 | 10020 | 20 | 10000 | 20 | 30000 | 11 | 10000 | 10 |
10024 | 30033 | 10021 | 21 | 10000 | 20 | 10000 | 70 | 379694 | 10020 | 20 | 10000 | 20 | 30000 | 11 | 10000 | 10 |
10024 | 30033 | 10021 | 21 | 10000 | 20 | 10000 | 70 | 379694 | 10020 | 20 | 10000 | 20 | 30000 | 11 | 10000 | 10 |
10024 | 30033 | 10021 | 21 | 10000 | 20 | 10000 | 70 | 379694 | 10020 | 20 | 10000 | 20 | 30000 | 11 | 10000 | 10 |
10024 | 30033 | 10021 | 21 | 10000 | 20 | 10000 | 70 | 379694 | 10020 | 20 | 10000 | 20 | 30000 | 11 | 10000 | 10 |
10024 | 30033 | 10021 | 21 | 10000 | 20 | 10000 | 70 | 379694 | 10020 | 20 | 10000 | 20 | 30000 | 11 | 10000 | 10 |
10024 | 30033 | 10021 | 21 | 10000 | 20 | 10000 | 70 | 379694 | 10020 | 20 | 10000 | 20 | 30000 | 11 | 10000 | 10 |
10024 | 30033 | 10021 | 21 | 10000 | 20 | 10000 | 70 | 379694 | 10020 | 20 | 10000 | 20 | 30000 | 11 | 10000 | 10 |
Code:
sha512h2 q0, q1, v0.2d
movi v0.16b, 1
movi v1.16b, 2
(fused SUBS/B.cc loop)
Result (median cycles for code): 3.0033
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule simd uop (54) | dispatch int uop (56) | dispatch simd uop (57) | int uops in schedulers (59) | ldst uops in schedulers (5b) | dispatch uop (78) | map int uop (7c) | map simd uop (7e) | map int uop inputs (7f) | map simd uop inputs (81) | ? int output thing (e9) | ? simd retires (ee) | ? int retires (ef) |
10204 | 30033 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 379694 | 10100 | 200 | 10006 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 30033 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 379694 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 30033 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 379694 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 30033 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 379694 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 30033 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 379694 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 30033 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 379694 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 30033 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 379694 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 30033 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 379694 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 30033 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 379694 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
10204 | 30033 | 10101 | 101 | 10000 | 100 | 10000 | 300 | 379694 | 10100 | 200 | 10004 | 200 | 30012 | 1 | 10000 | 100 |
Result (median cycles for code): 3.0033
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule simd uop (54) | dispatch int uop (56) | dispatch simd uop (57) | int uops in schedulers (59) | ldst uops in schedulers (5b) | dispatch uop (78) | map int uop (7c) | map simd uop (7e) | map int uop inputs (7f) | map simd uop inputs (81) | ? int output thing (e9) | ? simd retires (ee) | ? int retires (ef) |
10024 | 30033 | 10021 | 21 | 10000 | 20 | 10000 | 70 | 379694 | 10020 | 20 | 10000 | 20 | 30000 | 11 | 10000 | 10 |
10024 | 30033 | 10021 | 21 | 10000 | 20 | 10000 | 70 | 379694 | 10020 | 20 | 10000 | 20 | 30105 | 11 | 10000 | 10 |
10024 | 30033 | 10021 | 21 | 10000 | 20 | 10000 | 70 | 379694 | 10020 | 20 | 10000 | 20 | 30000 | 11 | 10000 | 10 |
10024 | 30033 | 10021 | 21 | 10000 | 20 | 10000 | 70 | 379694 | 10020 | 20 | 10000 | 20 | 30000 | 11 | 10000 | 10 |
10024 | 30033 | 10021 | 21 | 10000 | 20 | 10000 | 70 | 379694 | 10020 | 20 | 10000 | 20 | 30000 | 11 | 10000 | 10 |
10024 | 30033 | 10021 | 21 | 10000 | 20 | 10000 | 70 | 379694 | 10020 | 20 | 10000 | 20 | 30000 | 11 | 10000 | 10 |
10024 | 30033 | 10021 | 21 | 10000 | 20 | 10000 | 70 | 379694 | 10020 | 20 | 10000 | 20 | 30000 | 11 | 10000 | 10 |
10024 | 30033 | 10021 | 21 | 10000 | 20 | 10000 | 70 | 379694 | 10020 | 20 | 10000 | 20 | 30096 | 11 | 10000 | 10 |
10024 | 30033 | 10021 | 21 | 10000 | 20 | 10000 | 70 | 379694 | 10020 | 20 | 10000 | 20 | 30000 | 11 | 10000 | 10 |
10024 | 30033 | 10021 | 21 | 10000 | 20 | 10000 | 70 | 379694 | 10020 | 20 | 10000 | 20 | 30000 | 11 | 10000 | 10 |
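The three looped tests above differ only in which operand names q0/v0 besides the destination, and the reported results match the cycle (02) counter divided by the loop trip count. The sketch below reproduces that arithmetic. It assumes 10000 loop iterations, a value inferred from the counters rather than stated on the page, and the helper name `cycles_per_iteration` is hypothetical.

```python
# Assumption: each looped test ran 10000 iterations (inferred, not stated).
ITERATIONS = 10_000

# (test code, cycle (02) counter) copied from the three looped tests above.
tests = [
    ("sha512h2 q0, q1, v2.2d", 20034),
    ("sha512h2 q0, q0, v1.2d", 30033),
    ("sha512h2 q0, q1, v0.2d", 30033),
]


def cycles_per_iteration(cycles, iterations=ITERATIONS):
    """Cycles spent per loop iteration, i.e. per sha512h2 in these tests."""
    return cycles / iterations


results = {code: cycles_per_iteration(cycles) for code, cycles in tests}

for code, value in results.items():
    print(f"{code}: {value:.4f}")
```

Read this way, the loop costs about two cycles per instruction when q0 appears only as the destination, and about three when q0 or v0 is also named as the second or third operand.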
Count: 8
Code:
movi v0.16b, 0
sha512h2 q0, q8, v9.2d
movi v1.16b, 0
sha512h2 q1, q8, v9.2d
movi v2.16b, 0
sha512h2 q2, q8, v9.2d
movi v3.16b, 0
sha512h2 q3, q8, v9.2d
movi v4.16b, 0
sha512h2 q4, q8, v9.2d
movi v5.16b, 0
sha512h2 q5, q8, v9.2d
movi v6.16b, 0
sha512h2 q6, q8, v9.2d
movi v7.16b, 0
sha512h2 q7, q8, v9.2d
movi v8.16b, 9
movi v9.16b, 10
(fused SUBS/B.cc loop)
Result (median cycles for code divided by count): 2.0004
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule simd uop (54) | dispatch int uop (56) | dispatch simd uop (57) | int uops in schedulers (59) | ldst uops in schedulers (5b) | dispatch uop (78) | map int uop (7c) | map simd uop (7e) | map int uop inputs (7f) | map ldst uop inputs (80) | map simd uop inputs (81) | ? int output thing (e9) | ? ldst retires (ed) | ? simd retires (ee) | ? int retires (ef) |
160204 | 160034 | 80101 | 101 | 80000 | 100 | 80000 | 300 | 1999769 | 80100 | 200 | 80005 | 200 | 0 | 240015 | 1 | 0 | 160000 | 100 |
160205 | 160068 | 80110 | 101 | 80009 | 100 | 80021 | 300 | 1999769 | 80100 | 200 | 80004 | 200 | 0 | 240015 | 1 | 0 | 160000 | 100 |
160204 | 160034 | 80101 | 101 | 80000 | 100 | 80000 | 300 | 1999769 | 80100 | 200 | 80004 | 200 | 0 | 240096 | 1 | 0 | 160000 | 100 |
160204 | 160034 | 80101 | 101 | 80000 | 100 | 80000 | 300 | 1999769 | 80100 | 200 | 80004 | 200 | 0 | 240012 | 1 | 0 | 160000 | 100 |
160204 | 160034 | 80101 | 101 | 80000 | 100 | 80000 | 300 | 1999769 | 80100 | 200 | 80004 | 200 | 0 | 240096 | 1 | 0 | 160000 | 100 |
160204 | 160034 | 80101 | 101 | 80000 | 100 | 80000 | 300 | 1999769 | 80100 | 200 | 80004 | 200 | 0 | 240012 | 1 | 0 | 160000 | 100 |
160204 | 160034 | 80101 | 101 | 80000 | 100 | 80000 | 300 | 1999769 | 80100 | 200 | 80004 | 200 | 0 | 240012 | 1 | 0 | 160000 | 100 |
160204 | 160034 | 80101 | 101 | 80000 | 100 | 80000 | 300 | 1999769 | 80100 | 200 | 80004 | 200 | 0 | 240012 | 1 | 0 | 160000 | 100 |
160204 | 160034 | 80101 | 101 | 80000 | 100 | 80000 | 300 | 1999769 | 80100 | 200 | 80004 | 200 | 0 | 240015 | 1 | 0 | 160000 | 100 |
160204 | 160034 | 80101 | 101 | 80000 | 100 | 80000 | 300 | 1999769 | 80100 | 200 | 80004 | 200 | 0 | 240012 | 1 | 0 | 160000 | 100 |
Result (median cycles for code divided by count): 2.0004
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule simd uop (54) | schedule ldst uop (55) | dispatch int uop (56) | dispatch simd uop (57) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | ldst uops in schedulers (5b) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map simd uop (7e) | map int uop inputs (7f) | map ldst uop inputs (80) | map simd uop inputs (81) | ? int output thing (e9) | ? ldst retires (ed) | ? simd retires (ee) | ? int retires (ef) |
160024 | 160034 | 80011 | 11 | 80000 | 0 | 10 | 80000 | 0 | 30 | 0 | 1999769 | 80010 | 20 | 0 | 80005 | 20 | 0 | 240000 | 1 | 0 | 160000 | 10 |
160025 | 160068 | 80020 | 11 | 80009 | 0 | 10 | 80021 | 0 | 30 | 0 | 1999769 | 80010 | 20 | 0 | 80000 | 20 | 0 | 240000 | 1 | 0 | 160000 | 10 |
160024 | 160034 | 80011 | 11 | 80000 | 0 | 10 | 80000 | 0 | 30 | 0 | 1999769 | 80010 | 20 | 0 | 80000 | 20 | 0 | 240000 | 1 | 0 | 160000 | 10 |
160024 | 160034 | 80011 | 11 | 80000 | 0 | 10 | 80000 | 0 | 30 | 0 | 1999769 | 80010 | 20 | 0 | 80000 | 20 | 0 | 240000 | 1 | 0 | 160000 | 10 |
160024 | 160034 | 80011 | 11 | 80000 | 0 | 10 | 80000 | 0 | 30 | 0 | 1999769 | 80010 | 20 | 0 | 80000 | 20 | 0 | 240000 | 1 | 0 | 160000 | 10 |
160024 | 160034 | 80011 | 11 | 80000 | 0 | 10 | 80000 | 0 | 30 | 0 | 1999919 | 80031 | 20 | 0 | 80032 | 20 | 0 | 240093 | 1 | 0 | 160000 | 10 |
160024 | 160034 | 80011 | 11 | 80000 | 0 | 10 | 80000 | 0 | 30 | 0 | 1999769 | 80010 | 20 | 0 | 80000 | 20 | 0 | 240000 | 1 | 0 | 160000 | 10 |
160025 | 160068 | 80020 | 11 | 80009 | 0 | 10 | 80021 | 0 | 30 | 0 | 2000069 | 80052 | 20 | 0 | 80060 | 3666 | 2253 | 132120 | 1773 | 838 | 87950 | 1719 |
160025 | 160068 | 80020 | 11 | 80009 | 0 | 10 | 80021 | 0 | 30 | 0 | 1999769 | 80010 | 20 | 0 | 80000 | 20 | 0 | 240000 | 1 | 0 | 160000 | 10 |
160024 | 160034 | 80011 | 11 | 80000 | 0 | 10 | 80000 | 0 | 30 | 0 | 1999769 | 80010 | 20 | 0 | 80000 | 20 | 0 | 240000 | 1 | 0 | 160000 | 10 |
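The "median cycles for code divided by count" figure can be reproduced from the cycle (02) column of the table above. This is a sketch of that calculation under one assumption: the loop ran 10000 iterations, which is inferred from the counters and not stated on the page. The variable names are hypothetical.

```python
from statistics import median

# cycle (02) values from the ten rows of the table above.
cycle_samples = [
    160034, 160068, 160034, 160034, 160034,
    160034, 160034, 160068, 160068, 160034,
]

COUNT = 8            # "Count: 8" from the test description
ITERATIONS = 10_000  # assumption: loop trip count inferred from the counters

# The median discards the occasional slower run (160068).
median_cycles = median(cycle_samples)

# Cycles per sha512h2 instruction.
result = median_cycles / ITERATIONS / COUNT

print(f"median cycles: {median_cycles:.0f}")
print(f"cycles per instruction: {result:.4f}")
```

Taking the median rather than the mean keeps the outlier rows, such as the one with disturbed counter values, from shifting the reported figure.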
Count: 16
Code:
sha512h2 q0, q16, v17.2d
sha512h2 q1, q16, v17.2d
sha512h2 q2, q16, v17.2d
sha512h2 q3, q16, v17.2d
sha512h2 q4, q16, v17.2d
sha512h2 q5, q16, v17.2d
sha512h2 q6, q16, v17.2d
sha512h2 q7, q16, v17.2d
sha512h2 q8, q16, v17.2d
sha512h2 q9, q16, v17.2d
sha512h2 q10, q16, v17.2d
sha512h2 q11, q16, v17.2d
sha512h2 q12, q16, v17.2d
sha512h2 q13, q16, v17.2d
sha512h2 q14, q16, v17.2d
sha512h2 q15, q16, v17.2d
movi v16.16b, 17
movi v17.16b, 18
(fused SUBS/B.cc loop)
Result (median cycles for code divided by count): 2.0002
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule simd uop (54) | schedule ldst uop (55) | dispatch int uop (56) | dispatch simd uop (57) | int uops in schedulers (59) | ldst uops in schedulers (5b) | dispatch uop (78) | map int uop (7c) | map simd uop (7e) | map int uop inputs (7f) | map ldst uop inputs (80) | map simd uop inputs (81) | ? int output thing (e9) | ? ldst retires (ed) | ? simd retires (ee) | ? int retires (ef) |
160204 | 320034 | 160101 | 101 | 160000 | 0 | 100 | 160000 | 300 | 3999769 | 160100 | 200 | 160004 | 200 | 0 | 480018 | 1 | 0 | 160000 | 100 |
160205 | 320068 | 160110 | 101 | 160009 | 0 | 100 | 160021 | 300 | 3999769 | 160100 | 200 | 160004 | 200 | 0 | 480108 | 1 | 0 | 160000 | 100 |
160204 | 320034 | 160101 | 101 | 160000 | 0 | 100 | 160000 | 300 | 3999769 | 160100 | 200 | 160004 | 200 | 0 | 480012 | 1 | 0 | 160000 | 100 |
160204 | 320034 | 160101 | 101 | 160000 | 0 | 100 | 160000 | 300 | 3999769 | 160100 | 200 | 160004 | 200 | 0 | 480096 | 1 | 0 | 160000 | 100 |
160204 | 320034 | 160101 | 101 | 160000 | 0 | 100 | 160000 | 300 | 3999769 | 160100 | 200 | 160004 | 200 | 0 | 480012 | 1 | 0 | 160000 | 100 |
160204 | 320034 | 160101 | 101 | 160000 | 0 | 100 | 160000 | 300 | 3999769 | 160100 | 200 | 160004 | 200 | 0 | 480012 | 1 | 0 | 160000 | 100 |
160204 | 320034 | 160101 | 101 | 160000 | 0 | 100 | 160000 | 300 | 3999919 | 160121 | 200 | 160032 | 200 | 0 | 480012 | 1 | 0 | 160000 | 100 |
160204 | 320034 | 160101 | 101 | 160000 | 0 | 100 | 160000 | 300 | 3999769 | 160100 | 200 | 160004 | 200 | 0 | 480012 | 1 | 0 | 160000 | 100 |
160204 | 320034 | 160101 | 101 | 160000 | 0 | 100 | 160000 | 300 | 3999769 | 160100 | 200 | 160004 | 200 | 0 | 480012 | 1 | 0 | 160000 | 100 |
160205 | 320068 | 160110 | 101 | 160009 | 0 | 100 | 160021 | 300 | 3999769 | 160100 | 200 | 160004 | 200 | 0 | 480012 | 1 | 0 | 160000 | 100 |
Result (median cycles for code divided by count): 2.0002
retire uop (01) | cycle (02) | schedule uop (52) | schedule int uop (53) | schedule simd uop (54) | dispatch int uop (56) | dispatch simd uop (57) | dispatch ldst uop (58) | int uops in schedulers (59) | simd uops in schedulers (5a) | ldst uops in schedulers (5b) | dispatch uop (78) | map int uop (7c) | map ldst uop (7d) | map simd uop (7e) | map int uop inputs (7f) | map ldst uop inputs (80) | map simd uop inputs (81) | ? int output thing (e9) | ? ldst retires (ed) | ? simd retires (ee) | ? int retires (ef) |
160024 | 320034 | 160011 | 11 | 160000 | 10 | 160000 | 0 | 30 | 0 | 3999919 | 160031 | 20 | 0 | 160036 | 20 | 0 | 480018 | 1 | 0 | 160000 | 10 |
160024 | 320034 | 160011 | 11 | 160000 | 10 | 160000 | 0 | 30 | 0 | 3999769 | 160010 | 20 | 0 | 160000 | 20 | 0 | 480000 | 1 | 0 | 160000 | 10 |
160024 | 320034 | 160011 | 11 | 160000 | 10 | 160000 | 0 | 30 | 0 | 3999769 | 160010 | 20 | 0 | 160000 | 20 | 0 | 480000 | 1 | 0 | 160000 | 10 |
160024 | 320034 | 160011 | 11 | 160000 | 10 | 160000 | 0 | 30 | 0 | 3999769 | 160010 | 20 | 0 | 160000 | 20 | 0 | 480096 | 1 | 0 | 160000 | 10 |
160024 | 320034 | 160011 | 11 | 160000 | 10 | 160000 | 0 | 30 | 0 | 3999769 | 160010 | 20 | 0 | 160000 | 20 | 0 | 480000 | 1 | 0 | 160000 | 10 |
160024 | 320034 | 160011 | 11 | 160000 | 10 | 160000 | 0 | 30 | 0 | 3999769 | 160010 | 20 | 0 | 160000 | 20 | 0 | 480000 | 1 | 0 | 160000 | 10 |
160024 | 320034 | 160011 | 11 | 160000 | 10 | 160000 | 0 | 30 | 0 | 3999919 | 160031 | 20 | 0 | 160032 | 20 | 0 | 480000 | 1 | 0 | 160000 | 10 |
160024 | 320034 | 160011 | 11 | 160000 | 10 | 160000 | 0 | 30 | 0 | 3999769 | 160010 | 20 | 0 | 160000 | 20 | 0 | 480000 | 1 | 0 | 160000 | 10 |
160024 | 320034 | 160011 | 11 | 160000 | 10 | 160000 | 0 | 30 | 0 | 3999919 | 160031 | 20 | 0 | 160036 | 20 | 0 | 480102 | 1 | 0 | 160000 | 10 |
160024 | 320034 | 160011 | 11 | 160000 | 10 | 160000 | 0 | 30 | 0 | 3999769 | 160010 | 20 | 0 | 160000 | 20 | 0 | 480096 | 1 | 0 | 160000 | 10 |