SVE Instruction List by Dougall Johnson
FMMLA: Floating-point matrix multiply-accumulate
FMMLA Zda.S, Zn.S, Zm.S (SVE+F32MM+NS)
svfloat32_t svmmla[_f32](svfloat32_t op1, svfloat32_t op2, svfloat32_t op3)
128-bit, 256-bit, 512-bit, 1024-bit and 2048-bit SVE
Within each 128-bit segment, interpreting the 32-bit floats from Zn (1), Zm (2) and Zda (3) as 2-by-2 matrices, multiply (1) by (2), add the resulting 2-by-2 matrix to (3), and write the result back to Zda (4). In terms of the intrinsic, Zda is op1, Zn is op2 and Zm is op3. The operation is the same at every vector length; only the number of 128-bit segments changes. See the documentation for the exact order of operations.
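The per-segment operation described above can be sketched as a scalar reference model in plain C. This is an illustration under stated assumptions, not the architectural definition. The function names `fmmla_segment_f32` and `fmmla_model_f32` are hypothetical. The element layout is an assumption: each matrix is taken to be stored row-major, with result element (i, j) formed from row i of Zn and row j of Zm, which amounts to Zda + Zn × transpose(Zm). The rounding order is also an assumption: each product is rounded, the two products are summed, and that sum is added to the accumulator. Check both against the Arm documentation before relying on them.

```c
#include <stddef.h>

/* Scalar model of one 128-bit segment of FMMLA on 32-bit floats.
 * ASSUMPTION: each segment holds a 2x2 matrix in row-major order, and
 * result(i, j) = acc(i, j) + dot(row i of n, row j of m),
 * i.e. acc + n * transpose(m). */
static void fmmla_segment_f32(float acc[4], const float n[4], const float m[4])
{
    float out[4];
    for (int i = 0; i < 2; i++) {
        for (int j = 0; j < 2; j++) {
            /* ASSUMPTION: unfused arithmetic; products are formed first,
             * summed, then added to the accumulator element. */
            float prod0 = n[2 * i + 0] * m[2 * j + 0];
            float prod1 = n[2 * i + 1] * m[2 * j + 1];
            out[2 * i + j] = acc[2 * i + j] + (prod0 + prod1);
        }
    }
    for (int k = 0; k < 4; k++) {
        acc[k] = out[k];
    }
}

/* Apply the segment model to every 128-bit segment of a vector that is
 * vl_bits wide (128, 256, ..., 2048). zda is updated in place. */
void fmmla_model_f32(float *zda, const float *zn, const float *zm, size_t vl_bits)
{
    size_t segments = vl_bits / 128;
    for (size_t s = 0; s < segments; s++) {
        fmmla_segment_f32(zda + 4 * s, zn + 4 * s, zm + 4 * s);
    }
}
```

Calling `fmmla_model_f32(zda, zn, zm, 256)` processes two independent segments. The model is meant only for checking expectations against small inputs; a compiler that contracts floating-point expressions into fused multiply-adds may round differently from the sketch.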
Inspired by and based on the x86/x64 SIMD Instruction List by Daytime.