SVE Instruction List by Dougall Johnson
See "FDOT (4-way, vectors)" in the exploration tools

FDOT (4-way, vectors): 8-bit floating-point dot product to single-precision

FDOT Zda.S, Zn.B, Zm.B (SVE2+FP8DOT4+NS) (SSVE-FP8DOT4)

For each 32-bit element of Zda, multiply the four adjacent 8-bit float elements of Zn by the corresponding four elements of Zm, and add the four products to that single-precision element of Zda. The FP8 format of each 8-bit source operand (Zn and Zm) is selected independently by FPMR.

The operation is the same at every vector length; only the number of 32-bit elements changes: 4 for 128-bit SVE, 8 for 256-bit, 16 for 512-bit, 32 for 1024-bit, and 64 for 2048-bit.
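The per-element behaviour can be sketched as a small scalar model in Python. This is an illustrative sketch only, not the architectural pseudocode. The names decode_fp8 and fdot_4way are hypothetical. The two format arguments stand in for the FPMR selection, and the byte layouts assume the OCP-style E5M2 and E4M3 encodings. The model accumulates in Python floats and ignores any scaling, rounding, saturation, or NaN-propagation rules the hardware applies.

```python
import math

# Assumed FP8 layouts: (exponent bits, mantissa bits, exponent bias).
FORMATS = {"E5M2": (5, 2, 15), "E4M3": (4, 3, 7)}

def decode_fp8(byte, fmt):
    """Decode one FP8 byte (0..255) in the given format to a Python float."""
    exp_bits, man_bits, bias = FORMATS[fmt]
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man = byte & ((1 << man_bits) - 1)
    exp_max = (1 << exp_bits) - 1
    if fmt == "E5M2" and exp == exp_max:
        # E5M2 follows the IEEE pattern: infinities and NaNs.
        return sign * math.inf if man == 0 else math.nan
    if fmt == "E4M3" and exp == exp_max and man == (1 << man_bits) - 1:
        # E4M3 has no infinities; only the all-ones pattern is NaN.
        return math.nan
    if exp == 0:
        # Zero and subnormals.
        return sign * man * 2.0 ** (1 - bias - man_bits)
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)

def fdot_4way(zda, zn, zm, fmt_n="E4M3", fmt_m="E4M3"):
    """Simplified model: zda holds floats, zn and zm hold FP8 bytes."""
    assert len(zn) == len(zm) == 4 * len(zda)
    result = []
    for i, acc in enumerate(zda):
        # Each 32-bit element consumes the four bytes that overlap it.
        for j in range(4 * i, 4 * i + 4):
            acc += decode_fp8(zn[j], fmt_n) * decode_fp8(zm[j], fmt_m)
        result.append(acc)
    return result

# 128-bit vector: four 32-bit accumulators, sixteen FP8 bytes per source.
zda = [0.0, 1.0, 2.0, 3.0]
zn = [0x38] * 16   # 1.0 in E4M3
zm = [0x40] * 16   # 2.0 in E4M3
print(fdot_4way(zda, zn, zm))  # [8.0, 9.0, 10.0, 11.0]
```

Because the two source formats are chosen independently, the same byte can decode differently per operand: 0x3C is 1.5 when read as E4M3 and 1.0 when read as E5M2, so fdot_4way([0.0], [0x3C] * 4, [0x3C] * 4, "E4M3", "E5M2") gives [6.0] in this model.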

Inspired by and based on the x86/x64 SIMD Instruction List by Daytime.