SVE Instruction List by Dougall Johnson
See "FMLALLBT (indexed)" in the exploration tools

FMLALLBT (indexed): 8-bit floating-point multiply-add by indexed element to single-precision (bottom top)

FMLALLBT Zda.S, Zn.B, Zm.B[imm] (SVE2+FP8FMA+NS) (SSVE-FP8FMA)

128-bit SVE

For each 32-bit float element of the result Zda (4), multiply the 8-bit float at byte position 1 of the corresponding 4-byte group in Zn (2) by the 8-bit float selected from Zm (1), and add the product to the 32-bit float accumulator Zda (3). Within each 128-bit segment, imm selects which 8-bit element of Zm (1) is used. FPMR selects the FP8 format of each 8-bit source operand independently.
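The lane selection described above can be sketched as a small Python model. This is an illustrative sketch, not the architectural pseudocode: the function name `fmlallbt_indexed` is hypothetical, the 8-bit inputs are assumed to be already decoded from FP8 into Python floats (the FPMR format selection is not modelled), rounding behaviour is ignored, and the 0 to 15 range for imm is an assumption based on a 128-bit segment holding 16 byte lanes.

```python
def fmlallbt_indexed(zda, zn, zm, imm):
    """Simplified lane-selection model (hypothetical helper, not Arm pseudocode).

    zda: list of floats, one per 32-bit accumulator element
    zn, zm: lists of floats, one per byte lane, assumed already decoded from FP8
    imm: byte lane within each 128-bit segment of zm (assumed range 0..15)
    """
    if not 0 <= imm < 16:
        raise ValueError("imm must select one of the 16 byte lanes in a segment")
    if len(zda) % 4 != 0:
        raise ValueError("vector length must be a multiple of 128 bits")
    if len(zn) != 4 * len(zda) or len(zm) != 4 * len(zda):
        raise ValueError("zn and zm need four byte lanes per 32-bit element")

    result = []
    for e, acc in enumerate(zda):
        segment = e // 4               # four 32-bit elements per 128-bit segment
        a = zn[4 * e + 1]              # byte position 1 of this element's 4-byte group
        b = zm[16 * segment + imm]     # indexed byte lane within the same segment
        result.append(acc + a * b)     # multiply and accumulate into 32-bit float
    return result


# Example: a 256-bit vector (8 accumulator elements, 32 byte lanes), imm = 3.
zda = [1.0] * 8
zn = [float(i) for i in range(32)]
zm = [0.0] * 32
zm[3] = 2.0        # lane 3 of the first 128-bit segment
zm[16 + 3] = -1.0  # lane 3 of the second 128-bit segment
print(fmlallbt_indexed(zda, zn, zm, 3))
```

In the example, the first four results use the multiplier from the first 128-bit segment of `zm` and the last four use the one from the second segment, which mirrors the per-segment indexing in the description.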

256-bit, 512-bit, 1024-bit and 2048-bit SVE

The operation is the same at every vector length: it repeats for each 128-bit segment, exactly as described for 128-bit SVE above.

Inspired by and based on the x86/x64 SIMD Instruction List by Daytime.