SVE Instruction List by Dougall Johnson
# BFDOT (vectors): BFloat16 floating-point dot product

BFDOT Zda.S, Zn.H, Zm.H (SVE+BF16) (SME+BF16)

svfloat32_t svbfdot[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3)

## 128-bit SVE

For each pair of BFloat16s from (1) and (2), compute the dot product, then add the result to the corresponding 32-bit float accumulator from (3), setting (4) to the total. See the documentation for the exact order of operations.
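As a rough illustration of the per-lane arithmetic described above, here is a minimal scalar sketch in C. It is not the architectural definition: the helper names `bf16_to_f32` and `bfdot_model` are hypothetical, vectors are modelled as plain arrays, and each product and sum is rounded as an ordinary `float` operation, which need not match the rounding and order of operations the instruction actually uses. It assumes each 32-bit accumulator lane `i` pairs with BFloat16 elements `2*i` and `2*i + 1` of the two source vectors.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Widen a BFloat16 bit pattern to float32. BFloat16 is the upper 16 bits of
   an IEEE 754 binary32 value, so a 16-bit left shift converts it exactly. */
static float bf16_to_f32(uint16_t bits)
{
    uint32_t wide = (uint32_t)bits << 16;
    float out;
    memcpy(&out, &wide, sizeof out);
    return out;
}

/* Hypothetical scalar model of BFDOT Zda.S, Zn.H, Zm.H.
   zda holds vl_bits / 32 float32 accumulators; zn and zm each hold
   vl_bits / 16 BFloat16 bit patterns. Rounding is simplified (see lead-in). */
static void bfdot_model(float *zda, const uint16_t *zn, const uint16_t *zm,
                        unsigned vl_bits)
{
    /* SVE vector lengths are multiples of 128 bits, from 128 up to 2048. */
    assert(vl_bits >= 128 && vl_bits <= 2048 && vl_bits % 128 == 0);

    unsigned lanes = vl_bits / 32;
    for (unsigned i = 0; i < lanes; i++) {
        /* Two-element dot product of the BFloat16 pair in this lane. */
        float p0 = bf16_to_f32(zn[2 * i])     * bf16_to_f32(zm[2 * i]);
        float p1 = bf16_to_f32(zn[2 * i + 1]) * bf16_to_f32(zm[2 * i + 1]);
        /* Add the pair's dot product to the existing accumulator. */
        zda[i] += p0 + p1;
    }
}
```

The same loop covers every vector size on this page; only `vl_bits` changes, so a 128-bit vector has 4 accumulator lanes and a 2048-bit vector has 64. In terms of the intrinsic signature above, `zda` plays the role of `op1`, and `zn` and `zm` play the roles of `op2` and `op3`.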

## 256-bit SVE

For each pair of BFloat16s from (1) and (2), compute the dot product, then add the result to the corresponding 32-bit float accumulator from (3), setting (4) to the total. See the documentation for the exact order of operations.

## 512-bit SVE

For each pair of BFloat16s from (1) and (2), compute the dot product, then add the result to the corresponding 32-bit float accumulator from (3), setting (4) to the total. See the documentation for the exact order of operations.

## 1024-bit SVE

For each pair of BFloat16s from (1) and (2), compute the dot product, then add the result to the corresponding 32-bit float accumulator from (3), setting (4) to the total. See the documentation for the exact order of operations.

## 2048-bit SVE

For each pair of BFloat16s from (1) and (2), compute the dot product, then add the result to the corresponding 32-bit float accumulator from (3), setting (4) to the total. See the documentation for the exact order of operations.

Inspired by and based on the x86/x64 SIMD Instruction List by Daytime.