BFMLALB (vectors): BFloat16 floating-point multiply-add long to single-precision (bottom)

BFMLALB Zda.S, Zn.H, Zm.H (SVE+BF16) (SME+BF16)

svfloat32_t svbfmlalb[_f32](svfloat32_t op1, svbfloat16_t op2, svbfloat16_t op3)

128-bit SVE

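For each even-numbered BFloat16 element, multiply the element of Zn.H (op2) by the corresponding element of Zm.H (op3), add the product to the overlapping 32-bit float in Zda.S (op1), and write the result back to Zda.S.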
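The per-element behaviour described above can be sketched as a portable scalar model for a 128-bit vector (four 32-bit lanes, eight BFloat16 lanes). This is only an illustration, not the hardware definition: the names `bf16_to_f32` and `bfmlalb_128` are hypothetical, BFloat16 values are passed as raw `uint16_t` bit patterns, and the model ignores FPCR-controlled behaviour such as NaN handling and denormal flushing. It uses a plain multiply followed by an add; the product of two BFloat16 values is exact in single precision unless it overflows or underflows, so for ordinary values a single rounding occurs at the add.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Widen a BFloat16 bit pattern to float. BFloat16 is the upper 16 bits
   of an IEEE-754 binary32 value, so widening is a 16-bit left shift. */
static float bf16_to_f32(uint16_t bits)
{
    uint32_t wide = (uint32_t)bits << 16;
    float f;
    memcpy(&f, &wide, sizeof f);
    return f;
}

/* Hypothetical scalar model of BFMLALB at a 128-bit vector length.
   zda: 4 float lanes (accumulator, updated in place)
   zn, zm: 8 BFloat16 lanes each; only even-numbered lanes are read. */
void bfmlalb_128(float zda[4], const uint16_t zn[8], const uint16_t zm[8])
{
    assert(zda && zn && zm);
    for (int i = 0; i < 4; i++) {
        float a = bf16_to_f32(zn[2 * i]);  /* even (bottom) element of Zn.H */
        float b = bf16_to_f32(zm[2 * i]);  /* even (bottom) element of Zm.H */
        zda[i] = zda[i] + a * b;           /* accumulate into the overlapping .S lane */
    }
}
```

In this model the odd-numbered BFloat16 lanes never influence the result, which is what "bottom" refers to in the instruction name.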

Larger sizes

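At larger vector lengths the operation is the same per element: each even-numbered BFloat16 element of Zn.H (op2) is multiplied by the corresponding element of Zm.H (op3), the product is added to the overlapping 32-bit float in Zda.S (op1), and the result is written back to Zda.S.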
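To illustrate how the lane counts scale, here is a hedged scalar sketch parameterised on the vector length in bits. SVE vector lengths are multiples of 128 bits up to 2048, so a vector holds `vl_bits / 32` single-precision lanes and twice as many BFloat16 lanes. The function names `widen_bf16`, `bfmlalb_f32_lanes` and `bfmlalb_vl` are hypothetical, BFloat16 values are raw `uint16_t` bit patterns, and FPCR-dependent behaviour is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Widen a BFloat16 bit pattern (upper half of a binary32) to float. */
static float widen_bf16(uint16_t bits)
{
    uint32_t wide = (uint32_t)bits << 16;
    float f;
    memcpy(&f, &wide, sizeof f);
    return f;
}

/* Number of 32-bit accumulator lanes for a given vector length in bits. */
size_t bfmlalb_f32_lanes(unsigned vl_bits)
{
    assert(vl_bits >= 128 && vl_bits <= 2048 && vl_bits % 128 == 0);
    return vl_bits / 32;
}

/* Hypothetical vector-length-generic model of BFMLALB.
   zda holds vl_bits/32 floats; zn and zm hold vl_bits/16 BFloat16 values. */
void bfmlalb_vl(float *zda, const uint16_t *zn, const uint16_t *zm,
                unsigned vl_bits)
{
    size_t lanes = bfmlalb_f32_lanes(vl_bits);
    for (size_t i = 0; i < lanes; i++) {
        /* Only the even-numbered BFloat16 lane of each 32-bit container is used. */
        zda[i] += widen_bf16(zn[2 * i]) * widen_bf16(zm[2 * i]);
    }
}
```

With `vl_bits` set to 256, for example, the model processes eight accumulator lanes and reads BFloat16 lanes 0, 2, 4, ..., 14 of each source.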