SVE Instruction List by Dougall Johnson
LD1RQW (scalar plus scalar): Contiguous load and replicate four words (scalar index)
LD1RQW { Zt.S }, Pg/Z, [Xn, Xm, LSL #2] (SVE) (SME)
svfloat32_t svld1rq[_f32](svbool_t pg, const float32_t *base)
svint32_t svld1rq[_s32](svbool_t pg, const int32_t *base)
svuint32_t svld1rq[_u32](svbool_t pg, const uint32_t *base)
128-bit SVE
[Diagram, 128-bit vectors: (1) governing predicate Pg, (2) memory operand [Xn, Xm, LSL #2], (3) destination register Zt]
Load each 32-bit element in the low 128-bit segment of (3) from the memory operand (2), or zero the element if the corresponding predicate bit in (1) is zero, then replicate that 128-bit segment to fill the register, ignoring the predicate. If the predicate bit corresponding to an element in the low 128-bit segment of (3) is zero, that load is skipped, and cannot cause a fault.
256-bit SVE
[Diagram, 256-bit vectors: (1) governing predicate Pg, (2) memory operand [Xn, Xm, LSL #2], (3) destination register Zt]
Load each 32-bit element in the low 128-bit segment of (3) from the memory operand (2), or zero the element if the corresponding predicate bit in (1) is zero, then replicate that 128-bit segment to fill the register, ignoring the predicate. If the predicate bit corresponding to an element in the low 128-bit segment of (3) is zero, that load is skipped, and cannot cause a fault.
512-bit SVE
[Diagram, 512-bit vectors: (1) governing predicate Pg, (2) memory operand [Xn, Xm, LSL #2], (3) destination register Zt]
Load each 32-bit element in the low 128-bit segment of (3) from the memory operand (2), or zero the element if the corresponding predicate bit in (1) is zero, then replicate that 128-bit segment to fill the register, ignoring the predicate. If the predicate bit corresponding to an element in the low 128-bit segment of (3) is zero, that load is skipped, and cannot cause a fault.
Larger sizes
1024-bit SVE
[Diagram, 1024-bit vectors: (1) governing predicate Pg, (2) memory operand [Xn, Xm, LSL #2], (3) destination register Zt]
Load each 32-bit element in the low 128-bit segment of (3) from the memory operand (2), or zero the element if the corresponding predicate bit in (1) is zero, then replicate that 128-bit segment to fill the register, ignoring the predicate. If the predicate bit corresponding to an element in the low 128-bit segment of (3) is zero, that load is skipped, and cannot cause a fault.
2048-bit SVE
[Diagram, 2048-bit vectors: (1) governing predicate Pg, (2) memory operand [Xn, Xm, LSL #2], (3) destination register Zt]
Load each 32-bit element in the low 128-bit segment of (3) from the memory operand (2), or zero the element if the corresponding predicate bit in (1) is zero, then replicate that 128-bit segment to fill the register, ignoring the predicate. If the predicate bit corresponding to an element in the low 128-bit segment of (3) is zero, that load is skipped, and cannot cause a fault.
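The behaviour described above is the same at every vector length, so it can be modelled in plain scalar C. The following is only a rough sketch, not the architectural pseudocode and not an ACLE API. The function name `ld1rqw_model` and its parameters are hypothetical. It also assumes a simplified predicate with one flag per 32-bit element, whereas a real SVE predicate holds one bit per byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of LD1RQW (scalar plus scalar).
 * zt:      destination, vl_bits / 32 words
 * vl_bits: vector length in bits (a multiple of 128, from 128 to 2048)
 * pg:      simplified predicate, one flag per word of the low 128-bit segment
 * xn:      base address
 * xm:      index in words (LSL #2 scales it to bytes)
 */
static void ld1rqw_model(uint32_t *zt, size_t vl_bits, const uint8_t pg[4],
                         const unsigned char *xn, uint64_t xm)
{
    assert(vl_bits >= 128 && vl_bits <= 2048 && vl_bits % 128 == 0);

    const unsigned char *addr = xn + (xm << 2); /* [Xn, Xm, LSL #2] */
    uint32_t seg[4];

    for (size_t i = 0; i < 4; i++) {
        if (pg[i]) {
            /* Active element: load one 32-bit word. */
            memcpy(&seg[i], addr + 4 * i, sizeof seg[i]);
        } else {
            /* Inactive element: zero it and never touch memory. */
            seg[i] = 0;
        }
    }

    /* Replicate the 128-bit segment across the register.
     * The predicate is not consulted here. */
    for (size_t s = 0; s < vl_bits / 128; s++) {
        memcpy(zt + 4 * s, seg, sizeof seg);
    }
}
```

For example, with `vl_bits = 256`, `pg = {1, 0, 1, 1}`, `xm = 2`, and `xn` pointing at the words 10, 11, ..., 17, the model fills `zt` with 12, 0, 14, 15, 12, 0, 14, 15. The second word of each segment is zero because its load was skipped.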
Inspired by and based on the x86/x64 SIMD Instruction List by Daytime.