SVE Instruction List by Dougall Johnson
LD1D (scalar plus scalar, consecutive registers): Contiguous load of doublewords to multiple consecutive vectors (scalar index)
LD1D { Zt1.D, Zt2.D, Zt3.D, Zt4.D }, PNg/Z, [Xn, Xm, LSL #3] (SVE2.1 (SME2+S
128-bit SVE
[Diagram: LD1D at 128-bit vector length, showing the memory operand (1) and the four destination registers (2) to (5)]
Load 64-bit values from the memory operand (1) into the 64-bit elements of four consecutive registers (2), (3), (4), and (5). The predicate is first decoded from its predicate-as-counter representation into a quadruple-length predicate. Where the predicate bit corresponding to an element is zero, the load for that element is skipped, so it cannot cause a fault, and the element is set to zero. The number of the first destination register (2) must be divisible by four.
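The behaviour described above can be sketched as a small reference model. This is an illustration only, not Arm's pseudocode: the function name `ld1d_x4` and its parameters are hypothetical, memory is assumed to be a little-endian byte string, and the predicate is passed in already decoded to its quadruple-length form (the predicate-as-counter decoding itself is not modelled). Elements are assumed to be taken from consecutive doublewords starting at Xn + (Xm << 3), filling the first register, then the second, and so on.

```python
def ld1d_x4(memory, zt, xn, xm, pred, vl_bits):
    """Sketch of LD1D { Zt1.D-Zt4.D }, PNg/Z, [Xn, Xm, LSL #3].

    memory  : bytes, indexed by byte address (assumed little-endian)
    zt      : number of the first destination register
    xn, xm  : base address and doubleword index
    pred    : decoded quadruple-length predicate, one bool per element
    vl_bits : vector length in bits (128, 256, ..., 2048)
    Returns {register number: list of 64-bit element values}.
    """
    if zt % 4 != 0:
        raise ValueError("first destination register must be divisible by four")
    elems = vl_bits // 64                  # 64-bit elements per register
    if len(pred) != 4 * elems:
        raise ValueError("predicate must cover all four registers")

    base = xn + (xm << 3)                  # [Xn, Xm, LSL #3]
    regs = {}
    for r in range(4):
        lanes = []
        for e in range(elems):
            i = r * elems + e              # index into the decoded predicate
            if pred[i]:
                addr = base + 8 * i
                if addr + 8 > len(memory):
                    # stand-in for a memory fault on an active element
                    raise IndexError("access outside modelled memory")
                lanes.append(int.from_bytes(memory[addr:addr + 8], "little"))
            else:
                lanes.append(0)            # inactive: no access, element zeroed
        regs[zt + r] = lanes
    return regs


# Eight doublewords 100..107 in memory; only the first five elements active.
mem = b"".join(v.to_bytes(8, "little") for v in range(100, 108))
z = ld1d_x4(mem, zt=8, xn=0, xm=0, pred=[True] * 5 + [False] * 3, vl_bits=128)
print(z)  # {8: [100, 101], 9: [102, 103], 10: [104, 0], 11: [0, 0]}
```

Because inactive elements never touch memory in this model, an address past the end of `mem` only raises when the corresponding predicate element is active, which mirrors the "cannot cause a fault" wording above.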
256-bit SVE
[Diagram: LD1D at 256-bit vector length, showing the memory operand (1) and the four destination registers (2) to (5)]
512-bit SVE
[Diagram: LD1D at 512-bit vector length, showing the memory operand (1) and the four destination registers (2) to (5)]
Larger sizes
1024-bit SVE
[Diagram: LD1D at 1024-bit vector length, showing the memory operand (1) and the four destination registers (2) to (5)]
2048-bit SVE
[Diagram: LD1D at 2048-bit vector length, showing the memory operand (1) and the four destination registers (2) to (5)]
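The only thing that changes between the vector lengths listed above is how many 64-bit elements each register holds, and therefore how much memory a fully active load covers. A short sketch of that arithmetic follows; the helper name `footprint` is hypothetical, and the byte counts assume every predicate element is active.

```python
def footprint(vl_bits):
    """Return (elements per register, total elements, bytes covered)
    for a four-register LD1D at the given vector length in bits."""
    elems_per_reg = vl_bits // 64      # 64-bit elements in one Z register
    total_elems = 4 * elems_per_reg    # four consecutive destination registers
    total_bytes = 8 * total_elems      # 8 bytes per doubleword
    return elems_per_reg, total_elems, total_bytes


for vl_bits in (128, 256, 512, 1024, 2048):
    per_reg, total, nbytes = footprint(vl_bits)
    print(f"{vl_bits}-bit: {per_reg} per register, {total} elements, {nbytes} bytes")
```

At 128 bits this gives 2 elements per register and 64 bytes in total; at 2048 bits it gives 32 elements per register and 1024 bytes.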
Inspired by and based on the x86/x64 SIMD Instruction List by Daytime.