SVE Instruction List by Dougall Johnson
LD1H (scalar plus immediate, consecutive registers): Contiguous load of halfwords to multiple consecutive vectors (immediate index)
LD1H { Zt1.H, Zt2.H, Zt3.H, Zt4.H }, PNg/Z, [Xn{, #imm, MUL VL}] (SVE2.1 (SME2+S
128-bit SVE
Load 16-bit values from the memory operand [Xn{, #imm, MUL VL}] into the 16-bit elements of four consecutive registers Zt1, Zt2, Zt3, and Zt4. The predicate PNg is first decoded from its predicate-as-counter representation to a quadruple-length predicate. If the predicate bit corresponding to an element is zero, the load of that element is skipped and cannot cause a fault, and the element is set to zero. The first destination register number (Zt1) must be divisible by four.
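The behaviour described above can be sketched as a small scalar model. This is an illustrative sketch, not the architectural pseudocode: the function name `ld1h_x4`, the dictionary-based memory, and the list-of-booleans mask are all assumptions made for the example. In particular, the predicate-as-counter decoding step is not modelled; the sketch takes the already-decoded quadruple-length predicate as input, and it does not check which immediates are encodable.

```python
def ld1h_x4(memory, base, mask, vl_bits=128, imm=0):
    """Scalar model of LD1H { Zt1.H-Zt4.H }, PNg/Z, [Xn, #imm, MUL VL].

    memory  : dict of byte address -> 16-bit value (hypothetical memory model;
              a missing address stands in for an access that would fault)
    base    : byte address held in Xn
    mask    : already-decoded quadruple-length predicate, one bool per halfword
    vl_bits : vector length in bits (128, 256, ..., 2048)
    imm     : offset in multiples of the vector length in bytes
    """
    elements = vl_bits // 16      # halfword elements per vector register
    vl_bytes = vl_bits // 8
    if len(mask) != 4 * elements:
        raise ValueError("mask must cover four registers' worth of elements")

    start = base + imm * vl_bytes
    regs = []
    for r in range(4):            # Zt1, Zt2, Zt3, Zt4 in order
        reg = []
        for e in range(elements):
            i = r * elements + e  # index into the contiguous halfword stream
            if mask[i]:
                reg.append(memory[start + 2 * i])  # active: load the halfword
            else:
                reg.append(0)     # inactive: memory is not touched, element is zero
        regs.append(reg)
    return regs


# Usage: 20 halfwords are mapped, and only the first 20 elements are active,
# so the 12 inactive elements never access the unmapped addresses.
memory = {0x1000 + 2 * i: 100 + i for i in range(20)}
mask = [True] * 20 + [False] * 12
zt1, zt2, zt3, zt4 = ld1h_x4(memory, 0x1000, mask, vl_bits=128)
```

With a 128-bit vector length each register holds eight halfwords, so `zt1` and `zt2` are filled completely, `zt3` receives four loaded values followed by four zeros, and `zt4` is all zeros.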
256-bit SVE
Load 16-bit values from the memory operand [Xn{, #imm, MUL VL}] into the 16-bit elements of four consecutive registers Zt1, Zt2, Zt3, and Zt4. The predicate PNg is first decoded from its predicate-as-counter representation to a quadruple-length predicate. If the predicate bit corresponding to an element is zero, the load of that element is skipped and cannot cause a fault, and the element is set to zero. The first destination register number (Zt1) must be divisible by four.
512-bit SVE
Load 16-bit values from the memory operand [Xn{, #imm, MUL VL}] into the 16-bit elements of four consecutive registers Zt1, Zt2, Zt3, and Zt4. The predicate PNg is first decoded from its predicate-as-counter representation to a quadruple-length predicate. If the predicate bit corresponding to an element is zero, the load of that element is skipped and cannot cause a fault, and the element is set to zero. The first destination register number (Zt1) must be divisible by four.
Larger sizes
1024-bit SVE
Load 16-bit values from the memory operand [Xn{, #imm, MUL VL}] into the 16-bit elements of four consecutive registers Zt1, Zt2, Zt3, and Zt4. The predicate PNg is first decoded from its predicate-as-counter representation to a quadruple-length predicate. If the predicate bit corresponding to an element is zero, the load of that element is skipped and cannot cause a fault, and the element is set to zero. The first destination register number (Zt1) must be divisible by four.
2048-bit SVE
Load 16-bit values from the memory operand [Xn{, #imm, MUL VL}] into the 16-bit elements of four consecutive registers Zt1, Zt2, Zt3, and Zt4. The predicate PNg is first decoded from its predicate-as-counter representation to a quadruple-length predicate. If the predicate bit corresponding to an element is zero, the load of that element is skipped and cannot cause a fault, and the element is set to zero. The first destination register number (Zt1) must be divisible by four.
Inspired by and based on the x86/x64 SIMD Instruction List by Daytime.