SVE Instruction List by Dougall Johnson
See "LD1D (scalar plus immediate, consecutive registers)" in the exploration tools

LD1D (scalar plus immediate, consecutive registers): Contiguous load of doublewords to multiple consecutive vectors (immediate index)

LD1D { Zt1.D, Zt2.D, Zt3.D, Zt4.D }, PNg/Z, [Xn{, #imm, MUL VL}] (SVE2.1 (SME2+S

128-bit SVE

Load 64-bit values from the memory operand [Xn{, #imm, MUL VL}] into the 64-bit elements of the four consecutive registers Zt1, Zt2, Zt3, and Zt4. The predicate PNg is first decoded from its predicate-as-counter representation into a quadruple-length predicate. Where the predicate bit corresponding to an element is zero, the load for that element is skipped (so it cannot cause a fault) and the element is set to zero. The register number of Zt1 must be divisible by four.
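The behaviour described above can be sketched as a small Python model. This is an illustrative sketch, not the architectural pseudocode. It assumes the predicate has already been decoded into a quadruple-length list of booleans (one per 64-bit element), that `base_addr` is the effective address with any immediate offset already applied, and that elements are laid out contiguously across the four registers. The names `ld1d_x4`, `read_u64`, `mask`, and `zt1` are hypothetical and chosen only for this example.

```python
from typing import Callable, Dict, List


def ld1d_x4(
    read_u64: Callable[[int], int],  # hypothetical memory reader; may raise on a bad address
    base_addr: int,                  # assumed: effective address, immediate already applied
    mask: List[bool],                # assumed: already-decoded quadruple-length predicate
    vl_bits: int,                    # vector length in bits (128, 256, ..., 2048)
    zt1: int = 0,                    # number of the first destination register
) -> Dict[int, List[int]]:
    """Model a predicated contiguous load of doublewords into four registers."""
    if zt1 % 4 != 0:
        raise ValueError("first destination register number must be divisible by four")
    elems = vl_bits // 64  # 64-bit elements per register
    if len(mask) != 4 * elems:
        raise ValueError("mask must have one entry per element of all four registers")

    regs: Dict[int, List[int]] = {}
    for r in range(4):
        values = []
        for e in range(elems):
            idx = r * elems + e  # element index across the four registers
            if mask[idx]:
                values.append(read_u64(base_addr + 8 * idx))
            else:
                # Inactive element: memory is never touched, so no fault can occur.
                values.append(0)
        regs[zt1 + r] = values
    return regs


# Usage: 128-bit vectors (2 doublewords each), only the first 5 elements active.
memory = {0x1000 + 8 * i: 100 + i for i in range(5)}  # later addresses are unmapped
mask = [True] * 5 + [False] * 3
result = ld1d_x4(memory.__getitem__, 0x1000, mask, 128, zt1=4)
print(result)
```

Because `memory.__getitem__` raises `KeyError` for unmapped addresses, the example also shows that inactive elements are never read: the three masked-off elements lie outside the mapped range, yet the load completes and leaves zeros in their place.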

256-bit SVE

Same behaviour as described for 128-bit SVE above. Each register holds four 64-bit elements, so up to 16 doublewords are loaded.

512-bit SVE

Same behaviour as described for 128-bit SVE above. Each register holds eight 64-bit elements, so up to 32 doublewords are loaded.

Larger sizes

1024-bit SVE

Same behaviour as described for 128-bit SVE above. Each register holds 16 64-bit elements, so up to 64 doublewords are loaded.

2048-bit SVE

Same behaviour as described for 128-bit SVE above. Each register holds 32 64-bit elements, so up to 128 doublewords are loaded.

Inspired by and based on the x86/x64 SIMD Instruction List by Daytime.