SVE Instruction List by Dougall Johnson
LD1H (scalar plus vector): Gather load unsigned halfwords to vector (vector index)
LD1H { Zt.D }, Pg/Z, [Xn, Zm.D, LSL #1] (SVE+NS
svint64_t svld1uh_gather_[s64]index_s64(svbool_t pg, const uint16_t *base, svint64_t indices)
svuint64_t svld1uh_gather_[s64]index_u64(svbool_t pg, const uint16_t *base, svint64_t indices)
svint64_t svld1uh_gather_[u64]index_s64(svbool_t pg, const uint16_t *base, svuint64_t indices)
svuint64_t svld1uh_gather_[u64]index_u64(svbool_t pg, const uint16_t *base, svuint64_t indices)
128-bit, 256-bit, 512-bit SVE, and larger sizes (1024-bit, 2048-bit SVE)
Gather (load) and zero-extend 16-bit values into the 64-bit elements of (3) (Zt/result). Each address is the base address (Xn/base) plus the corresponding 64-bit index from (2) (Zm/indices), multiplied by two (LSL #1). If the predicate bit from (1) (Pg/pg) corresponding to an element of (3) is zero, that load is skipped and cannot cause a fault, and the element is set to zero. The behavior is the same at every vector length; only the number of 64-bit elements changes.
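The per-element behavior described above can be sketched as a scalar reference model in portable C. This is only an illustration, not the ACLE intrinsic: the function name ld1h_gather_index_ref, the one-bool-per-lane predicate, and the explicit lane count n (vector length in bits divided by 64) are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of LD1H { Zt.D }, Pg/Z, [Xn, Zm.D, LSL #1].
 * Not part of ACLE. The predicate is modelled as one bool per 64-bit lane,
 * and n is the number of 64-bit lanes (e.g. 2 for 128-bit SVE). */
static void ld1h_gather_index_ref(uint64_t *result, const bool *pg,
                                  const uint16_t *base,
                                  const int64_t *indices, size_t n)
{
    assert(result != NULL && pg != NULL && base != NULL && indices != NULL);
    for (size_t i = 0; i < n; i++) {
        if (pg[i]) {
            /* base is a uint16_t pointer, so indexing scales the index by
             * two bytes (the LSL #1). The cast zero-extends to 64 bits. */
            result[i] = (uint64_t)base[indices[i]];
        } else {
            /* Inactive lane: no memory access is made, result is zeroed. */
            result[i] = 0;
        }
    }
}
```

Because inactive lanes are never dereferenced, an index that would point outside the table is harmless as long as its predicate element is false. This matches the "cannot cause a fault" wording above. With the signed-index form, negative indices address halfwords below base.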
Inspired by and based on the x86/x64 SIMD Instruction List by Daytime.