SVE Instruction List by Dougall Johnson
CLASTA (scalar): Conditionally extract element after last to general-purpose register
CLASTA Wdn, Pg, Wdn, Zm.S (SVE) (SME)
float32_t svclasta[_n_f32](svbool_t pg, float32_t fallback, svfloat32_t data)
int32_t svclasta[_n_s32](svbool_t pg, int32_t fallback, svint32_t data)
uint32_t svclasta[_n_u32](svbool_t pg, uint32_t fallback, svuint32_t data)
128-bit, 256-bit, 512-bit, 1024-bit and 2048-bit SVE
The behaviour is the same at every vector length; only the number of 32-bit elements changes.
Find the last (leftmost) 32-bit element of (2) whose corresponding predicate bit in (1) is non-zero, then set (4) to the element after it. If that element is the final element of the vector, wrap around and set (4) to the first (rightmost) element of (2). If no corresponding predicate bit is non-zero, set (4) to the low 32 bits of (3), with the high bits zeroed.
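The selection rule described above can be sketched as a portable scalar model. This is only an illustration, not the architectural pseudocode: the function name `clasta_w_model` is hypothetical, the predicate is assumed to be one `bool` per 32-bit element, and the element count `n` stands in for the vector length (4 for 128-bit SVE, 64 for 2048-bit SVE).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of CLASTA Wdn, Pg, Wdn, Zm.S.
 * pg[i] is the predicate bit for 32-bit element i, where index 0 is the
 * first (rightmost) element. n is the number of 32-bit elements in the
 * vector. The uint32_t return value models the W register result, whose
 * upper bits are zero. */
static uint32_t clasta_w_model(const bool *pg, uint32_t fallback,
                               const uint32_t *data, size_t n)
{
    size_t last = n; /* n is used as "no active element found" */

    /* Locate the last (highest-numbered) active element. */
    for (size_t i = 0; i < n; i++) {
        if (pg[i]) {
            last = i;
        }
    }

    /* No active elements: keep the scalar fallback value. */
    if (last == n) {
        return fallback;
    }

    /* Take the element after the last active one, wrapping to
     * element 0 when the last active element is the final one. */
    return data[(last + 1) % n];
}
```

With four elements {10, 20, 30, 40} and only the first two predicate bits set, this model returns 30. With only the final predicate bit set it wraps around and returns 10, and with no predicate bits set it returns the fallback unchanged.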
Inspired by and based on the x86/x64 SIMD Instruction List by Daytime.