SVE Instruction List by Dougall Johnson
CLASTA (vectors): Conditionally extract element after last to vector register
CLASTA Zdn.S, Pg, Zdn.S, Zm.S (SVE) (SME)
svfloat32_t svclasta[_f32](svbool_t pg, svfloat32_t fallback, svfloat32_t data)
svint32_t svclasta[_s32](svbool_t pg, svint32_t fallback, svint32_t data)
svuint32_t svclasta[_u32](svbool_t pg, svuint32_t fallback, svuint32_t data)
128-bit, 256-bit, 512-bit, 1024-bit and 2048-bit SVE
Find the last (leftmost) 32-bit element from (2) where the corresponding predicate bit in (1) is non-zero, then broadcast the next element to all 32-bit lanes of (4). If the predicate bit for the final (leftmost) element of the vector is non-zero, there is no next element, so broadcast the first (rightmost) element from (2) to all lanes of (4) instead. If all corresponding predicate bits are zero, preserve the value from (3). The behaviour is the same at every vector length; only the number of 32-bit lanes changes.
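The behaviour described above can be sketched as a scalar reference model in portable C. This is an illustration, not the architectural pseudocode: the function name `clasta_ref_u32` and the use of a `bool` array for the predicate are assumptions made for the example, with the parameters named after the `pg`, `fallback` and `data` arguments of the intrinsics. Lane 0 is the first (rightmost) element and lane `n - 1` is the last (leftmost).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of CLASTA (vectors) on n 32-bit lanes.
 * pg[i] stands in for the predicate bit governing lane i.
 * result may alias fallback, mirroring the destructive Zdn operand. */
static void clasta_ref_u32(const bool *pg, const uint32_t *fallback,
                           const uint32_t *data, uint32_t *result, size_t n)
{
    /* Locate the last (highest-numbered) active lane, if any. */
    ptrdiff_t last = -1;
    for (size_t i = 0; i < n; i++) {
        if (pg[i]) {
            last = (ptrdiff_t)i;
        }
    }

    /* No active lanes: the fallback vector is preserved unchanged. */
    if (last < 0) {
        for (size_t i = 0; i < n; i++) {
            result[i] = fallback[i];
        }
        return;
    }

    /* Take the lane after the last active one, wrapping to lane 0
     * when the last active lane is the final lane of the vector. */
    size_t next = ((size_t)last + 1) % n;
    uint32_t value = data[next];

    /* Broadcast that element to every lane of the result. */
    for (size_t i = 0; i < n; i++) {
        result[i] = value;
    }
}
```

With four lanes, `pg = {1, 1, 0, 0}` and `data = {10, 20, 30, 40}`, the last active lane is lane 1, so every lane of the result becomes 30. On an SVE target, `svclasta_u32(pg, fallback, data)` would be expected to give the same lane values for a matching predicate.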
Inspired by and based on the x86/x64 SIMD Instruction List by Daytime.