NEON Intrinsics Reference

int8x8_t vadd_s8 (int8x8_t a, int8x8_t b)

Add

Description

Add (vector). This instruction adds corresponding elements in the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

ADD Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64
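
As a rough illustration of the lane-wise addition described for `vadd_s8`, the sketch below pairs the intrinsic with a scalar model. The helper names `add_s8x8_model`, `add_s8x8` and `add_s8x8_demo` are invented for this example, and the scalar fallback for builds without NEON is an assumption added so the snippet compiles anywhere. Each 8-bit lane wraps modulo 2^8, so 127 + 1 gives -128.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Scalar model (hypothetical helper): add each 8-bit lane modulo 2^8.
   The cast back to int8_t assumes two's-complement conversion. */
static void add_s8x8_model(const int8_t a[8], const int8_t b[8], int8_t out[8]) {
    for (int e = 0; e < 8; ++e) {
        out[e] = (int8_t)(uint8_t)((uint8_t)a[e] + (uint8_t)b[e]);
    }
}

/* Uses vadd_s8 when NEON is available, otherwise the scalar model. */
static void add_s8x8(const int8_t a[8], const int8_t b[8], int8_t out[8]) {
#if defined(__ARM_NEON)
    vst1_s8(out, vadd_s8(vld1_s8(a), vld1_s8(b)));
#else
    add_s8x8_model(a, b, out);
#endif
}

/* Small self-check: lanes 3 and 4 wrap around. */
static void add_s8x8_demo(void) {
    const int8_t a[8] = {1, 2, 3, 127, -128, -1, 0, 50};
    const int8_t b[8] = {1, 2, 3, 1, -1, 1, 0, 50};
    int8_t got[8], want[8];
    add_s8x8(a, b, got);
    add_s8x8_model(a, b, want);
    for (int e = 0; e < 8; ++e) {
        assert(got[e] == want[e]);
    }
    assert(got[3] == -128 && got[4] == 127);
}
```

Calling `add_s8x8_demo()` from a test harness checks the intrinsic path against the model on NEON targets and the model against itself elsewhere.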

int8x16_t vaddq_s8 (int8x16_t a, int8x16_t b)

Add

Description

Add (vector). This instruction adds corresponding elements in the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

ADD Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vadd_s16 (int16x4_t a, int16x4_t b)

Add

Description

Add (vector). This instruction adds corresponding elements in the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

ADD Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vaddq_s16 (int16x8_t a, int16x8_t b)

Add

Description

Add (vector). This instruction adds corresponding elements in the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

ADD Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vadd_s32 (int32x2_t a, int32x2_t b)

Add

Description

Add (vector). This instruction adds corresponding elements in the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

ADD Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vaddq_s32 (int32x4_t a, int32x4_t b)

Add

Description

Add (vector). This instruction adds corresponding elements in the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

ADD Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64
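
One common way to use `vaddq_s32` is to add two arrays four lanes at a time and finish the remainder with scalar code. The following is only a sketch of that pattern: `add_arrays_s32` and `add_arrays_s32_demo` are hypothetical names, and the scalar tail doubles as an assumed fallback for builds without NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Adds two int32_t arrays element by element with wrap-around.
   add_arrays_s32 is a hypothetical helper, not part of any NEON API. */
static void add_arrays_s32(const int32_t *a, const int32_t *b,
                           int32_t *out, size_t n) {
    size_t i = 0;
#if defined(__ARM_NEON)
    /* Main loop: four 32-bit lanes per vaddq_s32 call. */
    for (; i + 4 <= n; i += 4) {
        vst1q_s32(out + i, vaddq_s32(vld1q_s32(a + i), vld1q_s32(b + i)));
    }
#endif
    /* Scalar tail (and full fallback without NEON). Unsigned arithmetic
       avoids signed-overflow undefined behaviour and matches the lane wrap. */
    for (; i < n; ++i) {
        out[i] = (int32_t)((uint32_t)a[i] + (uint32_t)b[i]);
    }
}

static void add_arrays_s32_demo(void) {
    const int32_t a[6] = {1, -2, INT32_MAX, 100, 7, INT32_MIN};
    const int32_t b[6] = {10, 2, 1, -100, 8, -1};
    int32_t out[6];
    add_arrays_s32(a, b, out, 6);
    assert(out[0] == 11 && out[1] == 0 && out[3] == 0 && out[4] == 15);
    assert(out[2] == INT32_MIN); /* wrapped in the vector loop or fallback */
    assert(out[5] == INT32_MAX); /* wrapped in the scalar tail */
}
```

With six elements, the first four go through the vector loop on NEON targets and the last two through the scalar tail.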

int64x1_t vadd_s64 (int64x1_t a, int64x1_t b)

Add

Description

Add (vector). This instruction adds corresponding elements in the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

ADD Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vaddq_s64 (int64x2_t a, int64x2_t b)

Add

Description

Add (vector). This instruction adds corresponding elements in the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

ADD Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vadd_u8 (uint8x8_t a, uint8x8_t b)

Add

Description

Add (vector). This instruction adds corresponding elements in the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

ADD Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vaddq_u8 (uint8x16_t a, uint8x16_t b)

Add

Description

Add (vector). This instruction adds corresponding elements in the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

ADD Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vadd_u16 (uint16x4_t a, uint16x4_t b)

Add

Description

Add (vector). This instruction adds corresponding elements in the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

ADD Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vaddq_u16 (uint16x8_t a, uint16x8_t b)

Add

Description

Add (vector). This instruction adds corresponding elements in the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

ADD Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vadd_u32 (uint32x2_t a, uint32x2_t b)

Add

Description

Add (vector). This instruction adds corresponding elements in the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

ADD Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vaddq_u32 (uint32x4_t a, uint32x4_t b)

Add

Description

Add (vector). This instruction adds corresponding elements in the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

ADD Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vadd_u64 (uint64x1_t a, uint64x1_t b)

Add

Description

Add (vector). This instruction adds corresponding elements in the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

ADD Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vaddq_u64 (uint64x2_t a, uint64x2_t b)

Add

Description

Add (vector). This instruction adds corresponding elements in the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

ADD Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vadd_f32 (float32x2_t a, float32x2_t b)

Floating-point add

Description

Floating-point Add (vector). This instruction adds corresponding vector elements in the two source SIMD&FP registers, writes the result into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are floating-point values.

A64 Instruction

FADD Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPAdd(element1, element2, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64
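
To make the floating-point case concrete, here is a minimal sketch around `vadd_f32`. The wrapper `add_f32x2` and its demo are hypothetical names, and the plain C fallback is an assumption for builds without NEON. It assumes the default round-to-nearest mode, and the inputs are chosen so every sum is exactly representable.

```c
#include <assert.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Adds two pairs of floats lane by lane (hypothetical wrapper). */
static void add_f32x2(const float a[2], const float b[2], float out[2]) {
#if defined(__ARM_NEON)
    vst1_f32(out, vadd_f32(vld1_f32(a), vld1_f32(b)));
#else
    for (int e = 0; e < 2; ++e) {
        out[e] = a[e] + b[e];
    }
#endif
}

static void add_f32x2_demo(void) {
    const float a[2] = {1.5f, -0.5f};
    const float b[2] = {2.25f, 0.5f};
    float out[2];
    add_f32x2(a, b, out);
    /* Both sums are exactly representable, so exact comparison is safe here. */
    assert(out[0] == 3.75f);
    assert(out[1] == 0.0f);
}
```

For inputs whose sum is not exactly representable, comparisons in real code would normally use a tolerance instead of `==`.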

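float32x4_t vaddq_f32 (float32x4_t a, float32x4_t b)

Floating-point add

Description

Floating-point Add (vector). This instruction adds corresponding vector elements in the two source SIMD&FP registers, writes the result into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are floating-point values.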

A64 Instruction

FADD Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPAdd(element1, element2, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64

float64x1_t vadd_f64 (float64x1_t a, float64x1_t b)

Floating-point add

Description

A64 Instruction

FADD Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPAdd(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

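float64x2_t vaddq_f64 (float64x2_t a, float64x2_t b)

Floating-point add

Description

Floating-point Add (vector). This instruction adds corresponding vector elements in the two source SIMD&FP registers, writes the result into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are floating-point values.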

A64 Instruction

FADD Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPAdd(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

int64_t vaddd_s64 (int64_t a, int64_t b)

Add

Description

Add (vector). This instruction adds corresponding elements in the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

ADD Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

A64
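
`vaddd_s64` works on plain 64-bit scalars instead of vector types and is listed for A64 only. The sketch below guards the intrinsic accordingly; `add_d_s64` and its demo are hypothetical names, and the unsigned-arithmetic fallback for other targets is an assumption of this example. The demo deliberately uses inputs whose sum fits in 64 bits.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Adds two 64-bit scalars (hypothetical wrapper). */
static int64_t add_d_s64(int64_t a, int64_t b) {
#if defined(__aarch64__) && defined(__ARM_NEON)
    return vaddd_s64(a, b);
#else
    /* Fallback: unsigned arithmetic keeps the addition well defined in C. */
    return (int64_t)((uint64_t)a + (uint64_t)b);
#endif
}

static void add_d_s64_demo(void) {
    assert(add_d_s64(40, 2) == 42);
    assert(add_d_s64(-5, 5) == 0);
    assert(add_d_s64(INT64_C(1) << 40, INT64_C(1) << 40) == (INT64_C(1) << 41));
}
```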

uint64_t vaddd_u64 (uint64_t a, uint64_t b)

Add

Description

Add (vector). This instruction adds corresponding elements in the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

ADD Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

A64

int16x8_t vaddl_s8 (int8x8_t a, int8x8_t b)

Signed add long

Description

Signed Add Long (vector). This instruction adds each vector element in the lower or upper half of the first source SIMD&FP register to the corresponding vector element of the second source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements. All the values in this instruction are signed integer values.

A64 Instruction

SADDL Vd.8H,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
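
Because the result lanes of `vaddl_s8` are twice as wide as the source lanes, sums that would overflow an 8-bit lane are kept in full. The sketch below shows this with a hypothetical wrapper `addl_s8`; the scalar fallback is an assumption for builds without NEON.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Widening add: eight int8_t lanes in, eight int16_t lanes out
   (hypothetical wrapper). */
static void addl_s8(const int8_t a[8], const int8_t b[8], int16_t out[8]) {
#if defined(__ARM_NEON)
    vst1q_s16(out, vaddl_s8(vld1_s8(a), vld1_s8(b)));
#else
    for (int e = 0; e < 8; ++e) {
        out[e] = (int16_t)((int16_t)a[e] + (int16_t)b[e]);
    }
#endif
}

static void addl_s8_demo(void) {
    const int8_t a[8] = {127, -128, 100, -100, 1, 0, -1, 64};
    const int8_t b[8] = {127, -128, 100, -100, 1, 0, -1, 64};
    int16_t out[8];
    addl_s8(a, b, out);
    assert(out[0] == 254);   /* would wrap in an 8-bit lane */
    assert(out[1] == -256);
    assert(out[2] == 200 && out[3] == -200);
    assert(out[7] == 128);
}
```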

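int32x4_t vaddl_s16 (int16x4_t a, int16x4_t b)

Signed add long

Description

Signed Add Long (vector). This instruction adds each vector element in the lower or upper half of the first source SIMD&FP register to the corresponding vector element of the second source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements. All the values in this instruction are signed integer values.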

A64 Instruction

SADDL Vd.4S,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

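int64x2_t vaddl_s32 (int32x2_t a, int32x2_t b)

Signed add long

Description

Signed Add Long (vector). This instruction adds each vector element in the lower or upper half of the first source SIMD&FP register to the corresponding vector element of the second source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements. All the values in this instruction are signed integer values.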

A64 Instruction

SADDL Vd.2D,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vaddl_u8 (uint8x8_t a, uint8x8_t b)

Unsigned add long

Description

Unsigned Add Long (vector). This instruction adds each vector element in the lower or upper half of the first source SIMD&FP register to the corresponding vector element of the second source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements. All the values in this instruction are unsigned integer values.

A64 Instruction

UADDL Vd.8H,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
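
The unsigned widening add behaves the same way with zero extension: 255 + 255 becomes 510 in a 16-bit lane. This is a small sketch, with `addl_u8` and its demo as hypothetical names and the scalar branch as an assumed fallback for builds without NEON.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Widening add: eight uint8_t lanes in, eight uint16_t lanes out
   (hypothetical wrapper). */
static void addl_u8(const uint8_t a[8], const uint8_t b[8], uint16_t out[8]) {
#if defined(__ARM_NEON)
    vst1q_u16(out, vaddl_u8(vld1_u8(a), vld1_u8(b)));
#else
    for (int e = 0; e < 8; ++e) {
        out[e] = (uint16_t)(a[e] + b[e]);
    }
#endif
}

static void addl_u8_demo(void) {
    const uint8_t a[8] = {255, 200, 128, 0, 1, 2, 3, 4};
    const uint8_t b[8] = {255, 100, 128, 0, 1, 2, 3, 4};
    uint16_t out[8];
    addl_u8(a, b, out);
    assert(out[0] == 510);
    assert(out[1] == 300);
    assert(out[2] == 256);
    assert(out[3] == 0 && out[7] == 8);
}
```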

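uint32x4_t vaddl_u16 (uint16x4_t a, uint16x4_t b)

Unsigned add long

Description

Unsigned Add Long (vector). This instruction adds each vector element in the lower or upper half of the first source SIMD&FP register to the corresponding vector element of the second source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements. All the values in this instruction are unsigned integer values.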

A64 Instruction

UADDL Vd.4S,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

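uint64x2_t vaddl_u32 (uint32x2_t a, uint32x2_t b)

Unsigned add long

Description

Unsigned Add Long (vector). This instruction adds each vector element in the lower or upper half of the first source SIMD&FP register to the corresponding vector element of the second source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements. All the values in this instruction are unsigned integer values.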

A64 Instruction

UADDL Vd.2D,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

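int16x8_t vaddl_high_s8 (int8x16_t a, int8x16_t b)

Signed add long

Description

Signed Add Long (vector). This instruction adds each vector element in the lower or upper half of the first source SIMD&FP register to the corresponding vector element of the second source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements. All the values in this instruction are signed integer values. SADDL2 takes its source elements from the upper half of each source register.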

A64 Instruction

SADDL2 Vd.8H,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64
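
`vaddl_high_s8` reads lanes 8 to 15 of each 16-lane input. Since it is listed for A64 only, the sketch below also shows a way to get the same result on ARMv7 NEON by combining `vget_high_s8` with `vaddl_s8`. The wrapper name `addl_high_s8` and the scalar fallback are assumptions of this example.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Widening add of the upper eight lanes of two 16-lane inputs
   (hypothetical wrapper). */
static void addl_high_s8(const int8_t a[16], const int8_t b[16], int16_t out[8]) {
#if defined(__ARM_NEON) && defined(__aarch64__)
    vst1q_s16(out, vaddl_high_s8(vld1q_s8(a), vld1q_s8(b)));
#elif defined(__ARM_NEON)
    /* ARMv7: extract the upper halves explicitly, then widen and add. */
    vst1q_s16(out, vaddl_s8(vget_high_s8(vld1q_s8(a)),
                            vget_high_s8(vld1q_s8(b))));
#else
    for (int e = 0; e < 8; ++e) {
        out[e] = (int16_t)(a[8 + e] + b[8 + e]);
    }
#endif
}

static void addl_high_s8_demo(void) {
    int8_t a[16], b[16];
    int16_t out[8];
    for (int i = 0; i < 16; ++i) {
        a[i] = (int8_t)(i * 8); /* 0, 8, ..., 120 */
        b[i] = 100;
    }
    addl_high_s8(a, b, out);
    assert(out[0] == 164); /* a[8] + b[8] = 64 + 100 */
    assert(out[7] == 220); /* a[15] + b[15] = 120 + 100 */
}
```

The lower eight lanes of both inputs are ignored, which is why the demo only checks sums built from indices 8 and above.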

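int32x4_t vaddl_high_s16 (int16x8_t a, int16x8_t b)

Signed add long

Description

Signed Add Long (vector). This instruction adds each vector element in the lower or upper half of the first source SIMD&FP register to the corresponding vector element of the second source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements. All the values in this instruction are signed integer values. SADDL2 takes its source elements from the upper half of each source register.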

A64 Instruction

SADDL2 Vd.4S,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

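int64x2_t vaddl_high_s32 (int32x4_t a, int32x4_t b)

Signed add long

Description

Signed Add Long (vector). This instruction adds each vector element in the lower or upper half of the first source SIMD&FP register to the corresponding vector element of the second source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements. All the values in this instruction are signed integer values. SADDL2 takes its source elements from the upper half of each source register.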

A64 Instruction

SADDL2 Vd.2D,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

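uint16x8_t vaddl_high_u8 (uint8x16_t a, uint8x16_t b)

Unsigned add long

Description

Unsigned Add Long (vector). This instruction adds each vector element in the lower or upper half of the first source SIMD&FP register to the corresponding vector element of the second source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements. All the values in this instruction are unsigned integer values. UADDL2 takes its source elements from the upper half of each source register.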

A64 Instruction

UADDL2 Vd.8H,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

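uint32x4_t vaddl_high_u16 (uint16x8_t a, uint16x8_t b)

Unsigned add long

Description

Unsigned Add Long (vector). This instruction adds each vector element in the lower or upper half of the first source SIMD&FP register to the corresponding vector element of the second source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements. All the values in this instruction are unsigned integer values. UADDL2 takes its source elements from the upper half of each source register.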

A64 Instruction

UADDL2 Vd.4S,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

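uint64x2_t vaddl_high_u32 (uint32x4_t a, uint32x4_t b)

Unsigned add long

Description

Unsigned Add Long (vector). This instruction adds each vector element in the lower or upper half of the first source SIMD&FP register to the corresponding vector element of the second source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements. All the values in this instruction are unsigned integer values. UADDL2 takes its source elements from the upper half of each source register.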

A64 Instruction

UADDL2 Vd.2D,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int16x8_t vaddw_s8 (int16x8_t a, int8x8_t b)

Signed add wide

Description

Signed Add Wide. This instruction adds vector elements of the first source SIMD&FP register to the corresponding vector elements in the lower or upper half of the second source SIMD&FP register, places the results in a vector, and writes the vector to the SIMD&FP destination register.

A64 Instruction

SADDW Vd.8H,Vn.8H,Vm.8B

Argument Preparation

a → Vn.8H 

b → Vm.8B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
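
With `vaddw_s8` only the second operand is widened: each `int8_t` lane of `b` is sign-extended to 16 bits and added to the matching lane of `a`, and the 16-bit result wraps modulo 2^16. The sketch below models that; `addw_s8` and its demo are hypothetical names, and the scalar branch is an assumed fallback for builds without NEON.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Wide add: int16_t lanes plus sign-extended int8_t lanes
   (hypothetical wrapper). */
static void addw_s8(const int16_t a[8], const int8_t b[8], int16_t out[8]) {
#if defined(__ARM_NEON)
    vst1q_s16(out, vaddw_s8(vld1q_s16(a), vld1_s8(b)));
#else
    for (int e = 0; e < 8; ++e) {
        /* Sign-extend b, add modulo 2^16, then reinterpret as signed
           (assumes two's-complement conversion). */
        out[e] = (int16_t)(uint16_t)((uint16_t)a[e] + (uint16_t)(int16_t)b[e]);
    }
#endif
}

static void addw_s8_demo(void) {
    const int16_t a[8] = {1000, -1000, 32767, 0, 5, 6, 7, 8};
    const int8_t  b[8] = {-128, 127, 1, -1, 0, 0, 0, 0};
    int16_t out[8];
    addw_s8(a, b, out);
    assert(out[0] == 872);
    assert(out[1] == -873);
    assert(out[2] == -32768); /* 16-bit wrap */
    assert(out[3] == -1);
    assert(out[7] == 8);
}
```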

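int32x4_t vaddw_s16 (int32x4_t a, int16x4_t b)

Signed add wide

Description

Signed Add Wide. This instruction adds vector elements of the first source SIMD&FP register to the corresponding vector elements in the lower or upper half of the second source SIMD&FP register, places the results in a vector, and writes the vector to the SIMD&FP destination register.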

A64 Instruction

SADDW Vd.4S,Vn.4S,Vm.4H

Argument Preparation

a → Vn.4S 

b → Vm.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

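int64x2_t vaddw_s32 (int64x2_t a, int32x2_t b)

Signed add wide

Description

Signed Add Wide. This instruction adds vector elements of the first source SIMD&FP register to the corresponding vector elements in the lower or upper half of the second source SIMD&FP register, places the results in a vector, and writes the vector to the SIMD&FP destination register.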

A64 Instruction

SADDW Vd.2D,Vn.2D,Vm.2S

Argument Preparation

a → Vn.2D 

b → Vm.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vaddw_u8 (uint16x8_t a, uint8x8_t b)

Unsigned add wide

Description

Unsigned Add Wide. This instruction adds the vector elements of the first source SIMD&FP register to the corresponding vector elements in the lower or upper half of the second source SIMD&FP register, places the result in a vector, and writes the vector to the SIMD&FP destination register. The vector elements of the destination register and the first source register are twice as long as the vector elements of the second source register. All the values in this instruction are unsigned integer values.

A64 Instruction

UADDW Vd.8H,Vn.8H,Vm.8B

Argument Preparation

a → Vn.8H 

b → Vm.8B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
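
A typical use of `vaddw_u8` is accumulating bytes into wider lanes so that sums can exceed 255. The following sketch sums rows of eight bytes into eight 16-bit column totals. `column_sums_u8` and its demo are hypothetical names, the row layout is an assumption of this example, and the scalar branch is a fallback for builds without NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Adds n_rows rows of 8 bytes into 8 column sums, each modulo 2^16.
   column_sums_u8 is a hypothetical helper for illustration. */
static void column_sums_u8(const uint8_t *rows, size_t n_rows, uint16_t sums[8]) {
#if defined(__ARM_NEON)
    uint16x8_t acc = vdupq_n_u16(0);
    for (size_t r = 0; r < n_rows; ++r) {
        /* Zero-extend the 8 bytes of this row and add them to the accumulator. */
        acc = vaddw_u8(acc, vld1_u8(rows + 8 * r));
    }
    vst1q_u16(sums, acc);
#else
    for (int e = 0; e < 8; ++e) {
        sums[e] = 0;
    }
    for (size_t r = 0; r < n_rows; ++r) {
        for (int e = 0; e < 8; ++e) {
            sums[e] = (uint16_t)(sums[e] + rows[8 * r + e]);
        }
    }
#endif
}

static void column_sums_u8_demo(void) {
    uint8_t rows[24];
    uint16_t sums[8];
    for (int i = 0; i < 16; ++i) {
        rows[i] = 255;                 /* rows 0 and 1: all 255 */
    }
    for (int i = 0; i < 8; ++i) {
        rows[16 + i] = (uint8_t)i;     /* row 2: 0..7 */
    }
    column_sums_u8(rows, 3, sums);
    assert(sums[0] == 510);
    assert(sums[7] == 517);
}
```

Each 16-bit lane can absorb 257 rows of the maximum byte value 255 before it wraps, so longer inputs would need periodic widening of the accumulator.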

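uint32x4_t vaddw_u16 (uint32x4_t a, uint16x4_t b)

Unsigned add wide

Description

Unsigned Add Wide. This instruction adds the vector elements of the first source SIMD&FP register to the corresponding vector elements in the lower or upper half of the second source SIMD&FP register, places the result in a vector, and writes the vector to the SIMD&FP destination register. The vector elements of the destination register and the first source register are twice as long as the vector elements of the second source register. All the values in this instruction are unsigned integer values.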

A64 Instruction

UADDW Vd.4S,Vn.4S,Vm.4H

Argument Preparation

a → Vn.4S 

b → Vm.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vaddw_u32 (uint64x2_t a, uint32x2_t b)Unsigned add wide

Description

A64 Instruction

UADDW Vd.2D,Vn.2D,Vm.2S

Argument Preparation

a → Vn.2D 

b → Vm.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vaddw_high_s8 (int16x8_t a, int8x16_t b)Signed add wide

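Description

Signed Add Wide (second part). This instruction adds the vector elements of the first source SIMD&FP register to the corresponding sign-extended vector elements in the upper half of the second source SIMD&FP register, places the results in a vector, and writes the vector to the SIMD&FP destination register. The vector elements of the destination register and the first source register are twice as long as the vector elements of the second source register.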

A64 Instruction

SADDW2 Vd.8H,Vn.8H,Vm.16B

Argument Preparation

a → Vn.8H 

b → Vm.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64
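The `_high` variants differ from the plain wide adds only in which half of `b` supplies the narrow elements. A minimal scalar sketch in portable C, assuming the hypothetical helper name `model_vaddw_high_s8` and an array-based signature chosen for this example:

```c
#include <stdint.h>

/* Hypothetical scalar model of vaddw_high_s8 (not an ACLE function).
   a: eight 16-bit lanes, b: sixteen 8-bit lanes, out: eight 16-bit lanes. */
void model_vaddw_high_s8(const int16_t a[8], const int8_t b[16], int16_t out[8])
{
    for (int e = 0; e < 8; e++) {
        /* Upper half of b is lanes 8..15; converting to int32_t sign-extends. */
        int32_t sum = (int32_t)a[e] + (int32_t)b[8 + e];
        /* Keep the low 16 bits, then reinterpret them as a signed lane
           without relying on implementation-defined narrowing. */
        uint16_t bits = (uint16_t)sum;
        out[e] = (bits >= 0x8000u) ? (int16_t)((int32_t)bits - 65536)
                                   : (int16_t)bits;
    }
}
```

Lanes 0..7 of `b` are ignored by this form. The result wraps rather than saturates, so 32767 plus 1 yields -32768.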

int32x4_t vaddw_high_s16 (int32x4_t a, int16x8_t b)Signed add wide

Description

A64 Instruction

SADDW2 Vd.4S,Vn.4S,Vm.8H

Argument Preparation

a → Vn.4S 

b → Vm.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int64x2_t vaddw_high_s32 (int64x2_t a, int32x4_t b)Signed add wide

Description

A64 Instruction

SADDW2 Vd.2D,Vn.2D,Vm.4S

Argument Preparation

a → Vn.2D 

b → Vm.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint16x8_t vaddw_high_u8 (uint16x8_t a, uint8x16_t b)Unsigned add wide

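Description

Unsigned Add Wide (second part). This instruction adds the vector elements of the first source SIMD&FP register to the corresponding zero-extended vector elements in the upper half of the second source SIMD&FP register, places the results in a vector, and writes the vector to the SIMD&FP destination register. The vector elements of the destination register and the first source register are twice as long as the vector elements of the second source register. All the values in this instruction are unsigned integer values.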

A64 Instruction

UADDW2 Vd.8H,Vn.8H,Vm.16B

Argument Preparation

a → Vn.8H 

b → Vm.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint32x4_t vaddw_high_u16 (uint32x4_t a, uint16x8_t b)Unsigned add wide

Description

A64 Instruction

UADDW2 Vd.4S,Vn.4S,Vm.8H

Argument Preparation

a → Vn.4S 

b → Vm.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint64x2_t vaddw_high_u32 (uint64x2_t a, uint32x4_t b)Unsigned add wide

Description

A64 Instruction

UADDW2 Vd.2D,Vn.2D,Vm.4S

Argument Preparation

a → Vn.2D 

b → Vm.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int8x8_t vhadd_s8 (int8x8_t a, int8x8_t b)Signed halving add

Description

Signed Halving Add. This instruction adds corresponding signed integer values from the two source SIMD&FP registers, shifts each result right one bit, places the results into a vector, and writes the vector to the destination SIMD&FP register. The results are truncated: the low bit of each sum is discarded. For rounded results, see SRHADD.

A64 Instruction

SHADD Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    Elem[result, e, esize] = sum<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64
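The halving add computes the sum at full precision before shifting, so the intermediate never overflows the lane. A scalar sketch of one `vhadd_s8` lane in portable C follows; `model_vhadd_s8_lane` and `floor_half` are hypothetical names introduced for this example.

```c
#include <stdint.h>

/* Floor division by two, written without shifting a negative value. */
static int floor_half(int v)
{
    return (v >= 0) ? v / 2 : -((1 - v) / 2);
}

/* Hypothetical model of one vhadd_s8 lane (not an ACLE function). */
int8_t model_vhadd_s8_lane(int8_t x, int8_t y)
{
    int sum = (int)x + (int)y;        /* exact sum, range -256..254 */
    return (int8_t)floor_half(sum);   /* range -128..127, always fits */
}
```

Taking bits `<esize:1>` of the sum is an arithmetic shift, so odd negative sums round toward negative infinity: the lane for -1 and -2 is -2, not -1.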

int8x16_t vhaddq_s8 (int8x16_t a, int8x16_t b)Signed halving add

Description

A64 Instruction

SHADD Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    Elem[result, e, esize] = sum<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vhadd_s16 (int16x4_t a, int16x4_t b)Signed halving add

Description

A64 Instruction

SHADD Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    Elem[result, e, esize] = sum<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vhaddq_s16 (int16x8_t a, int16x8_t b)Signed halving add

Description

A64 Instruction

SHADD Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    Elem[result, e, esize] = sum<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vhadd_s32 (int32x2_t a, int32x2_t b)Signed halving add

Description

A64 Instruction

SHADD Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    Elem[result, e, esize] = sum<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vhaddq_s32 (int32x4_t a, int32x4_t b)Signed halving add

Description

A64 Instruction

SHADD Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    Elem[result, e, esize] = sum<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vhadd_u8 (uint8x8_t a, uint8x8_t b)Unsigned halving add

Description

Unsigned Halving Add. This instruction adds corresponding unsigned integer values from the two source SIMD&FP registers, shifts each result right one bit, places the results into a vector, and writes the vector to the destination SIMD&FP register. The results are truncated: the low bit of each sum is discarded. For rounded results, see URHADD.

A64 Instruction

UHADD Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    Elem[result, e, esize] = sum<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64
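For reference, one `vhadd_u8` lane can be modelled in portable C in two equivalent ways: by widening before the add, as the pseudocode does, or with a bitwise identity that stays within 8 bits. Both function names below are hypothetical and exist only for this sketch.

```c
#include <stdint.h>

/* Hypothetical model of one vhadd_u8 lane: widen, add, shift. */
uint8_t model_vhadd_u8_lane(uint8_t x, uint8_t y)
{
    unsigned sum = (unsigned)x + (unsigned)y;   /* 0..510, needs 9 bits */
    return (uint8_t)(sum >> 1);
}

/* Same result without a wider intermediate: bits set in both inputs
   count in full, bits set in only one input count for half. */
uint8_t model_vhadd_u8_lane_narrow(uint8_t x, uint8_t y)
{
    return (uint8_t)((x & y) + ((x ^ y) >> 1));
}
```

The second form is the usual fallback when no halving-add instruction is available; with the intrinsic, the hardware keeps the extra carry bit internally.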

uint8x16_t vhaddq_u8 (uint8x16_t a, uint8x16_t b)Unsigned halving add

Description

A64 Instruction

UHADD Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    Elem[result, e, esize] = sum<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vhadd_u16 (uint16x4_t a, uint16x4_t b)Unsigned halving add

Description

A64 Instruction

UHADD Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    Elem[result, e, esize] = sum<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vhaddq_u16 (uint16x8_t a, uint16x8_t b)Unsigned halving add

Description

A64 Instruction

UHADD Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    Elem[result, e, esize] = sum<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vhadd_u32 (uint32x2_t a, uint32x2_t b)Unsigned halving add

Description

A64 Instruction

UHADD Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    Elem[result, e, esize] = sum<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vhaddq_u32 (uint32x4_t a, uint32x4_t b)Unsigned halving add

Description

A64 Instruction

UHADD Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    Elem[result, e, esize] = sum<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

int8x8_t vrhadd_s8 (int8x8_t a, int8x8_t b)Signed rounding halving add

Description

Signed Rounding Halving Add. This instruction adds corresponding signed integer values from the two source SIMD&FP registers, shifts each result right one bit, places the results into a vector, and writes the vector to the destination SIMD&FP register. The results are rounded: 1 is added to each sum before the shift. For truncated results, see SHADD.

A64 Instruction

SRHADD Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, esize] = (element1+element2+1)<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64
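The only difference from the plain halving add is the `+1` in the pseudocode before the shift. A scalar sketch of one `vrhadd_s8` lane in portable C, with the hypothetical names `model_vrhadd_s8_lane` and `floor_half` assumed for this example:

```c
#include <stdint.h>

/* Floor division by two, written without shifting a negative value. */
static int floor_half(int v)
{
    return (v >= 0) ? v / 2 : -((1 - v) / 2);
}

/* Hypothetical model of one vrhadd_s8 lane (not an ACLE function). */
int8_t model_vrhadd_s8_lane(int8_t x, int8_t y)
{
    int sum = (int)x + (int)y + 1;    /* rounding bias, range -255..255 */
    return (int8_t)floor_half(sum);   /* range -128..127, always fits */
}
```

With the bias applied, exact halves round toward positive infinity: 1 and 2 give 2, while -1 and -2 give -1.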

int8x16_t vrhaddq_s8 (int8x16_t a, int8x16_t b)Signed rounding halving add

Description

A64 Instruction

SRHADD Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, esize] = (element1+element2+1)<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vrhadd_s16 (int16x4_t a, int16x4_t b)Signed rounding halving add

Description

A64 Instruction

SRHADD Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, esize] = (element1+element2+1)<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vrhaddq_s16 (int16x8_t a, int16x8_t b)Signed rounding halving add

Description

A64 Instruction

SRHADD Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, esize] = (element1+element2+1)<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vrhadd_s32 (int32x2_t a, int32x2_t b)Signed rounding halving add

Description

A64 Instruction

SRHADD Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, esize] = (element1+element2+1)<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vrhaddq_s32 (int32x4_t a, int32x4_t b)Signed rounding halving add

Description

A64 Instruction

SRHADD Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, esize] = (element1+element2+1)<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vrhadd_u8 (uint8x8_t a, uint8x8_t b)Unsigned rounding halving add

Description

Unsigned Rounding Halving Add. This instruction adds corresponding unsigned integer values from the two source SIMD&FP registers, shifts each result right one bit, places the results into a vector, and writes the vector to the destination SIMD&FP register. The results are rounded: 1 is added to each sum before the shift. For truncated results, see UHADD.

A64 Instruction

URHADD Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, esize] = (element1+element2+1)<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64
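A whole-vector view of the same operation may help when checking results against a scalar fallback. The sketch below models `vrhadd_u8` over arrays in portable C; the name `model_vrhadd_u8` and the explicit length parameter `n` are assumptions of this example (the intrinsic itself always processes eight lanes).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of vrhadd_u8 applied to n lanes
   (not an ACLE function). */
void model_vrhadd_u8(const uint8_t *a, const uint8_t *b, uint8_t *out, size_t n)
{
    for (size_t e = 0; e < n; e++) {
        /* (element1 + element2 + 1)<esize:1>: the sum needs 9 bits,
           so it is formed in a wider type before the shift. */
        unsigned sum = (unsigned)a[e] + (unsigned)b[e] + 1u;
        out[e] = (uint8_t)(sum >> 1);   /* at most 255 */
    }
}
```

Each output lane is the average of the two inputs rounded up on ties, so 255 and 0 average to 128 rather than 127.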

uint8x16_t vrhaddq_u8 (uint8x16_t a, uint8x16_t b)Unsigned rounding halving add

Description

A64 Instruction

URHADD Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, esize] = (element1+element2+1)<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vrhadd_u16 (uint16x4_t a, uint16x4_t b)Unsigned rounding halving add

Description

A64 Instruction

URHADD Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, esize] = (element1+element2+1)<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vrhaddq_u16 (uint16x8_t a, uint16x8_t b)Unsigned rounding halving add

Description

A64 Instruction

URHADD Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, esize] = (element1+element2+1)<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vrhadd_u32 (uint32x2_t a, uint32x2_t b)Unsigned rounding halving add

Description

A64 Instruction

URHADD Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, esize] = (element1+element2+1)<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vrhaddq_u32 (uint32x4_t a, uint32x4_t b)Unsigned rounding halving add

Description

A64 Instruction

URHADD Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, esize] = (element1+element2+1)<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

int8x8_t vqadd_s8 (int8x8_t a, int8x8_t b)Signed saturating add

Description

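Signed saturating Add. This instruction adds the values of corresponding elements of the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.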

A64 Instruction

SQADD Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
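The `SatQ` step in the pseudocode clamps the exact sum to the lane's range and reports whether clamping happened. One `vqadd_s8` lane can be sketched in portable C as follows; `model_vqadd_s8_lane` is a hypothetical name, and the `bool *qc` parameter stands in for the sticky FPSR.QC bit, which C code using the real intrinsic does not receive as a return value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one vqadd_s8 lane (not an ACLE function).
   *qc is only ever set, never cleared, like the cumulative FPSR.QC bit. */
int8_t model_vqadd_s8_lane(int8_t x, int8_t y, bool *qc)
{
    int sum = (int)x + (int)y;               /* exact sum, -256..254 */
    if (sum > INT8_MAX) { *qc = true; return INT8_MAX; }
    if (sum < INT8_MIN) { *qc = true; return INT8_MIN; }
    return (int8_t)sum;
}
```

Because the flag is cumulative, a later non-saturating add leaves it set; callers who care about it have to clear it themselves before the sequence of interest.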

int8x16_t vqaddq_s8 (int8x16_t a, int8x16_t b)Signed saturating add

Description

A64 Instruction

SQADD Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vqadd_s16 (int16x4_t a, int16x4_t b)Signed saturating add

Description

A64 Instruction

SQADD Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vqaddq_s16 (int16x8_t a, int16x8_t b)Signed saturating add

Description

A64 Instruction

SQADD Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vqadd_s32 (int32x2_t a, int32x2_t b)Signed saturating add

Description

A64 Instruction

SQADD Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vqaddq_s32 (int32x4_t a, int32x4_t b)Signed saturating add

Description

A64 Instruction

SQADD Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vqadd_s64 (int64x1_t a, int64x1_t b)Signed saturating add

Description

A64 Instruction

SQADD Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
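For 64-bit lanes there is no wider standard integer type in which to form the exact sum, so a scalar model has to detect overflow before adding. A minimal sketch of one `vqadd_s64` lane in portable C, assuming the hypothetical name `model_vqadd_s64_lane` and a `bool *qc` parameter standing in for FPSR.QC:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one vqadd_s64 lane (not an ACLE function). */
int64_t model_vqadd_s64_lane(int64_t x, int64_t y, bool *qc)
{
    /* Compare against the remaining headroom instead of adding first,
       because signed overflow is undefined behaviour in C. */
    if (y > 0 && x > INT64_MAX - y) { *qc = true; return INT64_MAX; }
    if (y < 0 && x < INT64_MIN - y) { *qc = true; return INT64_MIN; }
    return x + y;
}
```

The pseudocode sidesteps this by using unbounded `integer` values; the checks above are just one way to reproduce the same result in fixed-width C.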

int64x2_t vqaddq_s64 (int64x2_t a, int64x2_t b)Signed saturating add

Description

A64 Instruction

SQADD Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vqadd_u8 (uint8x8_t a, uint8x8_t b)Unsigned saturating add

Description

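Unsigned saturating Add. This instruction adds the values of corresponding elements of the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.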

A64 Instruction

UQADD Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
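The unsigned case clamps only at the top of the range, since a sum of two unsigned values cannot go below zero. A scalar sketch of one `vqadd_u8` lane in portable C, with the hypothetical name `model_vqadd_u8_lane` and a `bool *qc` parameter standing in for FPSR.QC:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one vqadd_u8 lane (not an ACLE function). */
uint8_t model_vqadd_u8_lane(uint8_t x, uint8_t y, bool *qc)
{
    unsigned sum = (unsigned)x + (unsigned)y;   /* exact sum, 0..510 */
    if (sum > UINT8_MAX) { *qc = true; return UINT8_MAX; }
    return (uint8_t)sum;
}
```

A sum that lands exactly on 255 is representable and does not set the flag; only sums of 256 or more are clamped.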

uint8x16_t vqaddq_u8 (uint8x16_t a, uint8x16_t b)Unsigned saturating add

Description

A64 Instruction

UQADD Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vqadd_u16 (uint16x4_t a, uint16x4_t b)Unsigned saturating add

Description

A64 Instruction

UQADD Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vqaddq_u16 (uint16x8_t a, uint16x8_t b)Unsigned saturating add

Description

A64 Instruction

UQADD Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vqadd_u32 (uint32x2_t a, uint32x2_t b)Unsigned saturating add

Description

A64 Instruction

UQADD Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vqaddq_u32 (uint32x4_t a, uint32x4_t b)Unsigned saturating add

Description

A64 Instruction

UQADD Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vqadd_u64 (uint64x1_t a, uint64x1_t b)Unsigned saturating add

Description

A64 Instruction

UQADD Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
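
For 64-bit lanes there is no wider standard integer type to hold the exact sum, so a portable model has to detect overflow differently. The sketch below assumes the usual wraparound test for unsigned addition; `model_uqadd_u64` and `qc` are hypothetical names, not part of the intrinsics API.

```c
#include <stdint.h>

/* Portable model of one 64-bit UQADD lane. */
uint64_t model_uqadd_u64(uint64_t a, uint64_t b, int *qc)
{
    uint64_t sum = a + b;      /* unsigned overflow wraps modulo 2^64 */
    if (sum < a) {             /* a wrapped sum is smaller than either input */
        *qc = 1;               /* stands in for FPSR.QC */
        return UINT64_MAX;
    }
    return sum;
}
```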

uint64x2_t vqaddq_u64 (uint64x2_t a, uint64x2_t b)Unsigned saturating add

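Description

Unsigned saturating Add. This instruction adds the values of corresponding elements of the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.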

A64 Instruction

UQADD Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int8_t vqaddb_s8 (int8_t a, int8_t b)Signed saturating add

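Description

Signed saturating Add. This instruction adds the values of corresponding elements of the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.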

A64 Instruction

SQADD Bd,Bn,Bm

Argument Preparation

a → Bn 

b → Bm

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64
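
In the pseudocode above, `unsigned` is a decode-time flag rather than a literal; for SQADD it is false, so the elements are read as signed and clamped to the signed range. A minimal portable sketch of that case follows; `model_sqadd_s8` and `qc` are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Portable model of the scalar 8-bit SQADD behind vqaddb_s8. */
int8_t model_sqadd_s8(int8_t a, int8_t b, int *qc)
{
    int sum = (int)a + (int)b;        /* exact: range is -256..254 */
    if (sum > INT8_MAX) {
        *qc = 1;                      /* stands in for FPSR.QC */
        return INT8_MAX;
    }
    if (sum < INT8_MIN) {
        *qc = 1;
        return INT8_MIN;
    }
    return (int8_t)sum;
}
```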

int16_t vqaddh_s16 (int16_t a, int16_t b)Signed saturating add

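Description

Signed saturating Add. This instruction adds the values of corresponding elements of the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.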

A64 Instruction

SQADD Hd,Hn,Hm

Argument Preparation

a → Hn 

b → Hm

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32_t vqadds_s32 (int32_t a, int32_t b)Signed saturating add

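Description

Signed saturating Add. This instruction adds the values of corresponding elements of the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.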

A64 Instruction

SQADD Sd,Sn,Sm

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64_t vqaddd_s64 (int64_t a, int64_t b)Signed saturating add

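Description

Signed saturating Add. This instruction adds the values of corresponding elements of the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.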

A64 Instruction

SQADD Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64
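
Signed 64-bit addition cannot be checked after the fact in portable C, because signed overflow is undefined behaviour. The sketch below therefore tests the operands before adding; `model_sqadd_s64` and `qc` are hypothetical names, and the function models the result of `vqaddd_s64` rather than calling it.

```c
#include <stdint.h>

/* Portable model of the scalar 64-bit SQADD behind vqaddd_s64. */
int64_t model_sqadd_s64(int64_t a, int64_t b, int *qc)
{
    if (b > 0 && a > INT64_MAX - b) {   /* a + b would exceed INT64_MAX */
        *qc = 1;                        /* stands in for FPSR.QC */
        return INT64_MAX;
    }
    if (b < 0 && a < INT64_MIN - b) {   /* a + b would fall below INT64_MIN */
        *qc = 1;
        return INT64_MIN;
    }
    return a + b;                       /* safe: no overflow possible here */
}
```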

uint8_t vqaddb_u8 (uint8_t a, uint8_t b)Unsigned saturating add

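Description

Unsigned saturating Add. This instruction adds the values of corresponding elements of the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.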

A64 Instruction

UQADD Bd,Bn,Bm

Argument Preparation

a → Bn 

b → Bm

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64
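
A typical use of an 8-bit unsigned saturating add is brightening pixel data without wraparound. The following sketch models that in portable C; `model_uqadd_u8` and `brighten` are hypothetical helpers written for this illustration, and the cumulative saturation flag is left out for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable model of the scalar 8-bit UQADD behind vqaddb_u8. */
uint8_t model_uqadd_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + (unsigned)b;   /* exact: at most 510 */
    return sum > UINT8_MAX ? (uint8_t)UINT8_MAX : (uint8_t)sum;
}

/* Add delta to every pixel, clamping at 255 instead of wrapping. */
void brighten(uint8_t *pixels, size_t n, uint8_t delta)
{
    for (size_t i = 0; i < n; i++)
        pixels[i] = model_uqadd_u8(pixels[i], delta);
}
```

With a wrapping add, a pixel of 200 brightened by 60 would become 4; with the saturating add it stays at 255.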

uint16_t vqaddh_u16 (uint16_t a, uint16_t b)Unsigned saturating add

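Description

Unsigned saturating Add. This instruction adds the values of corresponding elements of the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.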

A64 Instruction

UQADD Hd,Hn,Hm

Argument Preparation

a → Hn 

b → Hm

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

uint32_t vqadds_u32 (uint32_t a, uint32_t b)Unsigned saturating add

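Description

Unsigned saturating Add. This instruction adds the values of corresponding elements of the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.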

A64 Instruction

UQADD Sd,Sn,Sm

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

uint64_t vqaddd_u64 (uint64_t a, uint64_t b)Unsigned saturating add

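Description

Unsigned saturating Add. This instruction adds the values of corresponding elements of the two source SIMD&FP registers, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.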

A64 Instruction

UQADD Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer sum;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    sum = element1 + element2;
    (Elem[result, e, esize], sat) = SatQ(sum, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int8x8_t vuqadd_s8 (int8x8_t a, uint8x8_t b)Signed saturating accumulate of unsigned value

Description

Signed saturating Accumulate of Unsigned value. This instruction adds the unsigned integer values of the vector elements in the source SIMD&FP register to corresponding signed integer values of the vector elements in the destination SIMD&FP register, and writes the resulting signed integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.

A64 Instruction

SUQADD Vd.8B,Vn.8B

Argument Preparation

a → Vd.8B 

b → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64
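
SUQADD mixes signedness: the accumulator `a` (in Vd) is signed, the addend `b` (in Vn) is unsigned, and the result is clamped to the signed range. A minimal portable sketch of one 8-bit lane follows; `model_suqadd_s8` and `qc` are hypothetical names.

```c
#include <stdint.h>

/* Portable model of one 8-bit SUQADD lane: signed acc plus unsigned b. */
int8_t model_suqadd_s8(int8_t acc, uint8_t b, int *qc)
{
    int sum = (int)acc + (int)b;    /* exact: range is -128..382 */
    if (sum > INT8_MAX) {
        *qc = 1;                    /* stands in for FPSR.QC */
        return INT8_MAX;
    }
    /* b is never negative, so the sum cannot fall below INT8_MIN. */
    return (int8_t)sum;
}
```

Note that only the upper bound can be hit: -128 plus 255 is exactly 127 and does not saturate.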

int8x16_t vuqaddq_s8 (int8x16_t a, uint8x16_t b)Signed saturating accumulate of unsigned value

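Description

Signed saturating Accumulate of Unsigned value. This instruction adds the unsigned integer values of the vector elements in the source SIMD&FP register to corresponding signed integer values of the vector elements in the destination SIMD&FP register, and writes the resulting signed integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.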

A64 Instruction

SUQADD Vd.16B,Vn.16B

Argument Preparation

a → Vd.16B 

b → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64

int16x4_t vuqadd_s16 (int16x4_t a, uint16x4_t b)Signed saturating accumulate of unsigned value

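Description

Signed saturating Accumulate of Unsigned value. This instruction adds the unsigned integer values of the vector elements in the source SIMD&FP register to corresponding signed integer values of the vector elements in the destination SIMD&FP register, and writes the resulting signed integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.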

A64 Instruction

SUQADD Vd.4H,Vn.4H

Argument Preparation

a → Vd.4H 

b → Vn.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64

int16x8_t vuqaddq_s16 (int16x8_t a, uint16x8_t b)Signed saturating accumulate of unsigned value

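Description

Signed saturating Accumulate of Unsigned value. This instruction adds the unsigned integer values of the vector elements in the source SIMD&FP register to corresponding signed integer values of the vector elements in the destination SIMD&FP register, and writes the resulting signed integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.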

A64 Instruction

SUQADD Vd.8H,Vn.8H

Argument Preparation

a → Vd.8H 

b → Vn.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64

int32x2_t vuqadd_s32 (int32x2_t a, uint32x2_t b)Signed saturating accumulate of unsigned value

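Description

Signed saturating Accumulate of Unsigned value. This instruction adds the unsigned integer values of the vector elements in the source SIMD&FP register to corresponding signed integer values of the vector elements in the destination SIMD&FP register, and writes the resulting signed integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.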

A64 Instruction

SUQADD Vd.2S,Vn.2S

Argument Preparation

a → Vd.2S 

b → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64

int32x4_t vuqaddq_s32 (int32x4_t a, uint32x4_t b)Signed saturating accumulate of unsigned value

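Description

Signed saturating Accumulate of Unsigned value. This instruction adds the unsigned integer values of the vector elements in the source SIMD&FP register to corresponding signed integer values of the vector elements in the destination SIMD&FP register, and writes the resulting signed integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.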

A64 Instruction

SUQADD Vd.4S,Vn.4S

Argument Preparation

a → Vd.4S 

b → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64

int64x1_t vuqadd_s64 (int64x1_t a, uint64x1_t b)Signed saturating accumulate of unsigned value

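Description

Signed saturating Accumulate of Unsigned value. This instruction adds the unsigned integer values of the vector elements in the source SIMD&FP register to corresponding signed integer values of the vector elements in the destination SIMD&FP register, and writes the resulting signed integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.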

A64 Instruction

SUQADD Dd,Dn

Argument Preparation

a → Dd 

b → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64

int64x2_t vuqaddq_s64 (int64x2_t a, uint64x2_t b)Signed saturating accumulate of unsigned value

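Description

Signed saturating Accumulate of Unsigned value. This instruction adds the unsigned integer values of the vector elements in the source SIMD&FP register to corresponding signed integer values of the vector elements in the destination SIMD&FP register, and writes the resulting signed integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.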

A64 Instruction

SUQADD Vd.2D,Vn.2D

Argument Preparation

a → Vd.2D 

b → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64

int8_t vuqaddb_s8 (int8_t a, uint8_t b)Signed saturating accumulate of unsigned value

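Description

Signed saturating Accumulate of Unsigned value. This instruction adds the unsigned integer values of the vector elements in the source SIMD&FP register to corresponding signed integer values of the vector elements in the destination SIMD&FP register, and writes the resulting signed integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.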

A64 Instruction

SUQADD Bd,Bn

Argument Preparation

a → Bd 

b → Bn

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64

int16_t vuqaddh_s16 (int16_t a, uint16_t b)Signed saturating accumulate of unsigned value

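Description

Signed saturating Accumulate of Unsigned value. This instruction adds the unsigned integer values of the vector elements in the source SIMD&FP register to corresponding signed integer values of the vector elements in the destination SIMD&FP register, and writes the resulting signed integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.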

A64 Instruction

SUQADD Hd,Hn

Argument Preparation

a → Hd 

b → Hn

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64

int32_t vuqadds_s32 (int32_t a, uint32_t b)Signed saturating accumulate of unsigned value

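Description

Signed saturating Accumulate of Unsigned value. This instruction adds the unsigned integer values of the vector elements in the source SIMD&FP register to corresponding signed integer values of the vector elements in the destination SIMD&FP register, and writes the resulting signed integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.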

A64 Instruction

SUQADD Sd,Sn

Argument Preparation

a → Sd 

b → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64

int64_t vuqaddd_s64 (int64_t a, uint64_t b)Signed saturating accumulate of unsigned value

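Description

Signed saturating Accumulate of Unsigned value. This instruction adds the unsigned integer values of the vector elements in the source SIMD&FP register to corresponding signed integer values of the vector elements in the destination SIMD&FP register, and writes the resulting signed integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.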

A64 Instruction

SUQADD Dd,Dn

Argument Preparation

a → Dd 

b → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64
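
For the 64-bit form there is no wider integer to hold the exact sum of a signed and an unsigned operand. One way to model it portably is to measure the headroom between the accumulator and INT64_MAX in unsigned arithmetic. This is a sketch under that assumption; `model_suqadd_s64` and `qc` are hypothetical names.

```c
#include <stdint.h>

/* Portable model of the 64-bit SUQADD behind vuqaddd_s64. */
int64_t model_suqadd_s64(int64_t acc, uint64_t b, int *qc)
{
    /* INT64_MAX - acc always lies in 0..2^64-1, so modular unsigned
       subtraction yields the exact headroom. */
    uint64_t headroom = (uint64_t)INT64_MAX - (uint64_t)acc;
    if (b > headroom) {
        *qc = 1;                          /* stands in for FPSR.QC */
        return INT64_MAX;
    }
    /* The result sits 'gap' below INT64_MAX; convert without overflow. */
    uint64_t gap = headroom - b;
    if (gap <= (uint64_t)INT64_MAX)
        return INT64_MAX - (int64_t)gap;
    return -1 - (int64_t)(gap - (uint64_t)INT64_MAX - 1u);
}
```

The two-step conversion at the end avoids casting an out-of-range unsigned value to `int64_t`, which is implementation-defined before C23.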

uint8x8_t vsqadd_u8 (uint8x8_t a, int8x8_t b)Unsigned saturating accumulate of signed value

Description

Unsigned saturating Accumulate of Signed value. This instruction adds the signed integer values of the vector elements in the source SIMD&FP register to corresponding unsigned integer values of the vector elements in the destination SIMD&FP register, and writes the resulting unsigned integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.

A64 Instruction

USQADD Vd.8B,Vn.8B

Argument Preparation

a → Vd.8B 

b → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64
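
USQADD is the mirror image of SUQADD: the accumulator `a` (in Vd) is unsigned, the addend `b` (in Vn) is signed, and the result is clamped to the unsigned range, so it can saturate at either 0 or the maximum. A minimal portable sketch of one 8-bit lane follows; `model_usqadd_u8` and `qc` are hypothetical names.

```c
#include <stdint.h>

/* Portable model of one 8-bit USQADD lane: unsigned acc plus signed b. */
uint8_t model_usqadd_u8(uint8_t acc, int8_t b, int *qc)
{
    int sum = (int)acc + (int)b;    /* exact: range is -128..382 */
    if (sum > UINT8_MAX) {
        *qc = 1;                    /* stands in for FPSR.QC */
        return UINT8_MAX;
    }
    if (sum < 0) {
        *qc = 1;
        return 0;
    }
    return (uint8_t)sum;
}
```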

uint8x16_t vsqaddq_u8 (uint8x16_t a, int8x16_t b)Unsigned saturating accumulate of signed value

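Description

Unsigned saturating Accumulate of Signed value. This instruction adds the signed integer values of the vector elements in the source SIMD&FP register to corresponding unsigned integer values of the vector elements in the destination SIMD&FP register, and writes the resulting unsigned integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.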

A64 Instruction

USQADD Vd.16B,Vn.16B

Argument Preparation

a → Vd.16B 

b → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64

uint16x4_t vsqadd_u16 (uint16x4_t a, int16x4_t b)Unsigned saturating accumulate of signed value

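Description

Unsigned saturating Accumulate of Signed value. This instruction adds the signed integer values of the vector elements in the source SIMD&FP register to corresponding unsigned integer values of the vector elements in the destination SIMD&FP register, and writes the resulting unsigned integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.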

A64 Instruction

USQADD Vd.4H,Vn.4H

Argument Preparation

a → Vd.4H 

b → Vn.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64

uint16x8_t vsqaddq_u16 (uint16x8_t a, int16x8_t b)Unsigned saturating accumulate of signed value

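Description

Unsigned saturating Accumulate of Signed value. This instruction adds the signed integer values of the vector elements in the source SIMD&FP register to corresponding unsigned integer values of the vector elements in the destination SIMD&FP register, and writes the resulting unsigned integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.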

A64 Instruction

USQADD Vd.8H,Vn.8H

Argument Preparation

a → Vd.8H 

b → Vn.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64

uint32x2_t vsqadd_u32 (uint32x2_t a, int32x2_t b)Unsigned saturating accumulate of signed value

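Description

Unsigned saturating Accumulate of Signed value. This instruction adds the signed integer values of the vector elements in the source SIMD&FP register to corresponding unsigned integer values of the vector elements in the destination SIMD&FP register, and writes the resulting unsigned integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.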

A64 Instruction

USQADD Vd.2S,Vn.2S

Argument Preparation

a → Vd.2S 

b → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64

uint32x4_t vsqaddq_u32 (uint32x4_t a, int32x4_t b)Unsigned saturating accumulate of signed value

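Description

Unsigned saturating Accumulate of Signed value. This instruction adds the signed integer values of the vector elements in the source SIMD&FP register to corresponding unsigned integer values of the vector elements in the destination SIMD&FP register, and writes the resulting unsigned integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.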

A64 Instruction

USQADD Vd.4S,Vn.4S

Argument Preparation

a → Vd.4S 

b → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64

uint64x1_t vsqadd_u64 (uint64x1_t a, int64x1_t b)Unsigned saturating accumulate of signed value

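Description

Unsigned saturating Accumulate of Signed value. This instruction adds the signed integer values of the vector elements in the source SIMD&FP register to corresponding unsigned integer values of the vector elements in the destination SIMD&FP register, and writes the resulting unsigned integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.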

A64 Instruction

USQADD Dd,Dn

Argument Preparation

a → Dd 

b → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64

uint64x2_t vsqaddq_u64 (uint64x2_t a, int64x2_t b)Unsigned saturating accumulate of signed value

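Description

Unsigned saturating Accumulate of Signed value. This instruction adds the signed integer values of the vector elements in the source SIMD&FP register to corresponding unsigned integer values of the vector elements in the destination SIMD&FP register, and writes the resulting unsigned integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.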

A64 Instruction

USQADD Vd.2D,Vn.2D

Argument Preparation

a → Vd.2D 

b → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64

uint8_t vsqaddb_u8 (uint8_t a, int8_t b)Unsigned saturating accumulate of signed value

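Description

Unsigned saturating Accumulate of Signed value. This instruction adds the signed integer values of the vector elements in the source SIMD&FP register to corresponding unsigned integer values of the vector elements in the destination SIMD&FP register, and writes the resulting unsigned integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.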

A64 Instruction

USQADD Bd,Bn

Argument Preparation

a → Bd 

b → Bn

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64

uint16_t vsqaddh_u16 (uint16_t a, int16_t b)Unsigned saturating accumulate of signed value

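Description

Unsigned saturating Accumulate of Signed value. This instruction adds the signed integer values of the vector elements in the source SIMD&FP register to corresponding unsigned integer values of the vector elements in the destination SIMD&FP register, and writes the resulting unsigned integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.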

A64 Instruction

USQADD Hd,Hn

Argument Preparation

a → Hd 

b → Hn

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64

uint32_t vsqadds_u32 (uint32_t a, int32_t b)Unsigned saturating accumulate of signed value

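Description

Unsigned saturating Accumulate of Signed value. This instruction adds the signed integer values of the vector elements in the source SIMD&FP register to corresponding unsigned integer values of the vector elements in the destination SIMD&FP register, and writes the resulting unsigned integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.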

A64 Instruction

USQADD Sd,Sn

Argument Preparation

a → Sd 

b → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64

uint64_t vsqaddd_u64 (uint64_t a, int64_t b)Unsigned saturating accumulate of signed value

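Description

Unsigned saturating Accumulate of Signed value. This instruction adds the signed integer values of the vector elements in the source SIMD&FP register to corresponding unsigned integer values of the vector elements in the destination SIMD&FP register, and writes the resulting unsigned integer values to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.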

A64 Instruction

USQADD Dd,Dn

Argument Preparation

a → Dd 

b → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(datasize) operand2 = V[d];
integer op1;
integer op2;
boolean sat;

for e = 0 to elements-1
    op1 = Int(Elem[operand, e, esize], !unsigned);
    op2 = Int(Elem[operand2, e, esize], unsigned);
    (Elem[result, e, esize], sat) = SatQ(op1 + op2, esize, unsigned);
    if sat then FPSR.QC = '1';
V[d] = result;

Supported architectures

A64
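
As with the other 64-bit forms, a portable model of the 64-bit USQADD cannot widen the operands. The sketch below instead splits on the sign of `b` and works with its magnitude in unsigned arithmetic; `model_usqadd_u64` and `qc` are hypothetical names.

```c
#include <stdint.h>

/* Portable model of the 64-bit USQADD behind vsqaddd_u64. */
uint64_t model_usqadd_u64(uint64_t acc, int64_t b, int *qc)
{
    if (b >= 0) {
        uint64_t sum = acc + (uint64_t)b;   /* wraps modulo 2^64 on overflow */
        if (sum < acc) {
            *qc = 1;                        /* stands in for FPSR.QC */
            return UINT64_MAX;
        }
        return sum;
    }
    /* |b| computed in unsigned arithmetic; also correct for INT64_MIN. */
    uint64_t mag = (uint64_t)0 - (uint64_t)b;
    if (mag > acc) {
        *qc = 1;
        return 0;
    }
    return acc - mag;
}
```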

int8x8_t vaddhn_s16 (int16x8_t a, int16x8_t b)Add returning high narrow

Description

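Add returning High Narrow. This instruction adds each vector element in the first source SIMD&FP register to the corresponding vector element in the second source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are truncated. For rounded results, see RADDHN. ADDHN writes the lower half of the destination register and clears the upper half; ADDHN2 writes the upper half and leaves the lower half unchanged.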

A64 Instruction

ADDHN Vd.8B,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64
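
In the pseudocode above, `round` and `sub_op` are both false for ADDHN, so each lane is a wrapping 16-bit add followed by keeping bits 15:8. The following portable sketch models `vaddhn_s16` under that reading; `as_s8` and `model_vaddhn_s16` are hypothetical helper names.

```c
#include <stdint.h>

/* Reinterpret an 8-bit pattern as a two's-complement signed value. */
int8_t as_s8(uint8_t bits)
{
    return (int8_t)(bits < 128 ? (int)bits : (int)bits - 256);
}

/* Model of vaddhn_s16: wrapping 16-bit add, keep the high byte of each lane. */
void model_vaddhn_s16(const int16_t a[8], const int16_t b[8], int8_t out[8])
{
    for (int e = 0; e < 8; e++) {
        /* Add as unsigned so the sum wraps modulo 2^16, like the instruction. */
        uint16_t sum = (uint16_t)((uint16_t)a[e] + (uint16_t)b[e]);
        out[e] = as_s8((uint8_t)(sum >> 8));   /* truncating, no rounding */
    }
}
```

There is no saturation here: 0x7FFF + 1 wraps to 0x8000, whose high byte reads as -128.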

int16x4_t vaddhn_s32 (int32x4_t a, int32x4_t b)Add returning high narrow

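Description

Add returning High Narrow. This instruction adds each vector element in the first source SIMD&FP register to the corresponding vector element in the second source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are truncated. For rounded results, see RADDHN. ADDHN writes the lower half of the destination register and clears the upper half; ADDHN2 writes the upper half and leaves the lower half unchanged.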

A64 Instruction

ADDHN Vd.4H,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

int32x2_t vaddhn_s64 (int64x2_t a, int64x2_t b)Add returning high narrow

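Description

Add returning High Narrow. This instruction adds each vector element in the first source SIMD&FP register to the corresponding vector element in the second source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are truncated. For rounded results, see RADDHN. ADDHN writes the lower half of the destination register and clears the upper half; ADDHN2 writes the upper half and leaves the lower half unchanged.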

A64 Instruction

ADDHN Vd.2S,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint8x8_t vaddhn_u16 (uint16x8_t a, uint16x8_t b)Add returning high narrow

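Description

Add returning High Narrow. This instruction adds each vector element in the first source SIMD&FP register to the corresponding vector element in the second source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are truncated. For rounded results, see RADDHN. ADDHN writes the lower half of the destination register and clears the upper half; ADDHN2 writes the upper half and leaves the lower half unchanged.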

A64 Instruction

ADDHN Vd.8B,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint16x4_t vaddhn_u32 (uint32x4_t a, uint32x4_t b)Add returning high narrow

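Description

Add returning High Narrow. This instruction adds each vector element in the first source SIMD&FP register to the corresponding vector element in the second source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are truncated. For rounded results, see RADDHN. ADDHN writes the lower half of the destination register and clears the upper half; ADDHN2 writes the upper half and leaves the lower half unchanged.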

A64 Instruction

ADDHN Vd.4H,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint32x2_t vaddhn_u64 (uint64x2_t a, uint64x2_t b)Add returning high narrow

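Description

Add returning High Narrow. This instruction adds each vector element in the first source SIMD&FP register to the corresponding vector element in the second source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are truncated. For rounded results, see RADDHN. ADDHN writes the lower half of the destination register and clears the upper half; ADDHN2 writes the upper half and leaves the lower half unchanged.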

A64 Instruction

ADDHN Vd.2S,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

int8x16_t vaddhn_high_s16 (int8x8_t r, int16x8_t a, int16x8_t b)Add returning high narrow

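Description

Add returning High Narrow. This instruction adds each vector element in the first source SIMD&FP register to the corresponding vector element in the second source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are truncated. For rounded results, see RADDHN. ADDHN writes the lower half of the destination register and clears the upper half; ADDHN2 writes the upper half and leaves the lower half unchanged.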

A64 Instruction

ADDHN2 Vd.16B,Vn.8H,Vm.8H

Argument Preparation

r → Vd.8B 

a → Vn.8H 

b → Vm.8H

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64

int16x8_t vaddhn_high_s32 (int16x4_t r, int32x4_t a, int32x4_t b)Add returning high narrow

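Description

Add returning High Narrow. This instruction adds each vector element in the first source SIMD&FP register to the corresponding vector element in the second source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are truncated. For rounded results, see RADDHN. ADDHN writes the lower half of the destination register and clears the upper half; ADDHN2 writes the upper half and leaves the lower half unchanged.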

A64 Instruction

ADDHN2 Vd.8H,Vn.4S,Vm.4S

Argument Preparation

r → Vd.4H 

a → Vn.4S 

b → Vm.4S

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64

int32x4_t vaddhn_high_s64 (int32x2_t r, int64x2_t a, int64x2_t b)Add returning high narrow

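Description

Add returning High Narrow. This instruction adds each vector element in the first source SIMD&FP register to the corresponding vector element in the second source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are truncated. For rounded results, see RADDHN. ADDHN writes the lower half of the destination register and clears the upper half; ADDHN2 writes the upper half and leaves the lower half unchanged.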

A64 Instruction

ADDHN2 Vd.4S,Vn.2D,Vm.2D

Argument Preparation

r → Vd.2S 

a → Vn.2D 

b → Vm.2D

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64

uint8x16_t vaddhn_high_u16 (uint8x8_t r, uint16x8_t a, uint16x8_t b)Add returning high narrow

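Description

Add returning High Narrow. This instruction adds each vector element in the first source SIMD&FP register to the corresponding vector element in the second source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are truncated. For rounded results, see RADDHN. ADDHN writes the lower half of the destination register and clears the upper half; ADDHN2 writes the upper half and leaves the lower half unchanged.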

A64 Instruction

ADDHN2 Vd.16B,Vn.8H,Vm.8H

Argument Preparation

r → Vd.8B 

a → Vn.8H 

b → Vm.8H

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64
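
The `_high` intrinsics map to ADDHN2, which keeps the existing lower half of the destination (the `r` argument) and writes the narrowed sums into the upper half. A minimal portable sketch of `vaddhn_high_u16` under that reading follows; `model_vaddhn_high_u16` is a hypothetical name.

```c
#include <stdint.h>

/* Model of vaddhn_high_u16: lanes 0..7 come from r, lanes 8..15 hold the
   high byte of each wrapping 16-bit sum a[e] + b[e]. */
void model_vaddhn_high_u16(const uint8_t r[8], const uint16_t a[8],
                           const uint16_t b[8], uint8_t out[16])
{
    for (int e = 0; e < 8; e++) {
        uint16_t sum = (uint16_t)(a[e] + b[e]);  /* wraps modulo 2^16 */
        out[e] = r[e];                           /* lower half preserved */
        out[8 + e] = (uint8_t)(sum >> 8);        /* upper half: bits 15:8 */
    }
}
```

A common pattern is to narrow two pairs of wide vectors into one full-width result: the first pair with `vaddhn_u16`, the second with `vaddhn_high_u16` passing the first result as `r`.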

uint16x8_t vaddhn_high_u32 (uint16x4_t r, uint32x4_t a, uint32x4_t b)Add returning high narrow

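Description

Add returning High Narrow. This instruction adds each vector element in the first source SIMD&FP register to the corresponding vector element in the second source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are truncated. For rounded results, see RADDHN. ADDHN writes the lower half of the destination register and clears the upper half; ADDHN2 writes the upper half and leaves the lower half unchanged.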

A64 Instruction

ADDHN2 Vd.8H,Vn.4S,Vm.4S

Argument Preparation

r → Vd.4H 

a → Vn.4S 

b → Vm.4S

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64

uint32x4_t vaddhn_high_u64 (uint32x2_t r, uint64x2_t a, uint64x2_t b)Add returning high narrow

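Description

Add returning High Narrow. This instruction adds each vector element in the first source SIMD&FP register to the corresponding vector element in the second source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are truncated. For rounded results, see RADDHN. ADDHN writes the lower half of the destination register and clears the upper half; ADDHN2 writes the upper half and leaves the lower half unchanged.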

A64 Instruction

ADDHN2 Vd.4S,Vn.2D,Vm.2D

Argument Preparation

r → Vd.2S 

a → Vn.2D 

b → Vm.2D

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64

int8x8_t vraddhn_s16 (int16x8_t a, int16x8_t b)Rounding add returning high narrow

Description

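Rounding Add returning High Narrow. This instruction adds each vector element in the first source SIMD&FP register to the corresponding vector element in the second source SIMD&FP register, rounds the sum by adding 1 << (esize - 1), places the most significant half of the result into a vector, and writes the vector to the lower half of the destination SIMD&FP register. RADDHN2, used by the vraddhn_high intrinsics, writes the vector to the upper half of the destination register instead. For truncated results, see ADDHN.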

A64 Instruction

RADDHN Vd.8B,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64
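
The effect of round_const is easiest to see on a single lane. The sketch below models one lane of the 16-bit to 8-bit narrowing add, with a flag that selects between the truncating (ADDHN) and rounding (RADDHN) behavior; the function name and the round parameter are illustrative assumptions that mirror the pseudocode variable, not an Arm API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical one-lane model of vaddhn_s16 (round = 0) and
 * vraddhn_s16 (round = 1), with esize = 8. */
static int8_t addhn_lane_s16_model(int16_t a, int16_t b, int round)
{
    assert(round == 0 || round == 1);
    /* Add the raw 16-bit patterns; the result wraps modulo 2^16. */
    uint16_t sum = (uint16_t)((uint16_t)a + (uint16_t)b);
    if (round)
        sum = (uint16_t)(sum + 0x80u);  /* round_const = 1 << (esize - 1) */
    int high = sum >> 8;                /* sum<15:8>, a value in 0..255 */
    /* Reinterpret the 8-bit pattern as a signed element. */
    return (int8_t)(high < 128 ? high : high - 256);
}
```

For example, a sum of 0x00C0 narrows to 0 when truncated and to 1 when rounded, because adding 0x80 carries into the high byte.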

int16x4_t vraddhn_s32 (int32x4_t a, int32x4_t b)Rounding add returning high narrow

Description

A64 Instruction

RADDHN Vd.4H,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

int32x2_t vraddhn_s64 (int64x2_t a, int64x2_t b)Rounding add returning high narrow

Description

A64 Instruction

RADDHN Vd.2S,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint8x8_t vraddhn_u16 (uint16x8_t a, uint16x8_t b)Rounding add returning high narrow

Description

A64 Instruction

RADDHN Vd.8B,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint16x4_t vraddhn_u32 (uint32x4_t a, uint32x4_t b)Rounding add returning high narrow

Description

A64 Instruction

RADDHN Vd.4H,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint32x2_t vraddhn_u64 (uint64x2_t a, uint64x2_t b)Rounding add returning high narrow

Description

A64 Instruction

RADDHN Vd.2S,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

int8x16_t vraddhn_high_s16 (int8x8_t r, int16x8_t a, int16x8_t b)Rounding add returning high narrow

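Description

Rounding Add returning High Narrow. This instruction adds each vector element in the first source SIMD&FP register to the corresponding vector element in the second source SIMD&FP register, rounds the sum, places the most significant half of the result into a vector, and writes the vector to the upper half of the destination SIMD&FP register. The lower half of the destination register, supplied by r, is left unchanged.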

A64 Instruction

RADDHN2 Vd.16B,Vn.8H,Vm.8H

Argument Preparation

r → Vd.8B 

a → Vn.8H 

b → Vm.8H

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64

int16x8_t vraddhn_high_s32 (int16x4_t r, int32x4_t a, int32x4_t b)Rounding add returning high narrow

Description

A64 Instruction

RADDHN2 Vd.8H,Vn.4S,Vm.4S

Argument Preparation

r → Vd.4H 

a → Vn.4S 

b → Vm.4S

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64

int32x4_t vraddhn_high_s64 (int32x2_t r, int64x2_t a, int64x2_t b)Rounding add returning high narrow

Description

A64 Instruction

RADDHN2 Vd.4S,Vn.2D,Vm.2D

Argument Preparation

r → Vd.2S 

a → Vn.2D 

b → Vm.2D

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64

uint8x16_t vraddhn_high_u16 (uint8x8_t r, uint16x8_t a, uint16x8_t b)Rounding add returning high narrow

Description

A64 Instruction

RADDHN2 Vd.16B,Vn.8H,Vm.8H

Argument Preparation

r → Vd.8B 

a → Vn.8H 

b → Vm.8H

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64

uint16x8_t vraddhn_high_u32 (uint16x4_t r, uint32x4_t a, uint32x4_t b)Rounding add returning high narrow

Description

A64 Instruction

RADDHN2 Vd.8H,Vn.4S,Vm.4S

Argument Preparation

r → Vd.4H 

a → Vn.4S 

b → Vm.4S

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64

uint32x4_t vraddhn_high_u64 (uint32x2_t r, uint64x2_t a, uint64x2_t b)Rounding add returning high narrow

Description

A64 Instruction

RADDHN2 Vd.4S,Vn.2D,Vm.2D

Argument Preparation

r → Vd.2S 

a → Vn.2D 

b → Vm.2D

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64
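
For the widest variant, the sum and the rounding addition both wrap at 64 bits, since sum is declared as bits(2*esize) in the pseudocode. A minimal scalar sketch of vraddhn_high_u64 under that reading follows; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of vraddhn_high_u64 (RADDHN2 Vd.4S,Vn.2D,Vm.2D).
 * out[0..1] receives r, out[2..3] receives the rounded high halves. */
static void raddhn_high_u64_model(uint32_t out[4], const uint32_t r[2],
                                  const uint64_t a[2], const uint64_t b[2])
{
    assert(out && r && a && b);
    for (size_t e = 0; e < 2; ++e) {
        uint64_t sum = a[e] + b[e];         /* wraps modulo 2^64 */
        sum += UINT64_C(1) << 31;           /* round_const for esize = 32 */
        out[e] = r[e];                      /* lower half: copied from r */
        out[2 + e] = (uint32_t)(sum >> 32); /* upper half: sum<63:32> */
    }
}
```

Note that a sum such as 0xFFFFFFFF80000000 rounds up past the top of the 64-bit range, so its narrowed result is 0 rather than a saturated value.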

int8x8_t vmul_s8 (int8x8_t a, int8x8_t b)Multiply

Description

Multiply (vector). This instruction multiplies corresponding elements in the vectors of the two source SIMD&FP registers, places the results in a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

MUL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64
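
Although the pseudocode multiplies UInt values, the low esize bits of a product are the same for signed and unsigned operands, which is why one MUL instruction serves both vmul_s8 and vmul_u8. The sketch below models the signed 8-bit case in portable C; the function name mul_s8_model is a hypothetical label for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of vmul_s8 (MUL Vd.8B,Vn.8B,Vm.8B). */
static void mul_s8_model(int8_t out[8], const int8_t a[8], const int8_t b[8])
{
    assert(out && a && b);
    for (size_t e = 0; e < 8; ++e) {
        /* UInt(element1) * UInt(element2) on the raw 8-bit patterns. */
        unsigned product = (unsigned)(uint8_t)a[e] * (unsigned)(uint8_t)b[e];
        unsigned low = product & 0xFFu;   /* product<7:0> */
        /* Reinterpret the 8-bit pattern as a signed element. */
        out[e] = (int8_t)(low < 128u ? (int)low : (int)low - 256);
    }
}
```

Results that do not fit in 8 bits wrap rather than saturate: 100 * 2 yields -56 in this model.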

int8x16_t vmulq_s8 (int8x16_t a, int8x16_t b)Multiply

Description

A64 Instruction

MUL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vmul_s16 (int16x4_t a, int16x4_t b)Multiply

Description

A64 Instruction

MUL Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vmulq_s16 (int16x8_t a, int16x8_t b)Multiply

Description

A64 Instruction

MUL Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vmul_s32 (int32x2_t a, int32x2_t b)Multiply

Description

A64 Instruction

MUL Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vmulq_s32 (int32x4_t a, int32x4_t b)Multiply

Description

A64 Instruction

MUL Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vmul_u8 (uint8x8_t a, uint8x8_t b)Multiply

Description

A64 Instruction

MUL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vmulq_u8 (uint8x16_t a, uint8x16_t b)Multiply

Description

A64 Instruction

MUL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vmul_u16 (uint16x4_t a, uint16x4_t b)Multiply

Description

A64 Instruction

MUL Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vmulq_u16 (uint16x8_t a, uint16x8_t b)Multiply

Description

A64 Instruction

MUL Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vmul_u32 (uint32x2_t a, uint32x2_t b)Multiply

Description

A64 Instruction

MUL Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vmulq_u32 (uint32x4_t a, uint32x4_t b)Multiply

Description

A64 Instruction

MUL Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vmul_f32 (float32x2_t a, float32x2_t b)Floating-point multiply

Description

Floating-point Multiply (vector). This instruction multiplies corresponding floating-point values in the vectors in the two source SIMD&FP registers, places the result in a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

FMUL Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64

float32x4_t vmulq_f32 (float32x4_t a, float32x4_t b)Floating-point multiply

Description

A64 Instruction

FMUL Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64

poly8x8_t vmul_p8 (poly8x8_t a, poly8x8_t b)Polynomial multiply

Description

Polynomial Multiply. This instruction multiplies corresponding elements in the vectors of the two source SIMD&FP registers using polynomial arithmetic over {0, 1}, places the low-order half of each product (the low esize bits) in a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

PMUL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64
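
PolynomialMult in the pseudocode is a carry-less multiplication: partial products are combined with exclusive OR instead of addition. A minimal scalar sketch of vmul_p8 under that definition follows, with poly8 elements represented as uint8_t; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of vmul_p8 (PMUL Vd.8B,Vn.8B,Vm.8B). */
static void pmul_p8_model(uint8_t out[8], const uint8_t a[8], const uint8_t b[8])
{
    assert(out && a && b);
    for (size_t e = 0; e < 8; ++e) {
        unsigned product = 0;
        for (unsigned bit = 0; bit < 8; ++bit) {
            /* XOR in a shifted copy of a[e] for every set bit of b[e]. */
            if (b[e] & (1u << bit))
                product ^= (unsigned)a[e] << bit;
        }
        out[e] = (uint8_t)(product & 0xFFu);  /* keep product<7:0> */
    }
}
```

As a check, 0x03 times 0x03 gives 0x05 here, since (x + 1)^2 = x^2 + 1 over {0, 1}, whereas integer multiplication would give 0x09.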

poly8x16_t vmulq_p8 (poly8x16_t a, poly8x16_t b)Polynomial multiply

Description

A64 Instruction

PMUL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

float64x1_t vmul_f64 (float64x1_t a, float64x1_t b)Floating-point multiply

Description

A64 Instruction

FMUL Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vmulq_f64 (float64x2_t a, float64x2_t b)Floating-point multiply

Description

A64 Instruction

FMUL Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32x2_t vmulx_f32 (float32x2_t a, float32x2_t b)Floating-point multiply extended (by element)

Description

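Floating-point Multiply extended. In the vector form used by this intrinsic, the instruction multiplies corresponding floating-point values in the two source SIMD&FP registers, places the results in a vector, and writes the vector to the destination SIMD&FP register. In the by-element form used by the lane and laneq intrinsics, it multiplies each element of the first source SIMD&FP register by the specified floating-point value in the second source SIMD&FP register. If one value is zero and the other is infinite, the result is 2.0, and it is negative if exactly one of the values is negative.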

A64 Instruction

FMULX Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2 = Elem[operand2, index, esize];

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    if mulx_op then
        Elem[result, e, esize] = FPMulX(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64
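
FPMulX differs from FPMul only when one operand is zero and the other is infinite: the result is then 2.0 with the exclusive OR of the operand signs, instead of a NaN. The sketch below models that rule for vmulx_f32 in portable C. It assumes float is IEEE 754 binary32 and ignores FPCR settings such as flush-to-zero and non-default rounding modes; all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Raw bit pattern of a float; assumes float is IEEE 754 binary32. */
static uint32_t f32_bits(float x)
{
    uint32_t u;
    assert(sizeof u == sizeof x);
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Hypothetical one-element model of FPMulX for single precision. */
static float fmulx_f32_scalar_model(float a, float b)
{
    const uint32_t inf_bits = 0x7F800000u;
    uint32_t ua = f32_bits(a), ub = f32_bits(b);
    uint32_t mag_a = ua & 0x7FFFFFFFu, mag_b = ub & 0x7FFFFFFFu;

    /* zero * infinity: return 2.0 with sign = sign(a) XOR sign(b). */
    if ((mag_a == 0 && mag_b == inf_bits) || (mag_a == inf_bits && mag_b == 0))
        return ((ua ^ ub) >> 31) ? -2.0f : 2.0f;
    return a * b;  /* every other case behaves like an ordinary multiply */
}

/* Hypothetical model of vmulx_f32: corresponding lanes are multiplied. */
static void vmulx_f32_model(float out[2], const float a[2], const float b[2])
{
    for (size_t e = 0; e < 2; ++e)
        out[e] = fmulx_f32_scalar_model(a[e], b[e]);
}
```

The sketch should be built without value-unsafe floating-point options (for example -ffast-math), which may remove the distinction between signed zeros or infinities.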

float32x4_t vmulxq_f32 (float32x4_t a, float32x4_t b)Floating-point multiply extended (by element)

Description

A64 Instruction

FMULX Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2 = Elem[operand2, index, esize];

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    if mulx_op then
        Elem[result, e, esize] = FPMulX(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x1_t vmulx_f64 (float64x1_t a, float64x1_t b)Floating-point multiply extended (by element)

Description

A64 Instruction

FMULX Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2 = Elem[operand2, index, esize];

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    if mulx_op then
        Elem[result, e, esize] = FPMulX(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vmulxq_f64 (float64x2_t a, float64x2_t b)Floating-point multiply extended (by element)

Description

A64 Instruction

FMULX Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2 = Elem[operand2, index, esize];

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    if mulx_op then
        Elem[result, e, esize] = FPMulX(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32_t vmulxs_f32 (float32_t a, float32_t b)Floating-point multiply extended (by element)

Description

A64 Instruction

FMULX Sd,Sn,Sm

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2 = Elem[operand2, index, esize];

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    if mulx_op then
        Elem[result, e, esize] = FPMulX(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64_t vmulxd_f64 (float64_t a, float64_t b)Floating-point multiply extended (by element)

Description

A64 Instruction

FMULX Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2 = Elem[operand2, index, esize];

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    if mulx_op then
        Elem[result, e, esize] = FPMulX(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32x2_t vmulx_lane_f32 (float32x2_t a, float32x2_t v, const int lane)Floating-point multiply extended (by element)

Description

A64 Instruction

FMULX Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vn.2S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2 = Elem[operand2, index, esize];

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    if mulx_op then
        Elem[result, e, esize] = FPMulX(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64
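
In the by-element form, a single element of v is selected once and then multiplied with every element of a, as element2 is in the pseudocode. The following is a minimal sketch of vmulx_lane_f32 under the same assumptions as any portable model (IEEE 754 binary32 float, FPCR effects ignored); the function names are hypothetical, and in the real intrinsic lane must be a compile-time constant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical one-element model of FPMulX for single precision. */
static float mulx_model(float a, float b)
{
    uint32_t ua, ub;
    memcpy(&ua, &a, sizeof ua);
    memcpy(&ub, &b, sizeof ub);
    uint32_t mag_a = ua & 0x7FFFFFFFu, mag_b = ub & 0x7FFFFFFFu;

    /* zero * infinity: return 2.0 with sign = sign(a) XOR sign(b). */
    if ((mag_a == 0 && mag_b == 0x7F800000u) ||
        (mag_a == 0x7F800000u && mag_b == 0))
        return ((ua ^ ub) >> 31) ? -2.0f : 2.0f;
    return a * b;
}

/* Hypothetical model of vmulx_lane_f32 (FMULX Vd.2S,Vn.2S,Vm.S[lane]). */
static void vmulx_lane_f32_model(float out[2], const float a[2],
                                 const float v[2], int lane)
{
    assert(lane >= 0 && lane <= 1);  /* v has two elements */
    const float element2 = v[lane];  /* selected once, before the loop */
    for (size_t e = 0; e < 2; ++e)
        out[e] = mulx_model(a[e], element2);
}
```

With a = {1.5, -2.0} and v = {10.0, 4.0}, lane 1 yields {6.0, -8.0} and lane 0 yields {15.0, -20.0}.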

float32x4_t vmulxq_lane_f32 (float32x4_t a, float32x2_t v, const int lane)Floating-point multiply extended (by element)

Description

A64 Instruction

FMULX Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vn.4S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2 = Elem[operand2, index, esize];

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    if mulx_op then
        Elem[result, e, esize] = FPMulX(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x1_t vmulx_lane_f64 (float64x1_t a, float64x1_t v, const int lane)Floating-point multiply extended (by element)

Description

A64 Instruction

FMULX Dd,Dn,Vm.D[lane]

Argument Preparation

a → Dn 

v → Vm.1D 

0 <= lane <= 0

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2 = Elem[operand2, index, esize];

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    if mulx_op then
        Elem[result, e, esize] = FPMulX(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vmulxq_lane_f64 (float64x2_t a, float64x1_t v, const int lane)Floating-point multiply extended (by element)

Description

A64 Instruction

FMULX Vd.2D,Vn.2D,Vm.D[lane]

Argument Preparation

a → Vn.2D 

v → Vm.1D 

0 <= lane <= 0

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2 = Elem[operand2, index, esize];

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    if mulx_op then
        Elem[result, e, esize] = FPMulX(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32_t vmulxs_lane_f32 (float32_t a, float32x2_t v, const int lane)Floating-point multiply extended (by element)

Description

A64 Instruction

FMULX Sd,Sn,Vm.S[lane]

Argument Preparation

a → Sn 

v → Vm.2S 

0 <= lane <= 1

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2 = Elem[operand2, index, esize];

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    if mulx_op then
        Elem[result, e, esize] = FPMulX(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64_t vmulxd_lane_f64 (float64_t a, float64x1_t v, const int lane)Floating-point multiply extended (by element)

Description

A64 Instruction

FMULX Dd,Dn,Vm.D[lane]

Argument Preparation

a → Dn 

v → Vm.1D 

0 <= lane <= 0

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2 = Elem[operand2, index, esize];

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    if mulx_op then
        Elem[result, e, esize] = FPMulX(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32x2_t vmulx_laneq_f32 (float32x2_t a, float32x4_t v, const int lane)Floating-point multiply extended (by element)

Description

A64 Instruction

FMULX Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vn.2S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2 = Elem[operand2, index, esize];

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    if mulx_op then
        Elem[result, e, esize] = FPMulX(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64
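
In the by-element form, element2 is read once from v[lane] and then reused for every element of a. The sketch below models that broadcast for the 2-element by 4-element case. It is a portable scalar approximation, not the intrinsic itself: the function names are hypothetical, and FPCR-dependent behaviour (rounding mode, flush-to-zero) is assumed to match the host.

```c
#include <assert.h>
#include <math.h>

/* Single-precision model of FPMulX: zero * infinity gives +/-2.0,
 * everything else is assumed to match an ordinary multiply. */
static float fmulx_f32_model(float a, float b)
{
    if ((a == 0.0f && isinf(b)) || (isinf(a) && b == 0.0f))
        return (!signbit(a) != !signbit(b)) ? -2.0f : 2.0f;
    return a * b;
}

/* Hypothetical model of vmulx_laneq_f32: each element of a is multiplied
 * by the single element v[lane]. */
static void vmulx_laneq_f32_model(const float a[2], const float v[4],
                                  int lane, float out[2])
{
    assert(lane >= 0 && lane <= 3);   /* mirrors the 0 <= lane <= 3 constraint */
    float scalar = v[lane];           /* read once, reused for every element */
    for (int e = 0; e < 2; e++)
        out[e] = fmulx_f32_model(a[e], scalar);
}
```

The real intrinsic requires lane to be a compile-time constant; the run-time assert here only stands in for that check.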

float32x4_t vmulxq_laneq_f32 (float32x4_t a, float32x4_t v, const int lane)Floating-point multiply extended (by element)

Description

A64 Instruction

FMULX Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vn.4S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2 = Elem[operand2, index, esize];

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    if mulx_op then
        Elem[result, e, esize] = FPMulX(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x1_t vmulx_laneq_f64 (float64x1_t a, float64x2_t v, const int lane)Floating-point multiply extended (by element)

Description

A64 Instruction

FMULX Dd,Dn,Vm.D[lane]

Argument Preparation

a → Dn 

v → Vm.2D 

0 <= lane <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2 = Elem[operand2, index, esize];

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    if mulx_op then
        Elem[result, e, esize] = FPMulX(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vmulxq_laneq_f64 (float64x2_t a, float64x2_t v, const int lane)Floating-point multiply extended (by element)

Description

A64 Instruction

FMULX Vd.2D,Vn.2D,Vm.D[lane]

Argument Preparation

a → Vn.2D 

v → Vm.2D 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2 = Elem[operand2, index, esize];

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    if mulx_op then
        Elem[result, e, esize] = FPMulX(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32_t vmulxs_laneq_f32 (float32_t a, float32x4_t v, const int lane)Floating-point multiply extended (by element)

Description

A64 Instruction

FMULX Sd,Sn,Vm.S[lane]

Argument Preparation

a → Sn 

v → Vm.4S 

0 <= lane <= 3

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2 = Elem[operand2, index, esize];

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    if mulx_op then
        Elem[result, e, esize] = FPMulX(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64_t vmulxd_laneq_f64 (float64_t a, float64x2_t v, const int lane)Floating-point multiply extended (by element)

Description

A64 Instruction

FMULX Dd,Dn,Vm.D[lane]

Argument Preparation

a → Dn 

v → Vm.2D 

0 <= lane <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(idxdsize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2 = Elem[operand2, index, esize];

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    if mulx_op then
        Elem[result, e, esize] = FPMulX(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32x2_t vdiv_f32 (float32x2_t a, float32x2_t b)Floating-point divide

Description

Floating-point Divide (vector). This instruction divides the floating-point values in the elements in the first source SIMD&FP register, by the floating-point values in the corresponding elements in the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

FDIV Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPDiv(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64
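
The lane-wise behaviour described above can be sketched in portable C as follows. This is a scalar model under stated assumptions, not the intrinsic: vdiv_f32_model is a hypothetical name, and rounding and exception behaviour are those of the host rather than of FPCR.

```c
/* Hypothetical scalar model of vdiv_f32: element e of the result is
 * a[e] / b[e], with no interaction between lanes. */
static void vdiv_f32_model(const float a[2], const float b[2], float out[2])
{
    for (int e = 0; e < 2; e++)
        out[e] = a[e] / b[e];
}
```

With an AArch64 toolchain, the same computation is written as vdiv_f32(a, b) after including <arm_neon.h>.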

float32x4_t vdivq_f32 (float32x4_t a, float32x4_t b)Floating-point divide

Description

A64 Instruction

FDIV Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPDiv(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x1_t vdiv_f64 (float64x1_t a, float64x1_t b)Floating-point divide

Description

A64 Instruction

FDIV Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPDiv(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vdivq_f64 (float64x2_t a, float64x2_t b)Floating-point divide

Description

A64 Instruction

FDIV Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPDiv(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

int8x8_t vmla_s8 (int8x8_t a, int8x8_t b, int8x8_t c)Multiply-add to accumulator

Description

Multiply-Add to accumulator (vector). This instruction multiplies corresponding elements in the vectors of the two source SIMD&FP registers, and accumulates the results with the vector elements of the destination SIMD&FP register.

A64 Instruction

MLA Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8B 

b → Vn.8B 

c → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64
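
The pseudocode keeps only the low esize bits of each product and of each sum, so results wrap rather than saturate. A minimal scalar sketch of that behaviour for the 8 x 8-bit case is shown below; the function names are hypothetical and the code is a model, not the intrinsic.

```c
#include <stdint.h>

/* Reinterpret an 8-bit pattern as a signed value without relying on
 * implementation-defined narrowing conversions. */
static int8_t bits_to_i8(uint8_t bits)
{
    return (int8_t)(bits < 0x80 ? (int)bits : (int)bits - 0x100);
}

/* Hypothetical scalar model of vmla_s8: out[e] = a[e] + b[e] * c[e],
 * with the product and the sum both truncated to 8 bits. */
static void vmla_s8_model(const int8_t a[8], const int8_t b[8],
                          const int8_t c[8], int8_t out[8])
{
    for (int e = 0; e < 8; e++) {
        /* Work on raw bit patterns, as the pseudocode does with UInt(). */
        uint8_t product = (uint8_t)((uint8_t)b[e] * (uint8_t)c[e]);
        uint8_t sum = (uint8_t)((uint8_t)a[e] + product);
        out[e] = bits_to_i8(sum);
    }
}
```

For example, 100 + 10 * 10 does not fit in a signed 8-bit lane and wraps to -56.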

int8x16_t vmlaq_s8 (int8x16_t a, int8x16_t b, int8x16_t c)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vd.16B 

b → Vn.16B 

c → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vmla_s16 (int16x4_t a, int16x4_t b, int16x4_t c)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vd.4H 

b → Vn.4H 

c → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vmlaq_s16 (int16x8_t a, int16x8_t b, int16x8_t c)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vd.8H 

b → Vn.8H 

c → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vmla_s32 (int32x2_t a, int32x2_t b, int32x2_t c)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vd.2S 

b → Vn.2S 

c → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vmlaq_s32 (int32x4_t a, int32x4_t b, int32x4_t c)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vd.4S 

b → Vn.4S 

c → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vmla_u8 (uint8x8_t a, uint8x8_t b, uint8x8_t c)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8B 

b → Vn.8B 

c → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vmlaq_u8 (uint8x16_t a, uint8x16_t b, uint8x16_t c)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vd.16B 

b → Vn.16B 

c → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vmla_u16 (uint16x4_t a, uint16x4_t b, uint16x4_t c)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vd.4H 

b → Vn.4H 

c → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vmlaq_u16 (uint16x8_t a, uint16x8_t b, uint16x8_t c)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vd.8H 

b → Vn.8H 

c → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vmla_u32 (uint32x2_t a, uint32x2_t b, uint32x2_t c)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vd.2S 

b → Vn.2S 

c → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vmlaq_u32 (uint32x4_t a, uint32x4_t b, uint32x4_t c)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vd.4S 

b → Vn.4S 

c → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vmla_f32 (float32x2_t a, float32x2_t b, float32x2_t c)Floating-point multiply-add to accumulator

Description

A64 Instruction

RESULT[i] = a[i] + (b[i] * c[i]) for i = 0 to 1

Argument Preparation

a → N/A 

b → N/A 

c → N/A

Results

N/A → result

Supported architectures

v7/A32/A64
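
The formula above can be modelled directly in C. The sketch below reads it literally as a multiply followed by a separate add, which is an assumption about rounding: the entry lists no single A64 instruction, and a fused multiply-add could differ in the last bit. vmla_f32_model is a hypothetical name.

```c
/* Hypothetical scalar model of vmla_f32: out[i] = a[i] + (b[i] * c[i]).
 * Assumption: the product is rounded before the addition. */
static void vmla_f32_model(const float a[2], const float b[2],
                           const float c[2], float out[2])
{
    for (int i = 0; i < 2; i++) {
        /* volatile discourages the compiler from contracting the two
         * operations into one fused multiply-add. */
        volatile float product = b[i] * c[i];
        out[i] = a[i] + product;
    }
}
```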

float32x4_t vmlaq_f32 (float32x4_t a, float32x4_t b, float32x4_t c)Floating-point multiply-add to accumulator

Description

A64 Instruction

RESULT[i] = a[i] + (b[i] * c[i]) for i = 0 to 3

Argument Preparation

a → N/A 

b → N/A 

c → N/A

Results

N/A → result

Supported architectures

v7/A32/A64

float64x1_t vmla_f64 (float64x1_t a, float64x1_t b, float64x1_t c)Floating-point multiply-add to accumulator

Description

A64 Instruction

RESULT[i] = a[i] + (b[i] * c[i]) for i = 0

Argument Preparation

a → N/A 

b → N/A 

c → N/A

Results

N/A → result

Supported architectures

A64

float64x2_t vmlaq_f64 (float64x2_t a, float64x2_t b, float64x2_t c)Floating-point multiply-add to accumulator

Description

A64 Instruction

RESULT[i] = a[i] + (b[i] * c[i]) for i = 0 to 1

Argument Preparation

a → N/A 

b → N/A 

c → N/A

Results

N/A → result

Supported architectures

A64

int16x8_t vmlal_s8 (int16x8_t a, int8x8_t b, int8x8_t c)Signed multiply-add long

Description

Signed Multiply-Add Long (vector). This instruction multiplies corresponding signed integer values in the lower or upper half of the vectors of the two source SIMD&FP registers, and accumulates the results with the vector elements of the destination SIMD&FP register. The destination vector elements are twice as long as the elements that are multiplied.

A64 Instruction

SMLAL Vd.8H,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8H 

b → Vn.8B 

c → Vm.8B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64
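
Because the destination elements are twice as wide as the multiplied elements, every 8-bit by 8-bit signed product fits exactly in a 16-bit lane; only the accumulation can wrap. The following scalar sketch models this for vmlal_s8. The function names are hypothetical, and the code illustrates the arithmetic rather than the intrinsic's implementation.

```c
#include <stdint.h>

/* Reinterpret a 16-bit pattern as a signed value without relying on
 * implementation-defined narrowing conversions. */
static int16_t bits_to_i16(uint16_t bits)
{
    return (int16_t)(bits < 0x8000 ? (int)bits : (int)bits - 0x10000);
}

/* Hypothetical scalar model of vmlal_s8: out[e] = a[e] + b[e] * c[e],
 * with b and c sign-extended and the sum kept to 16 bits. */
static void vmlal_s8_model(const int16_t a[8], const int8_t b[8],
                           const int8_t c[8], int16_t out[8])
{
    for (int e = 0; e < 8; e++) {
        int product = (int)b[e] * (int)c[e];   /* range [-16256, 16384] */
        uint16_t sum = (uint16_t)((uint16_t)a[e] + (uint16_t)product);
        out[e] = bits_to_i16(sum);
    }
}
```

For instance, (-128) * (-128) = 16384 is kept in full, whereas an 8-bit MLA lane would retain only the low 8 bits of that product.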

int32x4_t vmlal_s16 (int32x4_t a, int16x4_t b, int16x4_t c)Signed multiply-add long

Description

A64 Instruction

SMLAL Vd.4S,Vn.4H,Vm.4H

Argument Preparation

a → Vd.4S 

b → Vn.4H 

c → Vm.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vmlal_s32 (int64x2_t a, int32x2_t b, int32x2_t c)Signed multiply-add long

Description

A64 Instruction

SMLAL Vd.2D,Vn.2S,Vm.2S

Argument Preparation

a → Vd.2D 

b → Vn.2S 

c → Vm.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vmlal_u8 (uint16x8_t a, uint8x8_t b, uint8x8_t c)Unsigned multiply-add long

Description

Unsigned Multiply-Add Long (vector). This instruction multiplies the vector elements in the lower or upper half of the first source SIMD&FP register by the corresponding vector elements of the second source SIMD&FP register, and accumulates the results with the vector elements of the destination SIMD&FP register. The destination vector elements are twice as long as the elements that are multiplied.

A64 Instruction

UMLAL Vd.8H,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8H 

b → Vn.8B 

c → Vm.8B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64
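
The unsigned variant follows the same pattern with zero-extension instead of sign-extension. A minimal scalar sketch, using the hypothetical name vmlal_u8_model, might look like this:

```c
#include <stdint.h>

/* Hypothetical scalar model of vmlal_u8: out[e] = a[e] + b[e] * c[e],
 * with b and c zero-extended and the sum kept to 16 bits. */
static void vmlal_u8_model(const uint16_t a[8], const uint8_t b[8],
                           const uint8_t c[8], uint16_t out[8])
{
    for (int e = 0; e < 8; e++) {
        /* The operands promote to int, so the product (at most 65025)
         * and the sum are computed without overflow before truncation. */
        int product = (int)b[e] * (int)c[e];
        out[e] = (uint16_t)((int)a[e] + product);
    }
}
```

The largest product, 255 * 255 = 65025, still fits in a 16-bit lane; adding it to a non-zero accumulator can wrap modulo 65536.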

uint32x4_t vmlal_u16 (uint32x4_t a, uint16x4_t b, uint16x4_t c)Unsigned multiply-add long

Description

A64 Instruction

UMLAL Vd.4S,Vn.4H,Vm.4H

Argument Preparation

a → Vd.4S 

b → Vn.4H 

c → Vm.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vmlal_u32 (uint64x2_t a, uint32x2_t b, uint32x2_t c)Unsigned multiply-add long

Description

A64 Instruction

UMLAL Vd.2D,Vn.2S,Vm.2S

Argument Preparation

a → Vd.2D 

b → Vn.2S 

c → Vm.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vmlal_high_s8 (int16x8_t a, int8x16_t b, int8x16_t c)Signed multiply-add long

Description

A64 Instruction

SMLAL2 Vd.8H,Vn.16B,Vm.16B

Argument Preparation

a → Vd.8H 

b → Vn.16B 

c → Vm.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64
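
The _high form differs from vmlal_s8 only in which source elements are read: the Vn.16B and Vm.16B operands supply their upper eight elements. The sketch below models that selection in scalar C; the function names are hypothetical.

```c
#include <stdint.h>

/* Reinterpret a 16-bit pattern as a signed value without relying on
 * implementation-defined narrowing conversions. */
static int16_t high_bits_to_i16(uint16_t bits)
{
    return (int16_t)(bits < 0x8000 ? (int)bits : (int)bits - 0x10000);
}

/* Hypothetical scalar model of vmlal_high_s8: only elements 8..15 of
 * b and c take part; elements 0..7 are ignored. */
static void vmlal_high_s8_model(const int16_t a[8], const int8_t b[16],
                                const int8_t c[16], int16_t out[8])
{
    for (int e = 0; e < 8; e++) {
        int product = (int)b[8 + e] * (int)c[8 + e];
        uint16_t sum = (uint16_t)((uint16_t)a[e] + (uint16_t)product);
        out[e] = high_bits_to_i16(sum);
    }
}
```

Pairing vmlal_s8 on the low halves with vmlal_high_s8 on the full registers is a common way to process all sixteen 8-bit elements without separate extract steps.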

int32x4_t vmlal_high_s16 (int32x4_t a, int16x8_t b, int16x8_t c)Signed multiply-add long

Description

A64 Instruction

SMLAL2 Vd.4S,Vn.8H,Vm.8H

Argument Preparation

a → Vd.4S 

b → Vn.8H 

c → Vm.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

int64x2_t vmlal_high_s32 (int64x2_t a, int32x4_t b, int32x4_t c)Signed multiply-add long

Description

A64 Instruction

SMLAL2 Vd.2D,Vn.4S,Vm.4S

Argument Preparation

a → Vd.2D 

b → Vn.4S 

c → Vm.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

uint16x8_t vmlal_high_u8 (uint16x8_t a, uint8x16_t b, uint8x16_t c)Unsigned multiply-add long

Description

A64 Instruction

UMLAL2 Vd.8H,Vn.16B,Vm.16B

Argument Preparation

a → Vd.8H 

b → Vn.16B 

c → Vm.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

uint32x4_t vmlal_high_u16 (uint32x4_t a, uint16x8_t b, uint16x8_t c)Unsigned multiply-add long

Description

A64 Instruction

UMLAL2 Vd.4S,Vn.8H,Vm.8H

Argument Preparation

a → Vd.4S 

b → Vn.8H 

c → Vm.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

uint64x2_t vmlal_high_u32 (uint64x2_t a, uint32x4_t b, uint32x4_t c)Unsigned multiply-add long

Description

A64 Instruction

UMLAL2 Vd.2D,Vn.4S,Vm.4S

Argument Preparation

a → Vd.2D 

b → Vn.4S 

c → Vm.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

int8x8_t vmls_s8 (int8x8_t a, int8x8_t b, int8x8_t c)Multiply-subtract from accumulator

Description

Multiply-Subtract from accumulator (vector). This instruction multiplies corresponding elements in the vectors of the two source SIMD&FP registers, and subtracts the results from the vector elements of the destination SIMD&FP register.

A64 Instruction

MLS Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8B 

b → Vn.8B 

c → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64
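
Multiply-subtract takes the sub_op branch of the shared pseudocode: the truncated product is subtracted from the accumulator, and the result again wraps to the element width. A scalar sketch for the 8 x 8-bit case, with hypothetical function names, is given below.

```c
#include <stdint.h>

/* Reinterpret an 8-bit pattern as a signed value without relying on
 * implementation-defined narrowing conversions. */
static int8_t mls_bits_to_i8(uint8_t bits)
{
    return (int8_t)(bits < 0x80 ? (int)bits : (int)bits - 0x100);
}

/* Hypothetical scalar model of vmls_s8: out[e] = a[e] - b[e] * c[e],
 * with the product and the difference both truncated to 8 bits. */
static void vmls_s8_model(const int8_t a[8], const int8_t b[8],
                          const int8_t c[8], int8_t out[8])
{
    for (int e = 0; e < 8; e++) {
        uint8_t product = (uint8_t)((uint8_t)b[e] * (uint8_t)c[e]);
        uint8_t diff = (uint8_t)((uint8_t)a[e] - product);
        out[e] = mls_bits_to_i8(diff);
    }
}
```

For example, -100 - 10 * 10 is below the signed 8-bit range and wraps to 56.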

int8x16_t vmlsq_s8 (int8x16_t a, int8x16_t b, int8x16_t c)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vd.16B 

b → Vn.16B 

c → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vmls_s16 (int16x4_t a, int16x4_t b, int16x4_t c)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vd.4H 

b → Vn.4H 

c → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vmlsq_s16 (int16x8_t a, int16x8_t b, int16x8_t c)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vd.8H 

b → Vn.8H 

c → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vmls_s32 (int32x2_t a, int32x2_t b, int32x2_t c)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vd.2S 

b → Vn.2S 

c → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vmlsq_s32 (int32x4_t a, int32x4_t b, int32x4_t c)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vd.4S 

b → Vn.4S 

c → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vmls_u8 (uint8x8_t a, uint8x8_t b, uint8x8_t c)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8B 

b → Vn.8B 

c → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vmlsq_u8 (uint8x16_t a, uint8x16_t b, uint8x16_t c)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vd.16B 

b → Vn.16B 

c → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vmls_u16 (uint16x4_t a, uint16x4_t b, uint16x4_t c)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vd.4H 

b → Vn.4H 

c → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vmlsq_u16 (uint16x8_t a, uint16x8_t b, uint16x8_t c)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vd.8H 

b → Vn.8H 

c → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vmls_u32 (uint32x2_t a, uint32x2_t b, uint32x2_t c)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vd.2S 

b → Vn.2S 

c → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vmlsq_u32 (uint32x4_t a, uint32x4_t b, uint32x4_t c)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vd.4S 

b → Vn.4S 

c → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vmls_f32 (float32x2_t a, float32x2_t b, float32x2_t c)Multiply-subtract from accumulator

Description

A64 Instruction

RESULT[i] = a[i] - (b[i] * c[i]) for i = 0 to 1

Argument Preparation

a → N/A 

b → N/A 

c → N/A

Results

N/A → result

Supported architectures

v7/A32/A64
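
The per-element formula above can be written out in C as below. This is only a sketch: the helper name mls_f32x2 and the demo function are hypothetical, the scalar path is used when NEON is not available, and the inputs are chosen so that every intermediate value is exact and rounding behaviour does not affect the result.

```c
#include <assert.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper, not part of arm_neon.h:
 * out[i] = a[i] - (b[i] * c[i]) for i = 0..1. */
void mls_f32x2(const float a[2], const float b[2], const float c[2],
               float out[2])
{
#if defined(__ARM_NEON)
    vst1_f32(out, vmls_f32(vld1_f32(a), vld1_f32(b), vld1_f32(c)));
#else
    /* Portable model of the formula given for this intrinsic. */
    for (int i = 0; i < 2; i++) {
        out[i] = a[i] - (b[i] * c[i]);
    }
#endif
}

/* Usage check with exactly representable values. */
void demo_mls_f32x2(void)
{
    const float a[2] = {10.0f, 1.0f};
    const float b[2] = {2.0f, 0.5f};
    const float c[2] = {3.0f, 4.0f};
    float out[2];
    mls_f32x2(a, b, c, out);
    assert(out[0] == 4.0f);   /* 10 - 2 * 3 */
    assert(out[1] == -1.0f);  /* 1 - 0.5 * 4 */
}
```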

float32x4_t vmlsq_f32 (float32x4_t a, float32x4_t b, float32x4_t c)Multiply-subtract from accumulator

Description

A64 Instruction

RESULT[i] = a[i] - (b[i] * c[i]) for i = 0 to 3

Argument Preparation

a → N/A 

b → N/A 

c → N/A

Results

N/A → result

Supported architectures

v7/A32/A64

float64x1_t vmls_f64 (float64x1_t a, float64x1_t b, float64x1_t c)Multiply-subtract from accumulator

Description

A64 Instruction

RESULT[i] = a[i] - (b[i] * c[i]) for i = 0

Argument Preparation

a → N/A 

b → N/A 

c → N/A

Results

N/A → result

Supported architectures

A64

float64x2_t vmlsq_f64 (float64x2_t a, float64x2_t b, float64x2_t c)Multiply-subtract from accumulator

Description

A64 Instruction

RESULT[i] = a[i] - (b[i] * c[i]) for i = 0 to 1

Argument Preparation

a → N/A 

b → N/A 

c → N/A

Results

N/A → result

Supported architectures

A64

int16x8_t vmlsl_s8 (int16x8_t a, int8x8_t b, int8x8_t c)Signed multiply-subtract long

Description

Signed Multiply-Subtract Long (vector). This instruction multiplies corresponding signed integer values in the lower or upper half of the vectors of the two source SIMD&FP registers, and subtracts the results from the vector elements of the destination SIMD&FP register. The destination vector elements are twice as long as the elements that are multiplied.

A64 Instruction

SMLSL Vd.8H,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8H 

b → Vn.8B 

c → Vm.8B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64
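
To illustrate the widening described above, here is a small C sketch. The helper name mlsl_s8 and the demo function are hypothetical. When NEON is not available a scalar loop stands in for the intrinsic, and that path assumes two's-complement conversion to int16_t.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper, not part of arm_neon.h:
 * out[e] = acc[e] - (int16_t)b[e] * (int16_t)c[e] for e = 0..7. */
void mlsl_s8(const int16_t acc[8], const int8_t b[8], const int8_t c[8],
             int16_t out[8])
{
#if defined(__ARM_NEON)
    vst1q_s16(out, vmlsl_s8(vld1q_s16(acc), vld1_s8(b), vld1_s8(c)));
#else
    for (int e = 0; e < 8; e++) {
        int product = (int)b[e] * (int)c[e];  /* sign-extended, fits 16 bits */
        /* The subtraction wraps modulo 2^16, as in the pseudocode. */
        out[e] = (int16_t)(uint16_t)(acc[e] - product);
    }
#endif
}

/* Usage check: (-128) * (-128) = 16384 survives because the destination
 * elements are twice as wide as the multiplied elements. */
void demo_mlsl_s8(void)
{
    const int16_t acc[8] = {0, 1000, 0, 0, 0, 0, 0, 0};
    const int8_t b[8]    = {-128, 127, 0, 0, 0, 0, 0, 0};
    const int8_t c[8]    = {-128, -128, 0, 0, 0, 0, 0, 0};
    int16_t out[8];
    mlsl_s8(acc, b, c, out);
    assert(out[0] == -16384);  /* 0 - 16384 */
    assert(out[1] == 17256);   /* 1000 - (-16256) */
}
```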

int32x4_t vmlsl_s16 (int32x4_t a, int16x4_t b, int16x4_t c)Signed multiply-subtract long

Description

A64 Instruction

SMLSL Vd.4S,Vn.4H,Vm.4H

Argument Preparation

a → Vd.4S 

b → Vn.4H 

c → Vm.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vmlsl_s32 (int64x2_t a, int32x2_t b, int32x2_t c)Signed multiply-subtract long

Description

A64 Instruction

SMLSL Vd.2D,Vn.2S,Vm.2S

Argument Preparation

a → Vd.2D 

b → Vn.2S 

c → Vm.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vmlsl_u8 (uint16x8_t a, uint8x8_t b, uint8x8_t c)Unsigned multiply-subtract long

Description

Unsigned Multiply-Subtract Long (vector). This instruction multiplies corresponding vector elements in the lower or upper half of the two source SIMD&FP registers, and subtracts the results from the vector elements of the destination SIMD&FP register. The destination vector elements are twice as long as the elements that are multiplied. All the values in this instruction are unsigned integer values.

A64 Instruction

UMLSL Vd.8H,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8H 

b → Vn.8B 

c → Vm.8B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64
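
The unsigned variant can be sketched the same way. The helper name mlsl_u8 and the demo function are hypothetical; the scalar path, used when NEON is not available, models the pseudocode with all values treated as unsigned.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper, not part of arm_neon.h:
 * out[e] = acc[e] - (uint16_t)b[e] * (uint16_t)c[e] (mod 2^16), e = 0..7. */
void mlsl_u8(const uint16_t acc[8], const uint8_t b[8], const uint8_t c[8],
             uint16_t out[8])
{
#if defined(__ARM_NEON)
    vst1q_u16(out, vmlsl_u8(vld1q_u16(acc), vld1_u8(b), vld1_u8(c)));
#else
    for (int e = 0; e < 8; e++) {
        unsigned product = (unsigned)b[e] * (unsigned)c[e];  /* <= 65025 */
        out[e] = (uint16_t)(acc[e] - product);  /* wraps modulo 2^16 */
    }
#endif
}

/* Usage check: lane 0 uses the largest possible product, lane 1 wraps. */
void demo_mlsl_u8(void)
{
    const uint16_t acc[8] = {65535, 0, 0, 0, 0, 0, 0, 0};
    const uint8_t b[8]    = {255, 1, 0, 0, 0, 0, 0, 0};
    const uint8_t c[8]    = {255, 1, 0, 0, 0, 0, 0, 0};
    uint16_t out[8];
    mlsl_u8(acc, b, c, out);
    assert(out[0] == 510);    /* 65535 - 65025 */
    assert(out[1] == 65535);  /* 0 - 1 wraps modulo 2^16 */
}
```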

uint32x4_t vmlsl_u16 (uint32x4_t a, uint16x4_t b, uint16x4_t c)Unsigned multiply-subtract long

Description

A64 Instruction

UMLSL Vd.4S,Vn.4H,Vm.4H

Argument Preparation

a → Vd.4S 

b → Vn.4H 

c → Vm.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vmlsl_u32 (uint64x2_t a, uint32x2_t b, uint32x2_t c)Unsigned multiply-subtract long

Description

A64 Instruction

UMLSL Vd.2D,Vn.2S,Vm.2S

Argument Preparation

a → Vd.2D 

b → Vn.2S 

c → Vm.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vmlsl_high_s8 (int16x8_t a, int8x16_t b, int8x16_t c)Signed multiply-subtract long

Description

A64 Instruction

SMLSL2 Vd.8H,Vn.16B,Vm.16B

Argument Preparation

a → Vd.8H 

b → Vn.16B 

c → Vm.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64
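
One plausible use of the "high" form is to pair it with the non-high intrinsic so that a full 16-byte vector is processed in two steps. The sketch below assumes an AArch64 toolchain for the intrinsic path; the helper name mlsl_s8x16 and the demo function are hypothetical, and the scalar fallback assumes two's-complement conversion to int16_t.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper, not part of arm_neon.h:
 * out[e] = acc[e] - (int16_t)b[e] * (int16_t)c[e] for e = 0..15. */
void mlsl_s8x16(const int16_t acc[16], const int8_t b[16], const int8_t c[16],
                int16_t out[16])
{
#if defined(__aarch64__)
    int8x16_t vb = vld1q_s8(b);
    int8x16_t vc = vld1q_s8(c);
    /* Elements 0..7 come from the low halves, elements 8..15 from SMLSL2. */
    vst1q_s16(out, vmlsl_s8(vld1q_s16(acc), vget_low_s8(vb), vget_low_s8(vc)));
    vst1q_s16(out + 8, vmlsl_high_s8(vld1q_s16(acc + 8), vb, vc));
#else
    for (int e = 0; e < 16; e++) {
        out[e] = (int16_t)(uint16_t)(acc[e] - (int)b[e] * (int)c[e]);
    }
#endif
}

/* Usage check with b[e] = e - 8, c[e] = 2 and acc[e] = 100. */
void demo_mlsl_s8x16(void)
{
    int16_t acc[16], out[16];
    int8_t b[16], c[16];
    for (int e = 0; e < 16; e++) {
        acc[e] = 100;
        b[e] = (int8_t)(e - 8);
        c[e] = 2;
    }
    mlsl_s8x16(acc, b, c, out);
    assert(out[0] == 116);   /* 100 - (-8 * 2) */
    assert(out[8] == 100);   /* 100 - (0 * 2), first upper-half element */
    assert(out[15] == 86);   /* 100 - (7 * 2) */
}
```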

int32x4_t vmlsl_high_s16 (int32x4_t a, int16x8_t b, int16x8_t c)Signed multiply-subtract long

Description

A64 Instruction

SMLSL2 Vd.4S,Vn.8H,Vm.8H

Argument Preparation

a → Vd.4S 

b → Vn.8H 

c → Vm.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

int64x2_t vmlsl_high_s32 (int64x2_t a, int32x4_t b, int32x4_t c)Signed multiply-subtract long

Description

A64 Instruction

SMLSL2 Vd.2D,Vn.4S,Vm.4S

Argument Preparation

a → Vd.2D 

b → Vn.4S 

c → Vm.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

uint16x8_t vmlsl_high_u8 (uint16x8_t a, uint8x16_t b, uint8x16_t c)Unsigned multiply-subtract long

Description

A64 Instruction

UMLSL2 Vd.8H,Vn.16B,Vm.16B

Argument Preparation

a → Vd.8H 

b → Vn.16B 

c → Vm.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

uint32x4_t vmlsl_high_u16 (uint32x4_t a, uint16x8_t b, uint16x8_t c)Unsigned multiply-subtract long

Description

A64 Instruction

UMLSL2 Vd.4S,Vn.8H,Vm.8H

Argument Preparation

a → Vd.4S 

b → Vn.8H 

c → Vm.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

uint64x2_t vmlsl_high_u32 (uint64x2_t a, uint32x4_t b, uint32x4_t c)Unsigned multiply-subtract long

Description

A64 Instruction

UMLSL2 Vd.2D,Vn.4S,Vm.4S

Argument Preparation

a → Vd.2D 

b → Vn.4S 

c → Vm.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

float32x2_t vfma_f32 (float32x2_t a, float32x2_t b, float32x2_t c)Floating-point fused multiply-add to accumulator

Description

Floating-point fused Multiply-Add to accumulator (vector). This instruction multiplies corresponding floating-point values in the vectors in the two source SIMD&FP registers, adds the product to the corresponding vector element of the destination SIMD&FP register, and writes the result to the destination SIMD&FP register.

A64 Instruction

FMLA Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vd.2S 

b → Vn.2S 

c → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64

float32x4_t vfmaq_f32 (float32x4_t a, float32x4_t b, float32x4_t c)Floating-point fused multiply-add to accumulator

Description

A64 Instruction

FMLA Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vd.4S 

b → Vn.4S 

c → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64

float64x1_t vfma_f64 (float64x1_t a, float64x1_t b, float64x1_t c)Floating-point fused multiply-add

Description

Floating-point fused Multiply-Add (scalar). This instruction multiplies the values of the first two SIMD&FP source registers, adds the product to the value of the third SIMD&FP source register, and writes the result to the SIMD&FP destination register.

A64 Instruction

FMADD Dd,Dn,Dm,Da

Argument Preparation

a → Da 

b → Dn 

c → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) result;
bits(datasize) operanda = V[a];
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];

result = FPMulAdd(operanda, operand1, operand2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vfmaq_f64 (float64x2_t a, float64x2_t b, float64x2_t c)Floating-point fused multiply-add to accumulator

Description

A64 Instruction

FMLA Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vd.2D 

b → Vn.2D 

c → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32x2_t vfma_lane_f32 (float32x2_t a, float32x2_t b, float32x2_t v, const int lane)Floating-point fused multiply-add to accumulator

Description

A64 Instruction

FMLA Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2S 

b → Vn.2S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32x4_t vfmaq_lane_f32 (float32x4_t a, float32x4_t b, float32x2_t v, const int lane)Floating-point fused multiply-add to accumulator

Description

A64 Instruction

FMLA Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x1_t vfma_lane_f64 (float64x1_t a, float64x1_t b, float64x1_t v, const int lane)Floating-point fused multiply-add to accumulator

Description

A64 Instruction

FMLA Dd,Dn,Vm.D[lane]

Argument Preparation

a → Dd 

b → Dn 

v → Vm.1D 

0 <= lane <= 0

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vfmaq_lane_f64 (float64x2_t a, float64x2_t b, float64x1_t v, const int lane)Floating-point fused multiply-add to accumulator

Description

A64 Instruction

FMLA Vd.2D,Vn.2D,Vm.D[lane]

Argument Preparation

a → Vd.2D 

b → Vn.2D 

v → Vm.1D 

0 <= lane <= 0

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32_t vfmas_lane_f32 (float32_t a, float32_t b, float32x2_t v, const int lane)Floating-point fused multiply-add to accumulator

Description

A64 Instruction

FMLA Sd,Sn,Vm.S[lane]

Argument Preparation

a → Sd 

b → Sn 

v → Vm.2S 

0 <= lane <= 1

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64_t vfmad_lane_f64 (float64_t a, float64_t b, float64x1_t v, const int lane)Floating-point fused multiply-add to accumulator

Description

A64 Instruction

FMLA Dd,Dn,Vm.D[lane]

Argument Preparation

a → Dd 

b → Dn 

v → Vm.1D 

0 <= lane <= 0

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32x2_t vfma_laneq_f32 (float32x2_t a, float32x2_t b, float32x4_t v, const int lane)Floating-point fused multiply-add to accumulator

Description

A64 Instruction

FMLA Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2S 

b → Vn.2S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32x4_t vfmaq_laneq_f32 (float32x4_t a, float32x4_t b, float32x4_t v, const int lane)Floating-point fused multiply-add to accumulator

Description

A64 Instruction

FMLA Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x1_t vfma_laneq_f64 (float64x1_t a, float64x1_t b, float64x2_t v, const int lane)Floating-point fused multiply-add to accumulator

Description

A64 Instruction

FMLA Dd,Dn,Vm.D[lane]

Argument Preparation

a → Dd 

b → Dn 

v → Vm.2D 

0 <= lane <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vfmaq_laneq_f64 (float64x2_t a, float64x2_t b, float64x2_t v, const int lane)Floating-point fused multiply-add to accumulator

Description

A64 Instruction

FMLA Vd.2D,Vn.2D,Vm.D[lane]

Argument Preparation

a → Vd.2D 

b → Vn.2D 

v → Vm.2D 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32_t vfmas_laneq_f32 (float32_t a, float32_t b, float32x4_t v, const int lane)Floating-point fused multiply-add to accumulator

Description

A64 Instruction

FMLA Sd,Sn,Vm.S[lane]

Argument Preparation

a → Sd 

b → Sn 

v → Vm.4S 

0 <= lane <= 3

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64_t vfmad_laneq_f64 (float64_t a, float64_t b, float64x2_t v, const int lane)Floating-point fused multiply-add to accumulator

Description

A64 Instruction

FMLA Dd,Dn,Vm.D[lane]

Argument Preparation

a → Dd 

b → Dn 

v → Vm.2D 

0 <= lane <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32x2_t vfms_f32 (float32x2_t a, float32x2_t b, float32x2_t c)Floating-point fused multiply-subtract from accumulator

Description

Floating-point fused Multiply-Subtract from accumulator (vector). This instruction multiplies corresponding floating-point values in the vectors in the two source SIMD&FP registers, negates the product, adds the result to the corresponding vector element of the destination SIMD&FP register, and writes the result to the destination SIMD&FP register.

A64 Instruction

FMLS Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vd.2S 

b → Vn.2S 

c → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64
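
As a reading aid for the description and pseudocode above, the following C sketch models FMLS (vector) in scalar code: the first multiplicand is negated, so each result element is a[e] - b[e] * c[e]. The helper name fms_f32_model and the explicit element count are hypothetical and are not part of arm_neon.h. As an assumption, double arithmetic stands in for the fused operation; the float product is exact in double, but the final narrowing to float can differ from FMLS in rare double-rounding cases.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of arm_neon.h): scalar model of
   vfms_f32(a, b, c) for `elements` lanes, result[e] = a[e] - b[e] * c[e]. */
static void fms_f32_model(float *result, const float *a, const float *b,
                          const float *c, size_t elements)
{
    assert(result != NULL && a != NULL && b != NULL && c != NULL);
    for (size_t e = 0; e < elements; ++e) {
        double neg_b = -(double)b[e];            /* FPNeg(element1) */
        /* Accumulator plus the negated product, narrowed to float once. */
        result[e] = (float)((double)a[e] + neg_b * (double)c[e]);
    }
}
```

Note the operand order: the accumulator a is the first argument, and the product b * c is subtracted from it, not the other way round.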

float32x4_t vfmsq_f32 (float32x4_t a, float32x4_t b, float32x4_t c)Floating-point fused multiply-subtract from accumulator

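Description

Floating-point fused Multiply-Subtract from accumulator (vector). This instruction multiplies corresponding floating-point values in the vectors in the two source SIMD&FP registers, negates the product, adds the result to the corresponding vector element of the destination SIMD&FP register, and writes the result to the destination SIMD&FP register.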

A64 Instruction

FMLS Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vd.4S 

b → Vn.4S 

c → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64

float64x1_t vfms_f64 (float64x1_t a, float64x1_t b, float64x1_t c)Floating-point fused multiply-subtract

Description

Floating-point Fused Multiply-Subtract (scalar). This instruction multiplies the values of the first two SIMD&FP source registers, negates the product, adds that to the value of the third SIMD&FP source register, and writes the result to the SIMD&FP destination register.

A64 Instruction

FMSUB Dd,Dn,Dm,Da

Argument Preparation

a → Da 

b → Dn 

c → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) result;
bits(datasize) operanda = V[a];
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];

operand1 = FPNeg(operand1);
result = FPMulAdd(operanda, operand1, operand2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vfmsq_f64 (float64x2_t a, float64x2_t b, float64x2_t c)Floating-point fused multiply-subtract from accumulator

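Description

Floating-point fused Multiply-Subtract from accumulator (vector). This instruction multiplies corresponding floating-point values in the vectors in the two source SIMD&FP registers, negates the product, adds the result to the corresponding vector element of the destination SIMD&FP register, and writes the result to the destination SIMD&FP register.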

A64 Instruction

FMLS Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vd.2D 

b → Vn.2D 

c → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32x2_t vfms_lane_f32 (float32x2_t a, float32x2_t b, float32x2_t v, const int lane)Floating-point fused multiply-subtract from accumulator

Description

A64 Instruction

FMLS Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2S 

b → Vn.2S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32x4_t vfmsq_lane_f32 (float32x4_t a, float32x4_t b, float32x2_t v, const int lane)Floating-point fused multiply-subtract from accumulator

Description

A64 Instruction

FMLS Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x1_t vfms_lane_f64 (float64x1_t a, float64x1_t b, float64x1_t v, const int lane)Floating-point fused multiply-subtract from accumulator

Description

A64 Instruction

FMLS Dd,Dn,Vm.D[lane]

Argument Preparation

a → Dd 

b → Dn 

v → Vm.1D 

0 <= lane <= 0

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vfmsq_lane_f64 (float64x2_t a, float64x2_t b, float64x1_t v, const int lane)Floating-point fused multiply-subtract from accumulator

Description

A64 Instruction

FMLS Vd.2D,Vn.2D,Vm.D[lane]

Argument Preparation

a → Vd.2D 

b → Vn.2D 

v → Vm.1D 

0 <= lane <= 0

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32_t vfmss_lane_f32 (float32_t a, float32_t b, float32x2_t v, const int lane)Floating-point fused multiply-subtract from accumulator

Description

A64 Instruction

FMLS Sd,Sn,Vm.S[lane]

Argument Preparation

a → Sd 

b → Sn 

v → Vm.2S 

0 <= lane <= 1

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64_t vfmsd_lane_f64 (float64_t a, float64_t b, float64x1_t v, const int lane)Floating-point fused multiply-subtract from accumulator

Description

A64 Instruction

FMLS Dd,Dn,Vm.D[lane]

Argument Preparation

a → Dd 

b → Dn 

v → Vm.1D 

0 <= lane <= 0

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32x2_t vfms_laneq_f32 (float32x2_t a, float32x2_t b, float32x4_t v, const int lane)Floating-point fused multiply-subtract from accumulator

Description

A64 Instruction

FMLS Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2S 

b → Vn.2S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32x4_t vfmsq_laneq_f32 (float32x4_t a, float32x4_t b, float32x4_t v, const int lane)Floating-point fused multiply-subtract from accumulator

Description

A64 Instruction

FMLS Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x1_t vfms_laneq_f64 (float64x1_t a, float64x1_t b, float64x2_t v, const int lane)Floating-point fused multiply-subtract from accumulator

Description

A64 Instruction

FMLS Dd,Dn,Vm.D[lane]

Argument Preparation

a → Dd 

b → Dn 

v → Vm.2D 

0 <= lane <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vfmsq_laneq_f64 (float64x2_t a, float64x2_t b, float64x2_t v, const int lane)Floating-point fused multiply-subtract from accumulator

Description

A64 Instruction

FMLS Vd.2D,Vn.2D,Vm.D[lane]

Argument Preparation

a → Vd.2D 

b → Vn.2D 

v → Vm.2D 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32_t vfmss_laneq_f32 (float32_t a, float32_t b, float32x4_t v, const int lane)Floating-point fused multiply-subtract from accumulator

Description

A64 Instruction

FMLS Sd,Sn,Vm.S[lane]

Argument Preparation

a → Sd 

b → Sn 

v → Vm.4S 

0 <= lane <= 3

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64_t vfmsd_laneq_f64 (float64_t a, float64_t b, float64x2_t v, const int lane)Floating-point fused multiply-subtract from accumulator

Description

A64 Instruction

FMLS Dd,Dn,Vm.D[lane]

Argument Preparation

a → Dd 

b → Dn 

v → Vm.2D 

0 <= lane <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

int16x4_t vqdmulh_s16 (int16x4_t a, int16x4_t b)Signed saturating doubling multiply returning high half

Description

Signed saturating Doubling Multiply returning High half. This instruction multiplies the values of corresponding elements of the two source SIMD&FP registers, doubles the results, places the most significant half of the final results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

SQDMULH Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
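
To make the pseudocode above concrete, the following C sketch models a single 16-bit lane of SQDMULH with rounding disabled. The helper name sqdmulh_s16_model and the qc out-parameter (standing in for FPSR.QC) are hypothetical and are not part of arm_neon.h; this is a scalar illustration under those assumptions, not the intrinsic itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of one 16-bit SQDMULH lane (not an arm_neon.h
   API). *qc plays the role of FPSR.QC: it is set to 1 on saturation and
   is never cleared here. */
static int16_t sqdmulh_s16_model(int16_t a, int16_t b, int *qc)
{
    assert(qc != NULL);
    int64_t product = 2 * (int64_t)a * (int64_t)b;   /* doubled product */
    /* Floor division by 2^16 matches the pseudocode's product >> esize
       without relying on how C's >> treats negative values. */
    int64_t high = (product >= 0) ? product / 65536
                                  : -((-product + 65535) / 65536);
    if (high > INT16_MAX) { *qc = 1; return INT16_MAX; }
    if (high < INT16_MIN) { *qc = 1; return INT16_MIN; }
    return (int16_t)high;
}
```

Read as Q15 fixed point, 16384 * 16384 (0.5 * 0.5) gives 8192 (0.25). The only input pair that saturates is INT16_MIN * INT16_MIN, which returns INT16_MAX and sets the flag.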

int16x8_t vqdmulhq_s16 (int16x8_t a, int16x8_t b)Signed saturating doubling multiply returning high half

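Description

Signed saturating Doubling Multiply returning High half. This instruction multiplies the values of corresponding elements of the two source SIMD&FP registers, doubles the results, places the most significant half of the final results into a vector, and writes the vector to the destination SIMD&FP register.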

A64 Instruction

SQDMULH Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vqdmulh_s32 (int32x2_t a, int32x2_t b)Signed saturating doubling multiply returning high half

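Description

Signed saturating Doubling Multiply returning High half. This instruction multiplies the values of corresponding elements of the two source SIMD&FP registers, doubles the results, places the most significant half of the final results into a vector, and writes the vector to the destination SIMD&FP register.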

A64 Instruction

SQDMULH Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vqdmulhq_s32 (int32x4_t a, int32x4_t b)Signed saturating doubling multiply returning high half

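Description

Signed saturating Doubling Multiply returning High half. This instruction multiplies the values of corresponding elements of the two source SIMD&FP registers, doubles the results, places the most significant half of the final results into a vector, and writes the vector to the destination SIMD&FP register.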

A64 Instruction

SQDMULH Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int16_t vqdmulhh_s16 (int16_t a, int16_t b)Signed saturating doubling multiply returning high half

Description

A64 Instruction

SQDMULH Hd,Hn,Hm

Argument Preparation

a → Hn 

b → Hm

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32_t vqdmulhs_s32 (int32_t a, int32_t b)Signed saturating doubling multiply returning high half

Description

A64 Instruction

SQDMULH Sd,Sn,Sm

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int16x4_t vqrdmulh_s16 (int16x4_t a, int16x4_t b)Signed saturating rounding doubling multiply returning high half

Description

Signed saturating Rounding Doubling Multiply returning High half. This instruction multiplies the values of corresponding elements of the two source SIMD&FP registers, doubles the results, places the most significant half of the final results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

SQRDMULH Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
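
The pseudocode above differs from the non-rounding form only in round_const. The following C sketch models one 16-bit lane with a rounding switch so the two behaviours can be compared side by side. The helper name sqdmulh_model, its rounding parameter, and the qc out-parameter (standing in for FPSR.QC) are hypothetical and are not part of arm_neon.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of one 16-bit lane (not an arm_neon.h API).
   rounding != 0 models SQRDMULH, rounding == 0 models SQDMULH.
   *qc is set to 1 on saturation and is never cleared here. */
static int16_t sqdmulh_model(int16_t a, int16_t b, int rounding, int *qc)
{
    assert(qc != NULL);
    /* round_const = 1 << (esize - 1) with esize = 16 */
    int64_t round_const = rounding ? ((int64_t)1 << 15) : 0;
    int64_t product = 2 * (int64_t)a * (int64_t)b + round_const;
    /* Floor division by 2^16, equivalent to the pseudocode's >> esize. */
    int64_t high = (product >= 0) ? product / 65536
                                  : -((-product + 65535) / 65536);
    if (high > INT16_MAX) { *qc = 1; return INT16_MAX; }
    if (high < INT16_MIN) { *qc = 1; return INT16_MIN; }
    return (int16_t)high;
}
```

For a = 1 and b = 16384 the doubled product is exactly half of 2^16, so the rounding form returns 1 where the non-rounding form returns 0. For a = -1 the results are 0 and -1 respectively.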

int16x8_t vqrdmulhq_s16 (int16x8_t a, int16x8_t b)Signed saturating rounding doubling multiply returning high half

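Description

Signed saturating Rounding Doubling Multiply returning High half. This instruction multiplies the values of corresponding elements of the two source SIMD&FP registers, doubles the results, places the most significant half of the final results into a vector, and writes the vector to the destination SIMD&FP register.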

A64 Instruction

SQRDMULH Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vqrdmulh_s32 (int32x2_t a, int32x2_t b)Signed saturating rounding doubling multiply returning high half

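Description

Signed saturating Rounding Doubling Multiply returning High half. This instruction multiplies the values of corresponding elements of the two source SIMD&FP registers, doubles the results, places the most significant half of the final results into a vector, and writes the vector to the destination SIMD&FP register.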

A64 Instruction

SQRDMULH Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vqrdmulhq_s32 (int32x4_t a, int32x4_t b)Signed saturating rounding doubling multiply returning high half

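Description

Signed saturating Rounding Doubling Multiply returning High half. This instruction multiplies the values of corresponding elements of the two source SIMD&FP registers, doubles the results, places the most significant half of the final results into a vector, and writes the vector to the destination SIMD&FP register.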

A64 Instruction

SQRDMULH Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int16_t vqrdmulhh_s16 (int16_t a, int16_t b)Signed saturating rounding doubling multiply returning high half

Description

A64 Instruction

SQRDMULH Hd,Hn,Hm

Argument Preparation

a → Hn 

b → Hm

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32_t vqrdmulhs_s32 (int32_t a, int32_t b)Signed saturating rounding doubling multiply returning high half

Description

A64 Instruction

SQRDMULH Sd,Sn,Sm

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32x4_t vqdmlal_s16 (int32x4_t a, int16x4_t b, int16x4_t c)Signed saturating doubling multiply-add long

Description

Signed saturating Doubling Multiply-Add Long. This instruction multiplies corresponding signed integer values in the lower or upper half of the vectors of the two source SIMD&FP registers, doubles the results, and accumulates the final results with the vector elements of the destination SIMD&FP register. The destination vector elements are twice as long as the elements that are multiplied.

A64 Instruction

SQDMLAL Vd.4S,Vn.4H,Vm.4H

Argument Preparation

a → Vd.4S 

b → Vn.4H 

c → Vm.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
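
The pseudocode above saturates twice: once on the doubled product (sat1) and once on the accumulation (sat2). The following C sketch models one lane of the 16-bit form to show that two-stage behaviour. The helper names signed_sat_s32 and sqdmlal_s16_model, and the qc out-parameter standing in for FPSR.QC, are hypothetical and are not part of arm_neon.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for SignedSatQ(x, 32); sets *qc on saturation. */
static int32_t signed_sat_s32(int64_t x, int *qc)
{
    if (x > INT32_MAX) { *qc = 1; return INT32_MAX; }
    if (x < INT32_MIN) { *qc = 1; return INT32_MIN; }
    return (int32_t)x;
}

/* Hypothetical scalar model of one SQDMLAL lane with 16-bit sources
   (not an arm_neon.h API). */
static int32_t sqdmlal_s16_model(int32_t acc, int16_t b, int16_t c, int *qc)
{
    assert(qc != NULL);
    /* First saturation: the doubled product, clamped to 32 bits (sat1). */
    int32_t product = signed_sat_s32(2 * (int64_t)b * (int64_t)c, qc);
    /* Second saturation: accumulator plus clamped product (sat2). */
    return signed_sat_s32((int64_t)acc + product, qc);
}
```

Because the product is clamped before the addition, acc = -1 with b = c = INT16_MIN yields INT32_MAX - 1 and sets the flag, even though the unclamped sum 2^31 - 1 would fit in 32 bits.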

int64x2_t vqdmlal_s32 (int64x2_t a, int32x2_t b, int32x2_t c)Signed saturating doubling multiply-add long

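Description

Signed saturating Doubling Multiply-Add Long. This instruction multiplies corresponding signed integer values in the lower or upper half of the vectors of the two source SIMD&FP registers, doubles the results, and accumulates the final results with the vector elements of the destination SIMD&FP register. The destination vector elements are twice as long as the elements that are multiplied.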

A64 Instruction

SQDMLAL Vd.2D,Vn.2S,Vm.2S

Argument Preparation

a → Vd.2D 

b → Vn.2S 

c → Vm.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32_t vqdmlalh_s16 (int32_t a, int16_t b, int16_t c)Signed saturating doubling multiply-add long

Description

A64 Instruction

SQDMLAL Sd,Hn,Hm

Argument Preparation

a → Sd 

b → Hn 

c → Hm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64_t vqdmlals_s32 (int64_t a, int32_t b, int32_t c)Signed saturating doubling multiply-add long

Description

A64 Instruction

SQDMLAL Dd,Sn,Sm

Argument Preparation

a → Dd 

b → Sn 

c → Sm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32x4_t vqdmlal_high_s16 (int32x4_t a, int16x8_t b, int16x8_t c)Signed saturating doubling multiply-add long

Description

A64 Instruction

SQDMLAL2 Vd.4S,Vn.8H,Vm.8H

Argument Preparation

a → Vd.4S 

b → Vn.8H 

c → Vm.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64
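
In the pseudocode above, Vpart[n, part] selects which half of each source register is read; for SQDMLAL2 that is the upper half, so this intrinsic uses elements 4 to 7 of b and c. The following C sketch models that selection over plain arrays. The helper names signed_sat_s32 and sqdmlal_high_s16_model, and the qc out-parameter standing in for FPSR.QC, are hypothetical and are not part of arm_neon.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for SignedSatQ(x, 32); sets *qc on saturation. */
static int32_t signed_sat_s32(int64_t x, int *qc)
{
    if (x > INT32_MAX) { *qc = 1; return INT32_MAX; }
    if (x < INT32_MIN) { *qc = 1; return INT32_MIN; }
    return (int32_t)x;
}

/* Hypothetical scalar model of vqdmlal_high_s16(a, b, c)
   (not an arm_neon.h API). */
static void sqdmlal_high_s16_model(int32_t result[4], const int32_t a[4],
                                   const int16_t b[8], const int16_t c[8],
                                   int *qc)
{
    assert(qc != NULL);
    for (int e = 0; e < 4; ++e) {
        /* Upper half: read elements 4..7 of each 8-element source. */
        int64_t doubled = 2 * (int64_t)b[4 + e] * (int64_t)c[4 + e];
        int32_t product = signed_sat_s32(doubled, qc);
        result[e] = signed_sat_s32((int64_t)a[e] + product, qc);
    }
}
```

Elements 0 to 3 of b and c do not contribute to the result in this form.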

int64x2_t vqdmlal_high_s32 (int64x2_t a, int32x4_t b, int32x4_t c)Signed saturating doubling multiply-add long

Description

A64 Instruction

SQDMLAL2 Vd.2D,Vn.4S,Vm.4S

Argument Preparation

a → Vd.2D 

b → Vn.4S 

c → Vm.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32x4_t vqdmlsl_s16 (int32x4_t a, int16x4_t b, int16x4_t c)

Signed saturating doubling multiply-subtract long

Description

Signed saturating Doubling Multiply-Subtract Long. This instruction multiplies corresponding signed integer values in the lower or upper half of the vectors of the two source SIMD&FP registers, doubles the results, and subtracts the final results from the vector elements of the destination SIMD&FP register. The destination vector elements are twice as long as the elements that are multiplied. If the doubled product or the difference overflows, the value is saturated and the cumulative saturation bit FPSR.QC is set.

A64 Instruction

SQDMLSL Vd.4S,Vn.4H,Vm.4H

Argument Preparation

a → Vd.4S 

b → Vn.4H 

c → Vm.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
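
To make the two saturation points in the pseudocode concrete, here is a minimal single-lane model of SQDMLSL for 16-bit inputs in portable C. The names `clamp_s32_model` and `model_sqdmlsl_lane_s16` are hypothetical, and `qc` stands in for FPSR.QC; this is a sketch of the arithmetic, not a replacement for the intrinsic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Clamp to the int32_t range and record saturation in *qc (sticky). */
static int32_t clamp_s32_model(int64_t v, bool *qc) {
    if (v > INT32_MAX) { *qc = true; return INT32_MAX; }
    if (v < INT32_MIN) { *qc = true; return INT32_MIN; }
    return (int32_t)v;
}

/* One lane: acc - sat(2 * b * c), with the difference saturated again. */
int32_t model_sqdmlsl_lane_s16(int32_t acc, int16_t b, int16_t c, bool *qc) {
    int32_t product = clamp_s32_model(2 * (int64_t)b * (int64_t)c, qc);
    return clamp_s32_model((int64_t)acc - product, qc);
}

void model_sqdmlsl_lane_s16_example(void) {
    bool qc = false;
    /* 100 - 2*3*4 = 76, no saturation. */
    assert(model_sqdmlsl_lane_s16(100, 3, 4, &qc) == 76);
    assert(!qc);
    /* INT16_MIN * INT16_MIN doubled is 2^31, which saturates to INT32_MAX
       before the subtraction takes place. */
    assert(model_sqdmlsl_lane_s16(0, INT16_MIN, INT16_MIN, &qc) == -INT32_MAX);
    assert(qc);
}
```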

int64x2_t vqdmlsl_s32 (int64x2_t a, int32x2_t b, int32x2_t c)

Signed saturating doubling multiply-subtract long

Description

A64 Instruction

SQDMLSL Vd.2D,Vn.2S,Vm.2S

Argument Preparation

a → Vd.2D 

b → Vn.2S 

c → Vm.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32_t vqdmlslh_s16 (int32_t a, int16_t b, int16_t c)

Signed saturating doubling multiply-subtract long

Description

A64 Instruction

SQDMLSL Sd,Hn,Hm

Argument Preparation

a → Sd 

b → Hn 

c → Hm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64_t vqdmlsls_s32 (int64_t a, int32_t b, int32_t c)

Signed saturating doubling multiply-subtract long

Description

A64 Instruction

SQDMLSL Dd,Sn,Sm

Argument Preparation

a → Dd 

b → Sn 

c → Sm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64
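
For the scalar 32-bit form (SQDMLSL Dd,Sn,Sm) the intermediate values no longer fit in a wider standard C integer type, so a portable model has to detect overflow explicitly. The sketch below is one way to do that; `model_sqdmlsls_s32` and the `qc` flag (standing in for FPSR.QC) are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Scalar model of a - sat(2 * b * c) with 64-bit saturation. */
int64_t model_sqdmlsls_s32(int64_t a, int32_t b, int32_t c, bool *qc) {
    int64_t product;
    if (b == INT32_MIN && c == INT32_MIN) {
        /* 2 * 2^62 = 2^63 is the only doubled product that overflows. */
        product = INT64_MAX;
        *qc = true;
    } else {
        /* Here |b * c| <= 2^62 - 2^31, so doubling stays in range. */
        product = 2 * ((int64_t)b * c);
    }
    /* Saturating subtraction, tested before it can overflow. */
    if (product > 0 && a < INT64_MIN + product) { *qc = true; return INT64_MIN; }
    if (product < 0 && a > INT64_MAX + product) { *qc = true; return INT64_MAX; }
    return a - product;
}

void model_sqdmlsls_s32_example(void) {
    bool qc = false;
    assert(model_sqdmlsls_s32(10, 2, 3, &qc) == -2);   /* 10 - 12 */
    assert(!qc);
    /* The product saturates to INT64_MAX; the subtraction itself fits. */
    assert(model_sqdmlsls_s32(0, INT32_MIN, INT32_MIN, &qc) == -INT64_MAX);
    assert(qc);
}
```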

int32x4_t vqdmlsl_high_s16 (int32x4_t a, int16x8_t b, int16x8_t c)

Signed saturating doubling multiply-subtract long

Description

A64 Instruction

SQDMLSL2 Vd.4S,Vn.8H,Vm.8H

Argument Preparation

a → Vd.4S 

b → Vn.8H 

c → Vm.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64x2_t vqdmlsl_high_s32 (int64x2_t a, int32x4_t b, int32x4_t c)

Signed saturating doubling multiply-subtract long

Description

A64 Instruction

SQDMLSL2 Vd.2D,Vn.4S,Vm.4S

Argument Preparation

a → Vd.2D 

b → Vn.4S 

c → Vm.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int16x8_t vmull_s8 (int8x8_t a, int8x8_t b)

Signed multiply long

Description

Signed Multiply Long (vector). This instruction multiplies corresponding signed integer values in the lower or upper half of the vectors of the two source SIMD&FP registers, places the results in a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the elements that are multiplied.

A64 Instruction

SMULL Vd.8H,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
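
The widening behaviour of SMULL Vd.8H,Vn.8B,Vm.8B can be sketched in portable C as follows. This is a scalar model over plain arrays, and the function names are hypothetical; it only illustrates that every 8-bit by 8-bit signed product fits in a 16-bit lane.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of SMULL (8-bit to 16-bit): each product is kept in full. */
void model_smull_s8(int16_t out[8], const int8_t a[8], const int8_t b[8]) {
    for (int e = 0; e < 8; e++) {
        /* Operands promote to int; the product lies in [-16256, 16384],
           so the conversion to int16_t never changes the value. */
        out[e] = (int16_t)(a[e] * b[e]);
    }
}

void model_smull_s8_example(void) {
    const int8_t a[8] = {-128, 127, -1, 2, 0, 0, 0, 0};
    const int8_t b[8] = {-128, 127,  5, 3, 0, 0, 0, 0};
    int16_t out[8];
    model_smull_s8(out, a, b);
    assert(out[0] == 16384 && out[1] == 16129 && out[2] == -5 && out[3] == 6);
}
```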

int32x4_t vmull_s16 (int16x4_t a, int16x4_t b)

Signed multiply long

Description

A64 Instruction

SMULL Vd.4S,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vmull_s32 (int32x2_t a, int32x2_t b)

Signed multiply long

Description

A64 Instruction

SMULL Vd.2D,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vmull_u8 (uint8x8_t a, uint8x8_t b)

Unsigned multiply long

Description

Unsigned Multiply Long (vector). This instruction multiplies corresponding vector elements in the lower or upper half of the two source SIMD&FP registers, places the result in a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the elements that are multiplied. All the values in this instruction are unsigned integer values.

A64 Instruction

UMULL Vd.8H,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
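
The following portable C sketch contrasts the widened result of UMULL Vd.8H,Vn.8B,Vm.8B with what a same-width 8-bit multiply would keep. Both function names are hypothetical, and arrays stand in for the vector registers.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of UMULL (8-bit to 16-bit): the full product is kept. */
void model_umull_u8(uint16_t out[8], const uint8_t a[8], const uint8_t b[8]) {
    for (int e = 0; e < 8; e++) {
        /* Operands promote to int; the largest product is 255 * 255 = 65025,
           which fits in uint16_t. */
        out[e] = (uint16_t)(a[e] * b[e]);
    }
}

void model_umull_u8_example(void) {
    const uint8_t a[8] = {255, 16, 200, 0, 0, 0, 0, 0};
    const uint8_t b[8] = {255, 16,   2, 0, 0, 0, 0, 0};
    uint16_t out[8];
    model_umull_u8(out, a, b);
    assert(out[0] == 65025 && out[1] == 256 && out[2] == 400);
    /* For comparison: keeping only 8 bits of 16 * 16 would give 0. */
    assert((uint8_t)(a[1] * b[1]) == 0);
}
```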

uint32x4_t vmull_u16 (uint16x4_t a, uint16x4_t b)

Unsigned multiply long

Description

A64 Instruction

UMULL Vd.4S,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vmull_u32 (uint32x2_t a, uint32x2_t b)

Unsigned multiply long

Description

A64 Instruction

UMULL Vd.2D,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

poly16x8_t vmull_p8 (poly8x8_t a, poly8x8_t b)

Polynomial multiply long

Description

Polynomial Multiply Long. This instruction multiplies corresponding elements in the lower or upper half of the vectors of the two source SIMD&FP registers, places the results in a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the elements that are multiplied.

A64 Instruction

PMULL Vd.8H,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, 2*esize] = PolynomialMult(element1, element2);

V[d] = result;

Supported architectures

v7/A32/A64
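
The pseudocode delegates the per-element work to PolynomialMult, which is a carry-less multiplication (polynomial multiplication over GF(2)). A minimal single-lane sketch in portable C is shown below; the name `model_pmull_p8_lane` is hypothetical, and plain `uint8_t`/`uint16_t` values stand in for the poly8/poly16 lane types.

```c
#include <assert.h>
#include <stdint.h>

/* Carry-less multiply of two 8-bit polynomials: partial products are
   combined with XOR instead of addition, so no carries propagate. */
uint16_t model_pmull_p8_lane(uint8_t a, uint8_t b) {
    uint16_t result = 0;
    for (int i = 0; i < 8; i++) {
        if ((b >> i) & 1) {
            result ^= (uint16_t)((uint16_t)a << i);
        }
    }
    return result;
}

void model_pmull_p8_lane_example(void) {
    /* (x + 1)^2 = x^2 + 1 over GF(2), so 3 "times" 3 is 5, not 9. */
    assert(model_pmull_p8_lane(0x03, 0x03) == 0x0005);
    /* Squaring spreads the bits of the operand to the even positions. */
    assert(model_pmull_p8_lane(0xFF, 0xFF) == 0x5555);
}
```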

int16x8_t vmull_high_s8 (int8x16_t a, int8x16_t b)

Signed multiply long

Description

A64 Instruction

SMULL2 Vd.8H,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

A64
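
The only difference from the non-high form is which input lanes are read: SMULL2 Vd.8H,Vn.16B,Vm.16B takes the upper eight lanes of each 16-lane source. A hedged scalar sketch in portable C, using arrays in place of registers and hypothetical function names:

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of SMULL2 (8-bit to 16-bit): lanes 8..15 of a and b are
   multiplied; lanes 0..7 are ignored. */
void model_smull2_s8(int16_t out[8], const int8_t a[16], const int8_t b[16]) {
    for (int e = 0; e < 8; e++) {
        out[e] = (int16_t)(a[8 + e] * b[8 + e]);
    }
}

void model_smull2_s8_example(void) {
    const int8_t a[16] = {100, 100, 100, 100, 100, 100, 100, 100,
                          1, 2, 3, 4, 5, 6, 7, 8};
    const int8_t b[16] = {100, 100, 100, 100, 100, 100, 100, 100,
                          -2, -2, -2, -2, -2, -2, -2, -2};
    int16_t out[8];
    model_smull2_s8(out, a, b);
    for (int e = 0; e < 8; e++) {
        assert(out[e] == -2 * (e + 1));
    }
}
```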

int32x4_t vmull_high_s16 (int16x8_t a, int16x8_t b)

Signed multiply long

Description

A64 Instruction

SMULL2 Vd.4S,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int64x2_t vmull_high_s32 (int32x4_t a, int32x4_t b)

Signed multiply long

Description

A64 Instruction

SMULL2 Vd.2D,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint16x8_t vmull_high_u8 (uint8x16_t a, uint8x16_t b)

Unsigned multiply long

Description

A64 Instruction

UMULL2 Vd.8H,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint32x4_t vmull_high_u16 (uint16x8_t a, uint16x8_t b)

Unsigned multiply long

Description

A64 Instruction

UMULL2 Vd.4S,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint64x2_t vmull_high_u32 (uint32x4_t a, uint32x4_t b)

Unsigned multiply long

Description

A64 Instruction

UMULL2 Vd.2D,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

poly16x8_t vmull_high_p8 (poly8x16_t a, poly8x16_t b)

Polynomial multiply long

Description

A64 Instruction

PMULL2 Vd.8H,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, 2*esize] = PolynomialMult(element1, element2);

V[d] = result;

Supported architectures

A64

int32x4_t vqdmull_s16 (int16x4_t a, int16x4_t b)

Signed saturating doubling multiply long

Description

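Signed saturating Doubling Multiply Long. This instruction multiplies corresponding vector elements in the lower or upper half of the two source SIMD&FP registers, doubles the results, places the final results in a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the elements that are multiplied. If a doubled product overflows, it is saturated and the cumulative saturation bit FPSR.QC is set.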

A64 Instruction

SQDMULL Vd.4S,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize);
    Elem[result, e, 2*esize] = product;
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
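
A single lane of SQDMULL for 16-bit inputs can be modelled in portable C as below. The name `model_sqdmull_lane_s16` and the `qc` flag (standing in for FPSR.QC) are hypothetical. The fixed-point reading in the example, where 16-bit inputs are treated as Q15 and the 32-bit result as Q31, is one common interpretation and is not stated by the reference itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One lane: sat(2 * a * b). The smallest doubled product is
   2 * (-32768 * 32767) = -2147418112, which is above INT32_MIN, so only
   the upper bound can be exceeded (by INT16_MIN * INT16_MIN). */
int32_t model_sqdmull_lane_s16(int16_t a, int16_t b, bool *qc) {
    int64_t doubled = 2 * (int64_t)a * (int64_t)b;
    if (doubled > INT32_MAX) {
        *qc = true;
        return INT32_MAX;
    }
    return (int32_t)doubled;
}

void model_sqdmull_lane_s16_example(void) {
    bool qc = false;
    /* Q15 0.5 times Q15 0.5 gives Q31 0.25, that is 2^29. */
    assert(model_sqdmull_lane_s16(0x4000, 0x4000, &qc) == 0x20000000);
    assert(!qc);
    /* -1.0 times -1.0 cannot be represented in Q31 and saturates. */
    assert(model_sqdmull_lane_s16(INT16_MIN, INT16_MIN, &qc) == INT32_MAX);
    assert(qc);
}
```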

int64x2_t vqdmull_s32 (int32x2_t a, int32x2_t b)

Signed saturating doubling multiply long

Description

A64 Instruction

SQDMULL Vd.2D,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize);
    Elem[result, e, 2*esize] = product;
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32_t vqdmullh_s16 (int16_t a, int16_t b)

Signed saturating doubling multiply long

Description

A64 Instruction

SQDMULL Sd,Hn,Hm

Argument Preparation

a → Hn 

b → Hm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize);
    Elem[result, e, 2*esize] = product;
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64_t vqdmulls_s32 (int32_t a, int32_t b)

Signed saturating doubling multiply long

Description

A64 Instruction

SQDMULL Dd,Sn,Sm

Argument Preparation

a → Sn 

b → Sm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize);
    Elem[result, e, 2*esize] = product;
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32x4_t vqdmull_high_s16 (int16x8_t a, int16x8_t b)

Signed saturating doubling multiply long

Description

A64 Instruction

SQDMULL2 Vd.4S,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize);
    Elem[result, e, 2*esize] = product;
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64x2_t vqdmull_high_s32 (int32x4_t a, int32x4_t b)

Signed saturating doubling multiply long

Description

A64 Instruction

SQDMULL2 Vd.2D,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize);
    Elem[result, e, 2*esize] = product;
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int8x8_t vsub_s8 (int8x8_t a, int8x8_t b)

Subtract

Description

Subtract (vector). This instruction subtracts each vector element in the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

SUB Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64
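
The pseudocode subtracts esize-bit patterns with no saturation, so results wrap modulo 2^esize. The sketch below models SUB Vd.8B,Vn.8B,Vm.8B in portable C on raw 8-bit patterns; the function names are hypothetical. Working on `uint8_t` patterns is a modelling choice that avoids implementation-defined signed conversions in C, and it reflects that the signed and unsigned 8-bit subtract intrinsics in this reference map to the same SUB instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of SUB on eight 8-bit lanes: wraps modulo 256. */
void model_sub_8b(uint8_t out[8], const uint8_t a[8], const uint8_t b[8]) {
    for (int e = 0; e < 8; e++) {
        out[e] = (uint8_t)(a[e] - b[e]);
    }
}

void model_sub_8b_example(void) {
    const uint8_t a[8] = {0x00, 0x80, 5, 200, 0, 0, 0, 0};
    const uint8_t b[8] = {0x01, 0x01, 5, 100, 0, 0, 0, 0};
    uint8_t out[8];
    model_sub_8b(out, a, b);
    assert(out[0] == 0xFF);  /* 0 - 1: 255 unsigned, -1 read as signed */
    assert(out[1] == 0x7F);  /* read as signed: -128 - 1 wraps to 127 */
    assert(out[2] == 0 && out[3] == 100);
}
```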

int8x16_t vsubq_s8 (int8x16_t a, int8x16_t b)

Subtract

Description

A64 Instruction

SUB Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vsub_s16 (int16x4_t a, int16x4_t b)

Subtract

Description

A64 Instruction

SUB Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vsubq_s16 (int16x8_t a, int16x8_t b)

Subtract

Description

A64 Instruction

SUB Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vsub_s32 (int32x2_t a, int32x2_t b)

Subtract

Description

A64 Instruction

SUB Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vsubq_s32 (int32x4_t a, int32x4_t b)

Subtract

Description

A64 Instruction

SUB Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vsub_s64 (int64x1_t a, int64x1_t b)

Subtract

Description

A64 Instruction

SUB Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vsubq_s64 (int64x2_t a, int64x2_t b)

Subtract

Description

A64 Instruction

SUB Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vsub_u8 (uint8x8_t a, uint8x8_t b)

Subtract

Description

A64 Instruction

SUB Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vsubq_u8 (uint8x16_t a, uint8x16_t b)

Subtract

Description

A64 Instruction

SUB Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vsub_u16 (uint16x4_t a, uint16x4_t b)

Subtract

Description

A64 Instruction

SUB Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vsubq_u16 (uint16x8_t a, uint16x8_t b)

Subtract

Description

A64 Instruction

SUB Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vsub_u32 (uint32x2_t a, uint32x2_t b)

Subtract

Description

A64 Instruction

SUB Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vsubq_u32 (uint32x4_t a, uint32x4_t b)

Subtract

Description

A64 Instruction

SUB Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vsub_u64 (uint64x1_t a, uint64x1_t b)

Subtract

Description

A64 Instruction

SUB Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vsubq_u64 (uint64x2_t a, uint64x2_t b)

Subtract

Description

A64 Instruction

SUB Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vsub_f32 (float32x2_t a, float32x2_t b)Floating-point subtract

Description

Floating-point Subtract (vector). This instruction subtracts the elements in the vector in the second source SIMD&FP register, from the corresponding elements in the vector in the first source SIMD&FP register, places each result into elements of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

FSUB Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) diff;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    diff = FPSub(element1, element2, FPCR);
    Elem[result, e, esize] = if abs then FPAbs(diff) else diff;

V[d] = result;

Supported architectures

v7/A32/A64
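
Example

The description above amounts to a lane-by-lane `a - b` on floating-point values. A minimal scalar sketch follows, assuming the host uses the default round-to-nearest mode; FPCR-controlled behaviour such as flush-to-zero is not modelled. `model_vsub_f32` and `demo_vsub_f32` are hypothetical names used only for illustration, and plain arrays stand in for `float32x2_t`.

```c
#include <assert.h>

/* Hypothetical scalar model of vsub_f32: two independent float lanes. */
void model_vsub_f32(const float a[2], const float b[2], float out[2])
{
    for (int e = 0; e < 2; ++e) {
        /* The second source is subtracted from the first. */
        out[e] = a[e] - b[e];
    }
}

/* Usage sketch with values that are exactly representable in binary32. */
void demo_vsub_f32(void)
{
    const float a[2] = {1.5f, -2.0f};
    const float b[2] = {0.25f, 0.5f};
    float r[2];

    model_vsub_f32(a, b, r);
    assert(r[0] == 1.25f);
    assert(r[1] == -2.5f);
}
```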

float32x4_t vsubq_f32 (float32x4_t a, float32x4_t b)Floating-point subtract

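Description

Floating-point Subtract (vector). This instruction subtracts the elements in the vector in the second source SIMD&FP register, from the corresponding elements in the vector in the first source SIMD&FP register, places each result into elements of a vector, and writes the vector to the destination SIMD&FP register.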

A64 Instruction

FSUB Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) diff;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    diff = FPSub(element1, element2, FPCR);
    Elem[result, e, esize] = if abs then FPAbs(diff) else diff;

V[d] = result;

Supported architectures

v7/A32/A64

float64x1_t vsub_f64 (float64x1_t a, float64x1_t b)Floating-point subtract

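Description

Floating-point Subtract (vector). This instruction subtracts the elements in the vector in the second source SIMD&FP register, from the corresponding elements in the vector in the first source SIMD&FP register, places each result into elements of a vector, and writes the vector to the destination SIMD&FP register.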

A64 Instruction

FSUB Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) diff;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    diff = FPSub(element1, element2, FPCR);
    Elem[result, e, esize] = if abs then FPAbs(diff) else diff;

V[d] = result;

Supported architectures

A64

float64x2_t vsubq_f64 (float64x2_t a, float64x2_t b)Floating-point subtract

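Description

Floating-point Subtract (vector). This instruction subtracts the elements in the vector in the second source SIMD&FP register, from the corresponding elements in the vector in the first source SIMD&FP register, places each result into elements of a vector, and writes the vector to the destination SIMD&FP register.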

A64 Instruction

FSUB Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) diff;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    diff = FPSub(element1, element2, FPCR);
    Elem[result, e, esize] = if abs then FPAbs(diff) else diff;

V[d] = result;

Supported architectures

A64

int64_t vsubd_s64 (int64_t a, int64_t b)Subtract

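Description

Subtract (vector). This instruction subtracts each vector element in the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP register.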

A64 Instruction

SUB Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

A64
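
Example

For this scalar form the Operation pseudocode above reduces to a single 64-bit element, and the difference is truncated to 64 bits, so overflow wraps rather than saturates. The sketch below models that in portable C. `model_vsubd_s64` and `demo_vsubd_s64` are hypothetical names, and the cast back to `int64_t` assumes a two's-complement conversion, as on mainstream compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vsubd_s64. */
int64_t model_vsubd_s64(int64_t a, int64_t b)
{
    /* Subtract in uint64_t so that overflow wraps modulo 2^64 instead of
       being undefined behaviour. The conversion back to int64_t is assumed
       to reinterpret the bits (two's complement). */
    return (int64_t)((uint64_t)a - (uint64_t)b);
}

/* Usage sketch: the last case overflows and wraps, it does not saturate. */
void demo_vsubd_s64(void)
{
    assert(model_vsubd_s64(10, 3) == 7);
    assert(model_vsubd_s64(-5, 7) == -12);
    assert(model_vsubd_s64(INT64_MIN, 1) == INT64_MAX);
}
```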

uint64_t vsubd_u64 (uint64_t a, uint64_t b)Subtract

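Description

Subtract (vector). This instruction subtracts each vector element in the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP register.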

A64 Instruction

SUB Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then
        Elem[result, e, esize] = element1 - element2;
    else
        Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

A64

int16x8_t vsubl_s8 (int8x8_t a, int8x8_t b)Signed subtract long

Description

Signed Subtract Long. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are signed integer values. The destination vector elements are twice as long as the source vector elements.

A64 Instruction

SSUBL Vd.8H,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
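
Example

Because the destination elements are twice as wide as the sources, a signed 8-bit difference (range -255 to 255) always fits in the 16-bit result. A minimal scalar sketch of this widening follows; `model_vsubl_s8` and `demo_vsubl_s8` are hypothetical helpers, and plain arrays stand in for `int8x8_t` and `int16x8_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vsubl_s8: eight int8 lanes in,
   eight int16 lanes out. */
void model_vsubl_s8(const int8_t a[8], const int8_t b[8], int16_t out[8])
{
    for (int e = 0; e < 8; ++e) {
        /* Both operands are sign-extended before subtracting, so the
           result cannot overflow the 16-bit destination lane. */
        out[e] = (int16_t)((int16_t)a[e] - (int16_t)b[e]);
    }
}

/* Usage sketch covering the extreme inputs. */
void demo_vsubl_s8(void)
{
    const int8_t a[8] = {INT8_MIN, INT8_MAX, 0, 5, -1, 100, -100, 1};
    const int8_t b[8] = {INT8_MAX, INT8_MIN, 0, 7, -1, -100, 100, 2};
    int16_t r[8];

    model_vsubl_s8(a, b, r);
    assert(r[0] == -255);
    assert(r[1] == 255);
    assert(r[3] == -2);
    assert(r[5] == 200);
    assert(r[6] == -200);
}
```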

int32x4_t vsubl_s16 (int16x4_t a, int16x4_t b)Signed subtract long

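Description

Signed Subtract Long. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are signed integer values. The destination vector elements are twice as long as the source vector elements.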

A64 Instruction

SSUBL Vd.4S,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vsubl_s32 (int32x2_t a, int32x2_t b)Signed subtract long

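Description

Signed Subtract Long. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are signed integer values. The destination vector elements are twice as long as the source vector elements.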

A64 Instruction

SSUBL Vd.2D,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vsubl_u8 (uint8x8_t a, uint8x8_t b)Unsigned subtract long

Description

Unsigned Subtract Long. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element of the first source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The destination vector elements are twice as long as the source vector elements.

A64 Instruction

USUBL Vd.8H,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
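
Example

With unsigned inputs the exact difference can be negative, and the Operation pseudocode above keeps only the low 2*esize bits of it. A negative difference therefore shows up as a large value in the unsigned 16-bit lane. The sketch below models one way to reproduce this in scalar C. `model_vsubl_u8` and `demo_vsubl_u8` are hypothetical helpers, and plain arrays stand in for the vector types.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vsubl_u8: eight uint8 lanes in,
   eight uint16 lanes out. */
void model_vsubl_u8(const uint8_t a[8], const uint8_t b[8], uint16_t out[8])
{
    for (int e = 0; e < 8; ++e) {
        /* Zero-extend, subtract as plain integers, then keep the low
           16 bits. Conversion to uint16_t is defined modulo 2^16. */
        int diff = (int)a[e] - (int)b[e];
        out[e] = (uint16_t)diff;
    }
}

/* Usage sketch: lanes 1 and 2 have a negative exact difference. */
void demo_vsubl_u8(void)
{
    const uint8_t a[8] = {200, 0, 0, 255, 7, 1, 128, 64};
    const uint8_t b[8] = {100, 255, 1, 0, 7, 2, 64, 128};
    uint16_t r[8];

    model_vsubl_u8(a, b, r);
    assert(r[0] == 100);
    assert(r[1] == 0xFF01);  /* -255 modulo 2^16 */
    assert(r[2] == 0xFFFF);  /* -1 modulo 2^16 */
    assert(r[3] == 255);
    assert(r[4] == 0);
}
```

Reinterpreting the result lanes as `int16_t` recovers the signed difference, which is a common reason to reach for this intrinsic.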

uint32x4_t vsubl_u16 (uint16x4_t a, uint16x4_t b)Unsigned subtract long

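Description

Unsigned Subtract Long. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element of the first source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The destination vector elements are twice as long as the source vector elements.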

A64 Instruction

USUBL Vd.4S,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vsubl_u32 (uint32x2_t a, uint32x2_t b)Unsigned subtract long

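Description

Unsigned Subtract Long. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element of the first source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The destination vector elements are twice as long as the source vector elements.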

A64 Instruction

USUBL Vd.2D,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vsubl_high_s8 (int8x16_t a, int8x16_t b)Signed subtract long

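Description

Signed Subtract Long. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are signed integer values. The destination vector elements are twice as long as the source vector elements.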

A64 Instruction

SSUBL2 Vd.8H,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64
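
Example

The `_high` form takes full 128-bit sources (Vn.16B, Vm.16B) and, as the SSUBL2 mnemonic indicates, works on the upper eight lanes of each. A minimal scalar sketch follows, assuming array index i corresponds to vector lane i. `model_vsubl_high_s8` and `demo_vsubl_high_s8` are hypothetical helpers, not part of `<arm_neon.h>`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vsubl_high_s8: sixteen int8 lanes in,
   eight int16 lanes out, computed from lanes 8..15 only. */
void model_vsubl_high_s8(const int8_t a[16], const int8_t b[16], int16_t out[8])
{
    for (int e = 0; e < 8; ++e) {
        /* Lanes 0..7 of both sources are ignored. */
        out[e] = (int16_t)((int16_t)a[8 + e] - (int16_t)b[8 + e]);
    }
}

/* Usage sketch: the lower-half values of b (100) never reach the result. */
void demo_vsubl_high_s8(void)
{
    int8_t a[16], b[16];
    int16_t r[8];

    for (int i = 0; i < 16; ++i) {
        a[i] = (int8_t)i;
        b[i] = (int8_t)(i < 8 ? 100 : -100);
    }
    model_vsubl_high_s8(a, b, r);
    assert(r[0] == 108);  /* lane 8:  8 - (-100) */
    assert(r[7] == 115);  /* lane 15: 15 - (-100) */
}
```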

int32x4_t vsubl_high_s16 (int16x8_t a, int16x8_t b)Signed subtract long

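Description

Signed Subtract Long. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are signed integer values. The destination vector elements are twice as long as the source vector elements.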

A64 Instruction

SSUBL2 Vd.4S,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int64x2_t vsubl_high_s32 (int32x4_t a, int32x4_t b)Signed subtract long

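Description

Signed Subtract Long. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are signed integer values. The destination vector elements are twice as long as the source vector elements.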

A64 Instruction

SSUBL2 Vd.2D,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint16x8_t vsubl_high_u8 (uint8x16_t a, uint8x16_t b)Unsigned subtract long

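Description

Unsigned Subtract Long. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element of the first source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The destination vector elements are twice as long as the source vector elements.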

A64 Instruction

USUBL2 Vd.8H,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint32x4_t vsubl_high_u16 (uint16x8_t a, uint16x8_t b)Unsigned subtract long

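Description

Unsigned Subtract Long. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element of the first source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The destination vector elements are twice as long as the source vector elements.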

A64 Instruction

USUBL2 Vd.4S,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint64x2_t vsubl_high_u32 (uint32x4_t a, uint32x4_t b)Unsigned subtract long

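Description

Unsigned Subtract Long. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element of the first source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The destination vector elements are twice as long as the source vector elements.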

A64 Instruction

USUBL2 Vd.2D,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int16x8_t vsubw_s8 (int16x8_t a, int8x8_t b)Signed subtract wide

Description

Signed Subtract Wide. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the result in a vector, and writes the vector to the SIMD&FP destination register. All the values in this instruction are signed integer values.

A64 Instruction

SSUBW Vd.8H,Vn.8H,Vm.8B

Argument Preparation

a → Vn.8H 

b → Vm.8B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
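
Example

In the wide form only the second operand is narrow: each int8 lane of `b` is sign-extended and subtracted from the matching int16 lane of `a`. Unlike the long form, the result can overflow its 16-bit lane, and the Operation pseudocode above keeps only the low 2*esize bits. The sketch below models this in scalar C. `model_vsubw_s8` and `demo_vsubw_s8` are hypothetical helpers, and the final cast assumes a two's-complement conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vsubw_s8: int16 lanes minus
   sign-extended int8 lanes. */
void model_vsubw_s8(const int16_t a[8], const int8_t b[8], int16_t out[8])
{
    for (int e = 0; e < 8; ++e) {
        int diff = (int)a[e] - (int)b[e];
        /* Keep the low 16 bits. Going through uint16_t makes the wrap
           well defined; the cast to int16_t is assumed to reinterpret
           the bits (two's complement). */
        out[e] = (int16_t)(uint16_t)diff;
    }
}

/* Usage sketch: lane 1 overflows the 16-bit lane and wraps. */
void demo_vsubw_s8(void)
{
    const int16_t a[8] = {1000, INT16_MIN, 0, -300, 5, 5, 0, 0};
    const int8_t  b[8] = {-128, 1, 127, 100, 5, -5, 0, 0};
    int16_t r[8];

    model_vsubw_s8(a, b, r);
    assert(r[0] == 1128);
    assert(r[1] == INT16_MAX);  /* -32769 wraps to 32767 */
    assert(r[2] == -127);
    assert(r[3] == -400);
    assert(r[5] == 10);
}
```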

int32x4_t vsubw_s16 (int32x4_t a, int16x4_t b)Signed subtract wide

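Description

Signed Subtract Wide. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the result in a vector, and writes the vector to the SIMD&FP destination register. All the values in this instruction are signed integer values.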

A64 Instruction

SSUBW Vd.4S,Vn.4S,Vm.4H

Argument Preparation

a → Vn.4S 

b → Vm.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vsubw_s32 (int64x2_t a, int32x2_t b)Signed subtract wide

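Description

Signed Subtract Wide. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the result in a vector, and writes the vector to the SIMD&FP destination register. All the values in this instruction are signed integer values.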

A64 Instruction

SSUBW Vd.2D,Vn.2D,Vm.2S

Argument Preparation

a → Vn.2D 

b → Vm.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vsubw_u8 (uint16x8_t a, uint8x8_t b)Unsigned subtract wide

Description

Unsigned Subtract Wide. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the result in a vector, and writes the vector to the SIMD&FP destination register. All the values in this instruction are unsigned integer values.

A64 Instruction

USUBW Vd.8H,Vn.8H,Vm.8B

Argument Preparation

a → Vn.8H 

b → Vm.8B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vsubw_u16 (uint32x4_t a, uint16x4_t b)Unsigned subtract wide

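Description

Unsigned Subtract Wide. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the result in a vector, and writes the vector to the SIMD&FP destination register. All the values in this instruction are unsigned integer values.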

A64 Instruction

USUBW Vd.4S,Vn.4S,Vm.4H

Argument Preparation

a → Vn.4S 

b → Vm.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vsubw_u32 (uint64x2_t a, uint32x2_t b)Unsigned subtract wide

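Description

Unsigned Subtract Wide. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the result in a vector, and writes the vector to the SIMD&FP destination register. All the values in this instruction are unsigned integer values.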

A64 Instruction

USUBW Vd.2D,Vn.2D,Vm.2S

Argument Preparation

a → Vn.2D 

b → Vm.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vsubw_high_s8 (int16x8_t a, int8x16_t b)Signed subtract wide

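Description

Signed Subtract Wide. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the result in a vector, and writes the vector to the SIMD&FP destination register. All the values in this instruction are signed integer values.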

A64 Instruction

SSUBW2 Vd.8H,Vn.8H,Vm.16B

Argument Preparation

a → Vn.8H 

b → Vm.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int32x4_t vsubw_high_s16 (int32x4_t a, int16x8_t b)Signed subtract wide

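Description

Signed Subtract Wide. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the result in a vector, and writes the vector to the SIMD&FP destination register. All the values in this instruction are signed integer values.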

A64 Instruction

SSUBW2 Vd.4S,Vn.4S,Vm.8H

Argument Preparation

a → Vn.4S 

b → Vm.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int64x2_t vsubw_high_s32 (int64x2_t a, int32x4_t b)Signed subtract wide

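Description

Signed Subtract Wide. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the result in a vector, and writes the vector to the SIMD&FP destination register. All the values in this instruction are signed integer values.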

A64 Instruction

SSUBW2 Vd.2D,Vn.2D,Vm.4S

Argument Preparation

a → Vn.2D 

b → Vm.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint16x8_t vsubw_high_u8 (uint16x8_t a, uint8x16_t b)Unsigned subtract wide

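Description

Unsigned Subtract Wide. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the result in a vector, and writes the vector to the SIMD&FP destination register. All the values in this instruction are unsigned integer values.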

A64 Instruction

USUBW2 Vd.8H,Vn.8H,Vm.16B

Argument Preparation

a → Vn.8H 

b → Vm.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint32x4_t vsubw_high_u16 (uint32x4_t a, uint16x8_t b)Unsigned subtract wide

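Description

Unsigned Subtract Wide. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the result in a vector, and writes the vector to the SIMD&FP destination register. All the values in this instruction are unsigned integer values.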

A64 Instruction

USUBW2 Vd.4S,Vn.4S,Vm.8H

Argument Preparation

a → Vn.4S 

b → Vm.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint64x2_t vsubw_high_u32 (uint64x2_t a, uint32x4_t b)Unsigned subtract wide

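Description

Unsigned Subtract Wide. This instruction subtracts each vector element in the lower or upper half of the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the result in a vector, and writes the vector to the SIMD&FP destination register. All the values in this instruction are unsigned integer values.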

A64 Instruction

USUBW2 Vd.2D,Vn.2D,Vm.4S

Argument Preparation

a → Vn.2D 

b → Vm.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
integer sum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, 2*esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    Elem[result, e, 2*esize] = sum<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int8x8_t vhsub_s8 (int8x8_t a, int8x8_t b)Signed halving subtract

Description

Signed Halving Subtract. This instruction subtracts the elements in the vector in the second source SIMD&FP register from the corresponding elements in the vector in the first source SIMD&FP register, shifts each result right one bit, places each result into elements of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

SHSUB Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    Elem[result, e, esize] = diff<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64
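
Example

The pseudocode above forms the exact integer difference and then stores bits <esize:1> of it, which is the difference divided by two and rounded toward negative infinity. The intermediate never overflows, so the result always fits the 8-bit lane. A minimal scalar sketch follows; `model_vhsub_s8` and `demo_vhsub_s8` are hypothetical helpers, and floor division is written out explicitly to avoid relying on how C shifts negative values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vhsub_s8: eight signed 8-bit lanes. */
void model_vhsub_s8(const int8_t a[8], const int8_t b[8], int8_t out[8])
{
    for (int e = 0; e < 8; ++e) {
        int diff = (int)a[e] - (int)b[e];            /* exact: -255..255 */
        /* Floor division by two. C division truncates toward zero, so
           negative values are biased by one first. */
        int half = (diff - (diff < 0 ? 1 : 0)) / 2;  /* -128..127 */
        out[e] = (int8_t)half;
    }
}

/* Usage sketch: odd negative differences round down, not toward zero. */
void demo_vhsub_s8(void)
{
    const int8_t a[8] = {INT8_MAX, INT8_MIN, 3, 4, 7, 0, -1, 10};
    const int8_t b[8] = {INT8_MIN, INT8_MAX, 4, 3, 2, 0, 2, 4};
    int8_t r[8];

    model_vhsub_s8(a, b, r);
    assert(r[0] == 127);   /*  255 / 2 rounded down */
    assert(r[1] == -128);  /* -255 / 2 rounded down */
    assert(r[2] == -1);    /*   -1 / 2 rounded down */
    assert(r[3] == 0);
    assert(r[4] == 2);
    assert(r[6] == -2);    /*   -3 / 2 rounded down */
    assert(r[7] == 3);
}
```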

int8x16_t vhsubq_s8 (int8x16_t a, int8x16_t b)Signed halving subtract

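Description

Signed Halving Subtract. This instruction subtracts the elements in the vector in the second source SIMD&FP register from the corresponding elements in the vector in the first source SIMD&FP register, shifts each result right one bit, places each result into elements of a vector, and writes the vector to the destination SIMD&FP register.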

A64 Instruction

SHSUB Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    Elem[result, e, esize] = diff<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vhsub_s16 (int16x4_t a, int16x4_t b)Signed halving subtract

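Description

Signed Halving Subtract. This instruction subtracts the elements in the vector in the second source SIMD&FP register from the corresponding elements in the vector in the first source SIMD&FP register, shifts each result right one bit, places each result into elements of a vector, and writes the vector to the destination SIMD&FP register.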

A64 Instruction

SHSUB Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    Elem[result, e, esize] = diff<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vhsubq_s16 (int16x8_t a, int16x8_t b)Signed halving subtract

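Description

Signed Halving Subtract. This instruction subtracts the elements in the vector in the second source SIMD&FP register from the corresponding elements in the vector in the first source SIMD&FP register, shifts each result right one bit, places each result into elements of a vector, and writes the vector to the destination SIMD&FP register.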

A64 Instruction

SHSUB Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    Elem[result, e, esize] = diff<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vhsub_s32 (int32x2_t a, int32x2_t b)Signed halving subtract

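Description

Signed Halving Subtract. This instruction subtracts the elements in the vector in the second source SIMD&FP register from the corresponding elements in the vector in the first source SIMD&FP register, shifts each result right one bit, places each result into elements of a vector, and writes the vector to the destination SIMD&FP register.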

A64 Instruction

SHSUB Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    Elem[result, e, esize] = diff<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vhsubq_s32 (int32x4_t a, int32x4_t b)Signed halving subtract

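Description

Signed Halving Subtract. This instruction subtracts the elements in the vector in the second source SIMD&FP register from the corresponding elements in the vector in the first source SIMD&FP register, shifts each result right one bit, places each result into elements of a vector, and writes the vector to the destination SIMD&FP register.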

A64 Instruction

SHSUB Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    Elem[result, e, esize] = diff<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vhsub_u8 (uint8x8_t a, uint8x8_t b)Unsigned halving subtract

Description

Unsigned Halving Subtract. This instruction subtracts the vector elements in the second source SIMD&FP register from the corresponding vector elements in the first source SIMD&FP register, shifts each result right one bit, places each result into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

UHSUB Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    Elem[result, e, esize] = diff<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64
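
Example

For unsigned lanes the exact difference may be negative, and taking bits <esize:1> of it, as in the pseudocode above, gives the floor of half the difference reduced modulo 2^8. The sketch below is one way to model that in scalar C. `model_vhsub_u8` and `demo_vhsub_u8` are hypothetical helpers, and plain arrays stand in for `uint8x8_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vhsub_u8: eight unsigned 8-bit lanes. */
void model_vhsub_u8(const uint8_t a[8], const uint8_t b[8], uint8_t out[8])
{
    for (int e = 0; e < 8; ++e) {
        int diff = (int)a[e] - (int)b[e];            /* exact: -255..255 */
        /* Floor division by two, written without shifting a negative. */
        int half = (diff - (diff < 0 ? 1 : 0)) / 2;
        /* Conversion to uint8_t is defined modulo 2^8, which matches
           keeping bits <8:1> of the difference. */
        out[e] = (uint8_t)half;
    }
}

/* Usage sketch: lanes 1 and 3 have a negative exact difference. */
void demo_vhsub_u8(void)
{
    const uint8_t a[8] = {255, 0, 4, 1, 10, 9, 0, 200};
    const uint8_t b[8] = {0, 255, 1, 4, 10, 2, 0, 100};
    uint8_t r[8];

    model_vhsub_u8(a, b, r);
    assert(r[0] == 127);
    assert(r[1] == 128);  /* floor(-255 / 2) = -128, modulo 256 */
    assert(r[2] == 1);
    assert(r[3] == 254);  /* floor(-3 / 2) = -2, modulo 256 */
    assert(r[4] == 0);
    assert(r[5] == 3);
    assert(r[7] == 50);
}
```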

uint8x16_t vhsubq_u8 (uint8x16_t a, uint8x16_t b)Unsigned halving subtract

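Description

Unsigned Halving Subtract. This instruction subtracts the vector elements in the second source SIMD&FP register from the corresponding vector elements in the first source SIMD&FP register, shifts each result right one bit, places each result into a vector, and writes the vector to the destination SIMD&FP register.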

A64 Instruction

UHSUB Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    Elem[result, e, esize] = diff<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vhsub_u16 (uint16x4_t a, uint16x4_t b)Unsigned halving subtract

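Description

Unsigned Halving Subtract. This instruction subtracts the vector elements in the second source SIMD&FP register from the corresponding vector elements in the first source SIMD&FP register, shifts each result right one bit, places each result into a vector, and writes the vector to the destination SIMD&FP register.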

A64 Instruction

UHSUB Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    Elem[result, e, esize] = diff<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vhsubq_u16 (uint16x8_t a, uint16x8_t b)Unsigned halving subtract

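Description

Unsigned Halving Subtract. This instruction subtracts the vector elements in the second source SIMD&FP register from the corresponding vector elements in the first source SIMD&FP register, shifts each result right one bit, places each result into a vector, and writes the vector to the destination SIMD&FP register.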

A64 Instruction

UHSUB Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    Elem[result, e, esize] = diff<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vhsub_u32 (uint32x2_t a, uint32x2_t b)Unsigned halving subtract

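Description

Unsigned Halving Subtract. This instruction subtracts the vector elements in the second source SIMD&FP register from the corresponding vector elements in the first source SIMD&FP register, shifts each result right one bit, places each result into a vector, and writes the vector to the destination SIMD&FP register.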

A64 Instruction

UHSUB Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    Elem[result, e, esize] = diff<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vhsubq_u32 (uint32x4_t a, uint32x4_t b)Unsigned halving subtract

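Description

Unsigned Halving Subtract. This instruction subtracts the vector elements in the second source SIMD&FP register from the corresponding vector elements in the first source SIMD&FP register, shifts each result right one bit, places each result into a vector, and writes the vector to the destination SIMD&FP register.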

A64 Instruction

UHSUB Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    Elem[result, e, esize] = diff<esize:1>;

V[d] = result;

Supported architectures

v7/A32/A64

int8x8_t vqsub_s8 (int8x8_t a, int8x8_t b)Signed saturating subtract

Description

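Signed saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.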

A64 Instruction

SQSUB Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
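
The saturation step in the Operation pseudocode above can be sketched in portable C for the 8-bit signed case. This is a minimal model under stated assumptions: `satq_s8`, `sqsub_lane_s8` and `sqsub_s8` are hypothetical names, the struct stands in for the `(value, sat)` pair returned by `SatQ`, and the `qc` pointer stands in for the sticky FPSR.QC bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counterpart of the (value, sat) pair returned by SatQ. */
typedef struct {
    int8_t value;
    bool sat;
} satq_s8;

/* One SQSUB lane with esize = 8. */
satq_s8 sqsub_lane_s8(int8_t a, int8_t b)
{
    int diff = (int)a - (int)b; /* exact: the range is -255..255 */
    satq_s8 r = { 0, false };
    if (diff > INT8_MAX) {
        r.value = INT8_MAX;
        r.sat = true;
    } else if (diff < INT8_MIN) {
        r.value = INT8_MIN;
        r.sat = true;
    } else {
        r.value = (int8_t)diff;
    }
    return r;
}

/* Whole-vector form; *qc models FPSR.QC and is only ever set, never cleared. */
void sqsub_s8(const int8_t *a, const int8_t *b, int8_t *out, size_t lanes, bool *qc)
{
    assert(a != NULL && b != NULL && out != NULL && qc != NULL);
    for (size_t e = 0; e < lanes; e++) {
        satq_s8 r = sqsub_lane_s8(a[e], b[e]);
        out[e] = r.value;
        if (r.sat)
            *qc = true;
    }
}
```

For instance, `sqsub_lane_s8(100, -100)` gives the value 127 with `sat` set, where a wrapping subtract would have produced -56.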

int8x16_t vqsubq_s8 (int8x16_t a, int8x16_t b)Signed saturating subtract

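Description

Signed saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.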

A64 Instruction

SQSUB Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vqsub_s16 (int16x4_t a, int16x4_t b)Signed saturating subtract

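Description

Signed saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.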

A64 Instruction

SQSUB Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vqsubq_s16 (int16x8_t a, int16x8_t b)Signed saturating subtract

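Description

Signed saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.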

A64 Instruction

SQSUB Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vqsub_s32 (int32x2_t a, int32x2_t b)Signed saturating subtract

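Description

Signed saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.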

A64 Instruction

SQSUB Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vqsubq_s32 (int32x4_t a, int32x4_t b)Signed saturating subtract

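Description

Signed saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.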

A64 Instruction

SQSUB Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vqsub_s64 (int64x1_t a, int64x1_t b)Signed saturating subtract

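Description

Signed saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.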

A64 Instruction

SQSUB Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vqsubq_s64 (int64x2_t a, int64x2_t b)Signed saturating subtract

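Description

Signed saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.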

A64 Instruction

SQSUB Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
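
For 64-bit elements, a C model cannot rely on a wider built-in integer type to hold the exact difference, so overflow has to be detected before subtracting. The sketch below shows one way to do that; `satq_s64`, `sqsub_lane_s64` and `sqsub_lane_s64_checks` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counterpart of the (value, sat) pair returned by SatQ. */
typedef struct {
    int64_t value;
    bool sat;
} satq_s64;

/* One SQSUB lane with esize = 64, without using a type wider than 64 bits. */
satq_s64 sqsub_lane_s64(int64_t a, int64_t b)
{
    satq_s64 r = { 0, false };
    if (b < 0 && a > INT64_MAX + b) {        /* a - b would exceed INT64_MAX */
        r.value = INT64_MAX;
        r.sat = true;
    } else if (b > 0 && a < INT64_MIN + b) { /* a - b would fall below INT64_MIN */
        r.value = INT64_MIN;
        r.sat = true;
    } else {
        r.value = a - b;                     /* cannot overflow on this path */
    }
    return r;
}

/* Spot checks of the boundary cases. */
void sqsub_lane_s64_checks(void)
{
    assert(sqsub_lane_s64(INT64_MAX, -1).sat);
    assert(sqsub_lane_s64(INT64_MIN, 1).value == INT64_MIN);
    assert(!sqsub_lane_s64(-1, INT64_MIN).sat);
}
```

The guard expressions `INT64_MAX + b` and `INT64_MIN + b` are themselves safe because they are only evaluated when `b` has the sign that moves the bound toward zero.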

uint8x8_t vqsub_u8 (uint8x8_t a, uint8x8_t b)Unsigned saturating subtract

Description

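Unsigned saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.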

A64 Instruction

UQSUB Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
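
For the unsigned form, the only way a difference can leave the representable range is by going below zero, so saturation always clamps to 0. A minimal C sketch of the 8-bit case follows; `satq_u8`, `uqsub_lane_u8` and `uqsub_u8` are hypothetical names, and `qc` again stands in for FPSR.QC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counterpart of the (value, sat) pair returned by SatQ. */
typedef struct {
    uint8_t value;
    bool sat;
} satq_u8;

/* One UQSUB lane with esize = 8. */
satq_u8 uqsub_lane_u8(uint8_t a, uint8_t b)
{
    int diff = (int)a - (int)b; /* exact: the range is -255..255 */
    satq_u8 r;
    r.sat = diff < 0;           /* the upper bound 255 can never be exceeded */
    r.value = (uint8_t)(r.sat ? 0 : diff);
    return r;
}

/* Whole-vector form; *qc models FPSR.QC and is only ever set, never cleared. */
void uqsub_u8(const uint8_t *a, const uint8_t *b, uint8_t *out, size_t lanes, bool *qc)
{
    assert(a != NULL && b != NULL && out != NULL && qc != NULL);
    for (size_t e = 0; e < lanes; e++) {
        satq_u8 r = uqsub_lane_u8(a[e], b[e]);
        out[e] = r.value;
        if (r.sat)
            *qc = true;
    }
}
```

Note that `uqsub_lane_u8(7, 7)` returns 0 without setting `sat`, whereas `uqsub_lane_u8(50, 200)` returns 0 with `sat` set.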

uint8x16_t vqsubq_u8 (uint8x16_t a, uint8x16_t b)Unsigned saturating subtract

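Description

Unsigned saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.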

A64 Instruction

UQSUB Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vqsub_u16 (uint16x4_t a, uint16x4_t b)Unsigned saturating subtract

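Description

Unsigned saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.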

A64 Instruction

UQSUB Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vqsubq_u16 (uint16x8_t a, uint16x8_t b)Unsigned saturating subtract

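Description

Unsigned saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.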

A64 Instruction

UQSUB Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vqsub_u32 (uint32x2_t a, uint32x2_t b)Unsigned saturating subtract

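Description

Unsigned saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.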

A64 Instruction

UQSUB Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vqsubq_u32 (uint32x4_t a, uint32x4_t b)Unsigned saturating subtract

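Description

Unsigned saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.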

A64 Instruction

UQSUB Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vqsub_u64 (uint64x1_t a, uint64x1_t b)Unsigned saturating subtract

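Description

Unsigned saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.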

A64 Instruction

UQSUB Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vqsubq_u64 (uint64x2_t a, uint64x2_t b)Unsigned saturating subtract

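Description

Unsigned saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.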

A64 Instruction

UQSUB Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
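
With 64-bit unsigned elements there is no wider built-in type for the exact difference, but a single comparison is enough to decide whether the result saturates. The following sketch is one possible model; `satq_u64`, `uqsub_lane_u64` and `uqsub_lane_u64_checks` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counterpart of the (value, sat) pair returned by SatQ. */
typedef struct {
    uint64_t value;
    bool sat;
} satq_u64;

/* One UQSUB lane with esize = 64: compare first, then subtract. */
satq_u64 uqsub_lane_u64(uint64_t a, uint64_t b)
{
    satq_u64 r;
    r.sat = a < b;                /* the exact difference would be negative */
    r.value = r.sat ? 0 : a - b;  /* clamp to 0 on saturation */
    return r;
}

/* Spot checks of the boundary cases. */
void uqsub_lane_u64_checks(void)
{
    assert(uqsub_lane_u64(UINT64_MAX, 1).value == UINT64_MAX - 1);
    assert(uqsub_lane_u64(0, UINT64_MAX).sat);
    assert(!uqsub_lane_u64(1, 1).sat);
}
```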

int8_t vqsubb_s8 (int8_t a, int8_t b)Signed saturating subtract

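Description

Signed saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.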

A64 Instruction

SQSUB Bd,Bn,Bm

Argument Preparation

a → Bn 

b → Bm

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64
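
To make the difference between a saturating and an ordinary (wrapping) subtract concrete for the scalar byte form, the sketch below models both in portable C. `wrap_sub_s8`, `sat_sub_s8` and `sub_s8_checks` are hypothetical helpers, not intrinsics; `sat_sub_s8` is only intended to reproduce the value that the pseudocode above computes for esize = 8.

```c
#include <assert.h>
#include <stdint.h>

/* Modular 8-bit subtraction: the behaviour of a plain, non-saturating subtract. */
int8_t wrap_sub_s8(int8_t a, int8_t b)
{
    unsigned u = ((unsigned)a - (unsigned)b) & 0xFFu;   /* low 8 bits of the difference */
    return (int8_t)(u < 128u ? (int)u : (int)u - 256);  /* reinterpret as two's complement */
}

/* Saturating 8-bit subtraction: clamp the exact difference to [-128, 127]. */
int8_t sat_sub_s8(int8_t a, int8_t b)
{
    int diff = (int)a - (int)b;
    if (diff > INT8_MAX)
        return INT8_MAX;
    if (diff < INT8_MIN)
        return INT8_MIN;
    return (int8_t)diff;
}

/* The two agree whenever the exact difference fits in 8 bits. */
void sub_s8_checks(void)
{
    assert(wrap_sub_s8(10, 3) == sat_sub_s8(10, 3));
    assert(wrap_sub_s8(100, -100) == -56);
    assert(sat_sub_s8(100, -100) == 127);
}
```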

int16_t vqsubh_s16 (int16_t a, int16_t b)Signed saturating subtract

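Description

Signed saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.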

A64 Instruction

SQSUB Hd,Hn,Hm

Argument Preparation

a → Hn 

b → Hm

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32_t vqsubs_s32 (int32_t a, int32_t b)Signed saturating subtract

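Description

Signed saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.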

A64 Instruction

SQSUB Sd,Sn,Sm

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64_t vqsubd_s64 (int64_t a, int64_t b)Signed saturating subtract

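Description

Signed saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.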

A64 Instruction

SQSUB Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

uint8_t vqsubb_u8 (uint8_t a, uint8_t b)Unsigned saturating subtract

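Description

Unsigned saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.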

A64 Instruction

UQSUB Bd,Bn,Bm

Argument Preparation

a → Bn 

b → Bm

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

uint16_t vqsubh_u16 (uint16_t a, uint16_t b)Unsigned saturating subtract

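Description

Unsigned saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.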

A64 Instruction

UQSUB Hd,Hn,Hm

Argument Preparation

a → Hn 

b → Hm

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

uint32_t vqsubs_u32 (uint32_t a, uint32_t b)Unsigned saturating subtract

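Description

Unsigned saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.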

A64 Instruction

UQSUB Sd,Sn,Sm

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

uint64_t vqsubd_u64 (uint64_t a, uint64_t b)Unsigned saturating subtract

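Description

Unsigned saturating Subtract. This instruction subtracts the element values of the second source SIMD&FP register from the corresponding element values of the first source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.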

A64 Instruction

UQSUB Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer diff;
boolean sat;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    diff = element1 - element2;
    (Elem[result, e, esize], sat) = SatQ(diff, esize, unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int8x8_t vsubhn_s16 (int16x8_t a, int16x8_t b)Subtract returning high narrow

Description

Subtract returning High Narrow. This instruction subtracts each vector element in the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are truncated. For rounded results, see RSUBHN.

A64 Instruction

SUBHN Vd.8B,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64
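
The narrowing step `sum<2*esize-1:esize>` in the pseudocode above keeps only the upper half of each wide difference. A small C sketch for the 16-bit to 8-bit case follows. The names `subhn_lane_u16` and `subhn_u16` are hypothetical; the model works on raw bit patterns, which is an assumption that holds because the subtraction is modular and the high half is the same for signed and unsigned views of the same bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One SUBHN lane: 16-bit inputs, 8-bit output (esize = 8). */
uint8_t subhn_lane_u16(uint16_t a, uint16_t b)
{
    uint16_t diff = (uint16_t)(a - b); /* difference modulo 2^16 */
    return (uint8_t)(diff >> 8);       /* keep bits <15:8>, discard the low byte */
}

/* Eight wide lanes in, eight narrow lanes out, as in SUBHN Vd.8B,Vn.8H,Vm.8H. */
void subhn_u16(const uint16_t a[8], const uint16_t b[8], uint8_t out[8])
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t e = 0; e < 8; e++)
        out[e] = subhn_lane_u16(a[e], b[e]);
}
```

Because the low byte is simply dropped, `subhn_lane_u16(0x0100, 0x0001)` is 0x00 even though the difference 0x00FF is closer to 0x0100 than to 0x0000.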

int16x4_t vsubhn_s32 (int32x4_t a, int32x4_t b)Subtract returning high narrow

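Description

Subtract returning High Narrow. This instruction subtracts each vector element in the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are truncated. For rounded results, see RSUBHN.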

A64 Instruction

SUBHN Vd.4H,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

int32x2_t vsubhn_s64 (int64x2_t a, int64x2_t b)Subtract returning high narrow

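Description

Subtract returning High Narrow. This instruction subtracts each vector element in the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are truncated. For rounded results, see RSUBHN.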

A64 Instruction

SUBHN Vd.2S,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint8x8_t vsubhn_u16 (uint16x8_t a, uint16x8_t b)Subtract returning high narrow

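Description

Subtract returning High Narrow. This instruction subtracts each vector element in the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are truncated. For rounded results, see RSUBHN.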

A64 Instruction

SUBHN Vd.8B,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint16x4_t vsubhn_u32 (uint32x4_t a, uint32x4_t b)Subtract returning high narrow

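Description

Subtract returning High Narrow. This instruction subtracts each vector element in the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are truncated. For rounded results, see RSUBHN.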

A64 Instruction

SUBHN Vd.4H,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint32x2_t vsubhn_u64 (uint64x2_t a, uint64x2_t b)Subtract returning high narrow

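Description

Subtract returning High Narrow. This instruction subtracts each vector element in the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are truncated. For rounded results, see RSUBHN.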

A64 Instruction

SUBHN Vd.2S,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

int8x16_t vsubhn_high_s16 (int8x8_t r, int16x8_t a, int16x8_t b)Subtract returning high narrow

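Description

Subtract returning High Narrow. This instruction subtracts each vector element in the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are truncated. For rounded results, see RSUBHN. SUBHN2 writes the vector to the upper half of the destination register without affecting the other bits of the register.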

A64 Instruction

SUBHN2 Vd.16B,Vn.8H,Vm.8H

Argument Preparation

r → Vd.8B 

a → Vn.8H 

b → Vm.8H

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64
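
The `_high` form differs from the plain form only in where the narrowed lanes land: per the Argument Preparation above, `r` already occupies the lower half of the destination and the new results fill the upper half. The C sketch below models that layout for the 16-bit to 8-bit case; `subhn_high_u16` is a hypothetical name and the arrays stand in for vector registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of SUBHN2 Vd.16B,Vn.8H,Vm.8H: out[0..7] = r, out[8..15] = high bytes of a - b. */
void subhn_high_u16(const uint8_t r[8], const uint16_t a[8], const uint16_t b[8],
                    uint8_t out[16])
{
    assert(r != NULL && a != NULL && b != NULL && out != NULL);
    for (size_t e = 0; e < 8; e++) {
        out[e] = r[e];                                  /* lower half is preserved */
        out[8 + e] = (uint8_t)((uint16_t)(a[e] - b[e]) >> 8); /* upper half is written */
    }
}
```

A common pattern, assumed here rather than stated in the source, is to narrow one pair of wide vectors with the plain form and pass that result as `r` when narrowing a second pair, so that sixteen narrow lanes end up in one register.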

int16x8_t vsubhn_high_s32 (int16x4_t r, int32x4_t a, int32x4_t b)Subtract returning high narrow

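Description

Subtract returning High Narrow. This instruction subtracts each vector element in the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are truncated. For rounded results, see RSUBHN. SUBHN2 writes the vector to the upper half of the destination register without affecting the other bits of the register.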

A64 Instruction

SUBHN2 Vd.8H,Vn.4S,Vm.4S

Argument Preparation

r → Vd.4H 

a → Vn.4S 

b → Vm.4S

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64

int32x4_t vsubhn_high_s64 (int32x2_t r, int64x2_t a, int64x2_t b)Subtract returning high narrow

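Description

Subtract returning High Narrow. This instruction subtracts each vector element in the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are truncated. For rounded results, see RSUBHN. SUBHN2 writes the vector to the upper half of the destination register without affecting the other bits of the register.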

A64 Instruction

SUBHN2 Vd.4S,Vn.2D,Vm.2D

Argument Preparation

r → Vd.2S 

a → Vn.2D 

b → Vm.2D

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64

uint8x16_t vsubhn_high_u16 (uint8x8_t r, uint16x8_t a, uint16x8_t b)Subtract returning high narrow

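Description

Subtract returning High Narrow. This instruction subtracts each vector element in the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are truncated. For rounded results, see RSUBHN. SUBHN2 writes the vector to the upper half of the destination register without affecting the other bits of the register.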

A64 Instruction

SUBHN2 Vd.16B,Vn.8H,Vm.8H

Argument Preparation

r → Vd.8B 

a → Vn.8H 

b → Vm.8H

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64

uint16x8_t vsubhn_high_u32 (uint16x4_t r, uint32x4_t a, uint32x4_t b)Subtract returning high narrow

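Description

Subtract returning High Narrow. This instruction subtracts each vector element in the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are truncated. For rounded results, see RSUBHN. SUBHN2 writes the vector to the upper half of the destination register without affecting the other bits of the register.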

A64 Instruction

SUBHN2 Vd.8H,Vn.4S,Vm.4S

Argument Preparation

r → Vd.4H 

a → Vn.4S 

b → Vm.4S

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64

uint32x4_t vsubhn_high_u64 (uint32x2_t r, uint64x2_t a, uint64x2_t b)Subtract returning high narrow

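Description

Subtract returning High Narrow. This instruction subtracts each vector element in the second source SIMD&FP register from the corresponding vector element in the first source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are truncated. For rounded results, see RSUBHN. SUBHN2 writes the vector to the upper half of the destination register without affecting the other bits of the register.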

A64 Instruction

SUBHN2 Vd.4S,Vn.2D,Vm.2D

Argument Preparation

r → Vd.2S 

a → Vn.2D 

b → Vm.2D

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64

int8x8_t vrsubhn_s16 (int16x8_t a, int16x8_t b)Rounding subtract returning high narrow

Description

Rounding Subtract returning High Narrow. This instruction subtracts each vector element of the second source SIMD&FP register from the corresponding vector element of the first source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are rounded. For truncated results, see SUBHN.

A64 Instruction

RSUBHN Vd.8B,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64
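
The only difference between the rounding and truncating forms in the pseudocode above is `round_const`, which is `1 << (esize - 1)` when rounding and 0 otherwise. The C sketch below puts the two side by side for the 16-bit to 8-bit case. The function names are hypothetical, and the model operates on raw bit patterns with all arithmetic taken modulo 2^16, matching the `bits(2*esize)` type of `sum`.

```c
#include <assert.h>
#include <stdint.h>

/* Truncating narrow (round_const = 0), shown for comparison. */
uint8_t subhn_trunc_u16(uint16_t a, uint16_t b)
{
    return (uint8_t)((uint16_t)(a - b) >> 8);
}

/* Rounding narrow: add half of the discarded range before taking the high byte. */
uint8_t rsubhn_lane_u16(uint16_t a, uint16_t b)
{
    uint16_t diff = (uint16_t)(a - b);        /* difference modulo 2^16 */
    uint16_t sum = (uint16_t)(diff + 0x80u);  /* round_const = 1 << (8 - 1) */
    return (uint8_t)(sum >> 8);
}

/* Spot checks around the rounding boundary. */
void rsubhn_checks(void)
{
    assert(subhn_trunc_u16(0x0180, 0) == 1);
    assert(rsubhn_lane_u16(0x0180, 0) == 2);
    assert(rsubhn_lane_u16(0x017F, 0) == 1);
}
```

One consequence of the modular addition is that a difference of 0xFF80 or above rounds up past the top of the range and wraps, so `rsubhn_lane_u16(0xFFFF, 0)` is 0x00 while the truncating form gives 0xFF.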

int16x4_t vrsubhn_s32 (int32x4_t a, int32x4_t b)Rounding subtract returning high narrow

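Description

Rounding Subtract returning High Narrow. This instruction subtracts each vector element of the second source SIMD&FP register from the corresponding vector element of the first source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are rounded. For truncated results, see SUBHN.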

A64 Instruction

RSUBHN Vd.4H,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

int32x2_t vrsubhn_s64 (int64x2_t a, int64x2_t b)Rounding subtract returning high narrow

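Description

Rounding Subtract returning High Narrow. This instruction subtracts each vector element of the second source SIMD&FP register from the corresponding vector element of the first source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are rounded. For truncated results, see SUBHN.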

A64 Instruction

RSUBHN Vd.2S,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint8x8_t vrsubhn_u16 (uint16x8_t a, uint16x8_t b)Rounding subtract returning high narrow

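Description

Rounding Subtract returning High Narrow. This instruction subtracts each vector element of the second source SIMD&FP register from the corresponding vector element of the first source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are rounded. For truncated results, see SUBHN.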

A64 Instruction

RSUBHN Vd.8B,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint16x4_t vrsubhn_u32 (uint32x4_t a, uint32x4_t b)Rounding subtract returning high narrow

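Description

Rounding Subtract returning High Narrow. This instruction subtracts each vector element of the second source SIMD&FP register from the corresponding vector element of the first source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are rounded. For truncated results, see SUBHN.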

A64 Instruction

RSUBHN Vd.4H,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint32x2_t vrsubhn_u64 (uint64x2_t a, uint64x2_t b)Rounding subtract returning high narrow

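Description

Rounding Subtract returning High Narrow. This instruction subtracts each vector element of the second source SIMD&FP register from the corresponding vector element of the first source SIMD&FP register, places the most significant half of the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The results are rounded. For truncated results, see SUBHN.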

A64 Instruction

RSUBHN Vd.2S,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64
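
The only difference between the rounding and the truncating case in the pseudocode is round_const. The sketch below contrasts the two for a single 2D to 2S lane. The function names are hypothetical and the code is a scalar model under the assumption sub_op = true; it does not call any intrinsic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-lane model with round = true (esize = 32). */
static uint32_t rsubhn_u64_lane(uint64_t a, uint64_t b)
{
    uint64_t diff = a - b;                          /* wraps modulo 2^64 */
    uint64_t rounded = diff + (UINT64_C(1) << 31);  /* round_const */
    return (uint32_t)(rounded >> 32);               /* bits <63:32> */
}

/* Same lane with round = false, so round_const is 0. */
static uint32_t subhn_u64_lane(uint64_t a, uint64_t b)
{
    return (uint32_t)((a - b) >> 32);
}

/* Self-check on hand-computed values. */
static void rsubhn_u64_lane_check(void)
{
    /* Low half is exactly 0x80000000: rounding carries into the high half. */
    assert(rsubhn_u64_lane(UINT64_C(0x0000000180000000), 0) == 2);
    assert(subhn_u64_lane(UINT64_C(0x0000000180000000), 0) == 1);
    /* Low half is just below the midpoint: both forms agree. */
    assert(rsubhn_u64_lane(UINT64_C(0x000000017FFFFFFF), 0) == 1);
    /* 0 - 1 = 0xFFFFFFFFFFFFFFFF; adding round_const wraps to 0x7FFFFFFF. */
    assert(rsubhn_u64_lane(0, 1) == 0);
    assert(subhn_u64_lane(0, 1) == UINT32_C(0xFFFFFFFF));
}
```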

int8x16_t vrsubhn_high_s16 (int8x8_t r, int16x8_t a, int16x8_t b)Rounding subtract returning high narrow

Description

A64 Instruction

RSUBHN2 Vd.16B,Vn.8H,Vm.8H

Argument Preparation

r → Vd.8B 

a → Vn.8H 

b → Vm.8H

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64

int16x8_t vrsubhn_high_s32 (int16x4_t r, int32x4_t a, int32x4_t b)Rounding subtract returning high narrow

Description

A64 Instruction

RSUBHN2 Vd.8H,Vn.4S,Vm.4S

Argument Preparation

r → Vd.4H 

a → Vn.4S 

b → Vm.4S

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64

int32x4_t vrsubhn_high_s64 (int32x2_t r, int64x2_t a, int64x2_t b)Rounding subtract returning high narrow

Description

A64 Instruction

RSUBHN2 Vd.4S,Vn.2D,Vm.2D

Argument Preparation

r → Vd.2S 

a → Vn.2D 

b → Vm.2D

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64

uint8x16_t vrsubhn_high_u16 (uint8x8_t r, uint16x8_t a, uint16x8_t b)Rounding subtract returning high narrow

Description

A64 Instruction

RSUBHN2 Vd.16B,Vn.8H,Vm.8H

Argument Preparation

r → Vd.8B 

a → Vn.8H 

b → Vm.8H

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64
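
For the _high forms, the Argument Preparation places r in Vd.8B and the narrowed results go to the upper half of Vd.16B. A minimal scalar sketch of that layout follows. The function names and array interface are assumptions made for illustration, and the code works on unsigned bit patterns only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vrsubhn_high_u16 (not part of arm_neon.h). */
static void rsubhn_high_u16_model(const uint8_t r[8], const uint16_t a[8],
                                  const uint16_t b[8], uint8_t out[16])
{
    for (int e = 0; e < 8; e++) {
        /* Lower half of the result: r is passed through unchanged. */
        out[e] = r[e];
        /* Upper half: rounded high byte of the 16-bit difference. */
        uint16_t diff = (uint16_t)(a[e] - b[e]);
        uint16_t rounded = (uint16_t)(diff + 0x80u);
        out[8 + e] = (uint8_t)(rounded >> 8);
    }
}

/* Self-check on hand-computed lanes. */
static void rsubhn_high_u16_model_check(void)
{
    const uint8_t r[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    const uint16_t a[8] = {0x0280, 0x0280, 0x0280, 0x0280,
                           0x0280, 0x0280, 0x0280, 0x0000};
    const uint16_t b[8] = {0x0100, 0x0100, 0x0100, 0x0100,
                           0x0100, 0x0100, 0x0100, 0x0001};
    uint8_t out[16];

    rsubhn_high_u16_model(r, a, b, out);
    assert(out[0] == 1 && out[7] == 8); /* r occupies lanes 0..7 */
    assert(out[8] == 0x02);             /* 0x0180 + 0x80 = 0x0200 */
    assert(out[15] == 0x00);            /* 0xFFFF + 0x80 wraps to 0x007F */
}
```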

uint16x8_t vrsubhn_high_u32 (uint16x4_t r, uint32x4_t a, uint32x4_t b)Rounding subtract returning high narrow

Description

A64 Instruction

RSUBHN2 Vd.8H,Vn.4S,Vm.4S

Argument Preparation

r → Vd.4H 

a → Vn.4S 

b → Vm.4S

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64

uint32x4_t vrsubhn_high_u64 (uint32x2_t r, uint64x2_t a, uint64x2_t b)Rounding subtract returning high narrow

Description

A64 Instruction

RSUBHN2 Vd.4S,Vn.2D,Vm.2D

Argument Preparation

r → Vd.2S 

a → Vn.2D 

b → Vm.2D

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand1 = V[n];
bits(2*datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if round then 1 << (esize - 1) else 0;
bits(2*esize) element1;
bits(2*esize) element2;
bits(2*esize) sum;

for e = 0 to elements-1
    element1 = Elem[operand1, e, 2*esize];
    element2 = Elem[operand2, e, 2*esize];
    if sub_op then
        sum = element1 - element2;
    else
        sum = element1 + element2;
    sum = sum + round_const;
    Elem[result, e, esize] = sum<2*esize-1:esize>;

Vpart[d, part] = result;

Supported architectures

A64

uint8x8_t vceq_s8 (int8x8_t a, int8x8_t b)Compare bitwise equal

Description

Compare bitwise Equal (vector). This instruction compares each vector element from the first source SIMD&FP register with the corresponding vector element from the second source SIMD&FP register, and if the comparison is equal sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction

CMEQ Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64
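
The instruction form CMEQ Vd.8B,Vn.8B,Vm.8B takes two sources and produces a per-lane mask. The following scalar C sketch shows what that mask looks like for vceq_s8. The function names are hypothetical, arrays stand in for the vector types, and the lane counter is only one possible way to consume the mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vceq_s8 (not part of arm_neon.h). */
static void ceq_s8_model(const int8_t a[8], const int8_t b[8], uint8_t out[8])
{
    for (int e = 0; e < 8; e++) {
        /* Ones() for an 8-bit lane is 0xFF, Zeros() is 0x00. */
        out[e] = (a[e] == b[e]) ? 0xFF : 0x00;
    }
}

/* Counts equal lanes; relies on every mask lane being 0xFF or 0x00. */
static int count_equal_lanes(const uint8_t mask[8])
{
    int n = 0;
    for (int e = 0; e < 8; e++)
        n += mask[e] & 1;
    return n;
}

/* Self-check on hand-computed lanes. */
static void ceq_s8_model_check(void)
{
    const int8_t a[8] = {0, 1, -1, 127, -128, 5, 6, 7};
    const int8_t b[8] = {0, 2, -1, 127, 127, 5, 0, 7};
    uint8_t mask[8];

    ceq_s8_model(a, b, mask);
    assert(mask[0] == 0xFF && mask[1] == 0x00 && mask[2] == 0xFF);
    assert(mask[4] == 0x00 && mask[6] == 0x00);
    assert(count_equal_lanes(mask) == 5);
}
```

The result type is unsigned (uint8x8_t) even for signed inputs because the lanes carry a mask, not an arithmetic value.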

uint8x16_t vceqq_s8 (int8x16_t a, int8x16_t b)Compare bitwise equal

Description

A64 Instruction

CMEQ Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vceq_s16 (int16x4_t a, int16x4_t b)Compare bitwise equal

Description

A64 Instruction

CMEQ Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vceqq_s16 (int16x8_t a, int16x8_t b)Compare bitwise equal

Description

A64 Instruction

CMEQ Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vceq_s32 (int32x2_t a, int32x2_t b)Compare bitwise equal

Description

A64 Instruction

CMEQ Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vceqq_s32 (int32x4_t a, int32x4_t b)Compare bitwise equal

Description

A64 Instruction

CMEQ Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vceq_u8 (uint8x8_t a, uint8x8_t b)Compare bitwise equal

Description

A64 Instruction

CMEQ Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vceqq_u8 (uint8x16_t a, uint8x16_t b)Compare bitwise equal

Description

A64 Instruction

CMEQ Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vceq_u16 (uint16x4_t a, uint16x4_t b)Compare bitwise equal

Description

A64 Instruction

CMEQ Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vceqq_u16 (uint16x8_t a, uint16x8_t b)Compare bitwise equal

Description

A64 Instruction

CMEQ Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vceq_u32 (uint32x2_t a, uint32x2_t b)Compare bitwise equal

Description

A64 Instruction

CMEQ Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vceqq_u32 (uint32x4_t a, uint32x4_t b)Compare bitwise equal

Description

A64 Instruction

CMEQ Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vceq_f32 (float32x2_t a, float32x2_t b)Floating-point compare equal

Description

Floating-point Compare Equal (vector). This instruction compares each floating-point value from the first source SIMD&FP register with the corresponding floating-point value from the second source SIMD&FP register, and if the comparison is equal sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction

FCMEQ Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64
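
Floating-point equality differs from a bitwise comparison in two cases: a NaN never compares equal, and +0.0 compares equal to -0.0. The scalar C sketch below illustrates this for the 2S form. It assumes the C == operator follows IEEE 754 (no fast-math options) and does not model any FPCR settings; the function names are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Hypothetical scalar model of vceq_f32 (not part of arm_neon.h). */
static void fceq_f32_model(const float a[2], const float b[2], uint32_t out[2])
{
    for (int e = 0; e < 2; e++) {
        /* All-ones mask when the lanes compare equal, zero otherwise. */
        out[e] = (a[e] == b[e]) ? UINT32_MAX : 0u;
    }
}

/* Self-check covering signed zeros, NaN and infinity. */
static void fceq_f32_model_check(void)
{
    const float a[2] = {0.0f, NAN};
    const float b[2] = {-0.0f, NAN};
    const float c[2] = {1.5f, INFINITY};
    uint32_t mask[2];

    fceq_f32_model(a, b, mask);
    assert(mask[0] == UINT32_MAX); /* +0.0 == -0.0 despite different bits */
    assert(mask[1] == 0u);         /* NaN is unequal to everything */

    fceq_f32_model(c, c, mask);
    assert(mask[0] == UINT32_MAX && mask[1] == UINT32_MAX);
}
```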

uint32x4_t vceqq_f32 (float32x4_t a, float32x4_t b)Floating-point compare equal

Description

A64 Instruction

FCMEQ Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vceq_p8 (poly8x8_t a, poly8x8_t b)Compare bitwise equal

Description

A64 Instruction

CMEQ Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vceqq_p8 (poly8x16_t a, poly8x16_t b)Compare bitwise equal

Description

A64 Instruction

CMEQ Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vceq_s64 (int64x1_t a, int64x1_t b)Compare bitwise equal

Description

A64 Instruction

CMEQ Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vceqq_s64 (int64x2_t a, int64x2_t b)Compare bitwise equal

Description

A64 Instruction

CMEQ Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x1_t vceq_u64 (uint64x1_t a, uint64x1_t b)Compare bitwise equal

Description

A64 Instruction

CMEQ Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vceqq_u64 (uint64x2_t a, uint64x2_t b)Compare bitwise equal

Description

A64 Instruction

CMEQ Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x1_t vceq_p64 (poly64x1_t a, poly64x1_t b)Compare bitwise equal

Description

A64 Instruction

CMEQ Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A32/A64

uint64x2_t vceqq_p64 (poly64x2_t a, poly64x2_t b)Compare bitwise equal

Description

A64 Instruction

CMEQ Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A32/A64

uint64x1_t vceq_f64 (float64x1_t a, float64x1_t b)Floating-point compare equal

Description

A64 Instruction

FCMEQ Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vceqq_f64 (float64x2_t a, float64x2_t b)Floating-point compare equal

Description

A64 Instruction

FCMEQ Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vceqd_s64 (int64_t a, int64_t b)Compare bitwise equal

Description

A64 Instruction

CMEQ Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64
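
The scalar forms return a mask in the same style as the vector forms: all 64 bits set on equality, not the C boolean value 1. A minimal sketch of that convention, with a branch-free select as one possible consumer, is shown below. Both function names are hypothetical and nothing here calls the intrinsic itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vceqd_s64 (not part of arm_neon.h). */
static uint64_t ceqd_s64_model(int64_t a, int64_t b)
{
    /* Ones() for a 64-bit element is UINT64_MAX, Zeros() is 0. */
    return (a == b) ? UINT64_MAX : 0;
}

/* Picks x where the mask bits are set and y where they are clear. */
static uint64_t select_u64(uint64_t mask, uint64_t x, uint64_t y)
{
    return (mask & x) | (~mask & y);
}

/* Self-check on hand-computed values. */
static void ceqd_s64_model_check(void)
{
    assert(ceqd_s64_model(5, 5) == UINT64_MAX);
    assert(ceqd_s64_model(5, -5) == 0);
    assert(select_u64(ceqd_s64_model(7, 7), 10, 20) == 10);
    assert(select_u64(ceqd_s64_model(7, 8), 10, 20) == 20);
}
```

Code that tests the result with `== 1` will therefore never match; compare against zero or use the value as a mask instead.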

uint64_t vceqd_u64 (uint64_t a, uint64_t b)Compare bitwise equal

Description

A64 Instruction

CMEQ Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32_t vceqs_f32 (float32_t a, float32_t b)Floating-point compare equal

Description

A64 Instruction

FCMEQ Sd,Sn,Sm

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vceqd_f64 (float64_t a, float64_t b)Floating-point compare equal

Description

A64 Instruction

FCMEQ Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint8x8_t vceqz_s8 (int8x8_t a)Compare bitwise equal to zero

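Description

Compare bitwise Equal to zero (vector). This instruction reads each vector element in the source SIMD&FP register and if the value is equal to zero sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.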

A64 Instruction

CMEQ Vd.8B,Vn.8B,#0

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64
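
The CompareOp_EQ branch of the Operation above can be sketched in portable C as follows. This is a scalar model of vceqz_s8 with hypothetical function names and arrays in place of vector types. The helper that finds the first zero lane is only an illustration of how such a mask might be used.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vceqz_s8 (not part of arm_neon.h). */
static void ceqz_s8_model(const int8_t a[8], uint8_t out[8])
{
    for (int e = 0; e < 8; e++) {
        /* element == 0 selects Ones() (0xFF), anything else Zeros(). */
        out[e] = (a[e] == 0) ? 0xFF : 0x00;
    }
}

/* Returns the index of the first set mask lane, or -1 if none is set. */
static int first_set_lane(const uint8_t mask[8])
{
    for (int e = 0; e < 8; e++) {
        if (mask[e] != 0)
            return e;
    }
    return -1;
}

/* Self-check on hand-computed lanes. */
static void ceqz_s8_model_check(void)
{
    const int8_t a[8] = {3, 4, 0, -1, 127, -128, 0, 2};
    const int8_t b[8] = {1, 1, 1, 1, 1, 1, 1, 1};
    uint8_t mask[8];

    ceqz_s8_model(a, mask);
    assert(mask[2] == 0xFF && mask[6] == 0xFF);
    assert(mask[0] == 0x00 && mask[3] == 0x00 && mask[5] == 0x00);
    assert(first_set_lane(mask) == 2);

    ceqz_s8_model(b, mask);
    assert(first_set_lane(mask) == -1);
}
```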

uint8x16_t vceqzq_s8 (int8x16_t a)Compare bitwise equal to zero

Description

A64 Instruction

CMEQ Vd.16B,Vn.16B,#0

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint16x4_t vceqz_s16 (int16x4_t a)Compare bitwise equal to zero

Description

A64 Instruction

CMEQ Vd.4H,Vn.4H,#0

Argument Preparation

a → Vn.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint16x8_t vceqzq_s16 (int16x8_t a)Compare bitwise equal to zero

Description

A64 Instruction

CMEQ Vd.8H,Vn.8H,#0

Argument Preparation

a → Vn.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x2_t vceqz_s32 (int32x2_t a)Compare bitwise equal to zero

Description

A64 Instruction

CMEQ Vd.2S,Vn.2S,#0

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x4_t vceqzq_s32 (int32x4_t a)Compare bitwise equal to zero

Description

A64 Instruction

CMEQ Vd.4S,Vn.4S,#0

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint8x8_t vceqz_u8 (uint8x8_t a)Compare bitwise equal to zero

Description

A64 Instruction

CMEQ Vd.8B,Vn.8B,#0

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint8x16_t vceqzq_u8 (uint8x16_t a)Compare bitwise equal to zero

Description

A64 Instruction

CMEQ Vd.16B,Vn.16B,#0

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint16x4_t vceqz_u16 (uint16x4_t a)Compare bitwise equal to zero

Description

A64 Instruction

CMEQ Vd.4H,Vn.4H,#0

Argument Preparation

a → Vn.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint16x8_t vceqzq_u16 (uint16x8_t a)Compare bitwise equal to zero

Description

A64 Instruction

CMEQ Vd.8H,Vn.8H,#0

Argument Preparation

a → Vn.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x2_t vceqz_u32 (uint32x2_t a)Compare bitwise equal to zero

Description

A64 Instruction

CMEQ Vd.2S,Vn.2S,#0

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x4_t vceqzq_u32 (uint32x4_t a)Compare bitwise equal to zero

Description

A64 Instruction

CMEQ Vd.4S,Vn.4S,#0

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x2_t vceqz_f32 (float32x2_t a)Floating-point compare equal to zero

Description

A64 Instruction

FCMEQ Vd.2S,Vn.2S,#0

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x4_t vceqzq_f32 (float32x4_t a)Floating-point compare equal to zero

Description

A64 Instruction

FCMEQ Vd.4S,Vn.4S,#0

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint8x8_t vceqz_p8 (poly8x8_t a)Compare bitwise equal to zero

Description

A64 Instruction

CMEQ Vd.8B,Vn.8B,#0

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint8x16_t vceqzq_p8 (poly8x16_t a)Compare bitwise equal to zero

Description

A64 Instruction

CMEQ Vd.16B,Vn.16B,#0

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x1_t vceqz_s64 (int64x1_t a)Compare bitwise equal to zero

Description

A64 Instruction

CMEQ Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vceqzq_s64 (int64x2_t a)Compare bitwise equal to zero

Description

A64 Instruction

CMEQ Vd.2D,Vn.2D,#0

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x1_t vceqz_u64 (uint64x1_t a)Compare bitwise equal to zero

Description

A64 Instruction

CMEQ Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vceqzq_u64 (uint64x2_t a)Compare bitwise equal to zero

Description

A64 Instruction

CMEQ Vd.2D,Vn.2D,#0

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x1_t vceqz_p64 (poly64x1_t a)Compare bitwise equal to zero

Description

A64 Instruction

CMEQ Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A32/A64

uint64x2_t vceqzq_p64 (poly64x2_t a)Compare bitwise equal to zero

Description

A64 Instruction

CMEQ Vd.2D,Vn.2D,#0

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A32/A64

uint64x1_t vceqz_f64 (float64x1_t a)Floating-point compare equal to zero

Description

A64 Instruction

FCMEQ Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vceqzq_f64 (float64x2_t a)Floating-point compare equal to zero

Description

A64 Instruction

FCMEQ Vd.2D,Vn.2D,#0

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vceqzd_s64 (int64_t a)Compare bitwise equal to zero

Description

A64 Instruction

CMEQ Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64
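
The scalar form returns a full 64-bit mask, which makes it usable for branch-free selection. The following is a small sketch of that idea; `ceqzd_s64` and `nonzero_or` are hypothetical wrapper names introduced here, and the non-AArch64 branch is an assumed model of the same result.

```c
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical wrapper: all ones when a == 0, all zeros otherwise. */
static uint64_t ceqzd_s64(int64_t a)
{
#if defined(__aarch64__)
    return vceqzd_s64(a);                 /* CMEQ Dd,Dn,#0 */
#else
    return (a == 0) ? UINT64_MAX : 0u;    /* assumed portable model */
#endif
}

/* Branch-free select: returns fallback when value is zero, else value.
   The mask keeps exactly one of the two operands. */
static int64_t nonzero_or(int64_t value, int64_t fallback)
{
    uint64_t m = ceqzd_s64(value);
    return (int64_t)(((uint64_t)fallback & m) | ((uint64_t)value & ~m));
}
```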

uint64_t vceqzd_u64 (uint64_t a)Compare bitwise equal to zero

Description

A64 Instruction

CMEQ Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32_t vceqzs_f32 (float32_t a)Floating-point compare equal to zero

Description

A64 Instruction

FCMEQ Sd,Sn,#0

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vceqzd_f64 (float64_t a)Floating-point compare equal to zero

Description

A64 Instruction

FCMEQ Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint8x8_t vcge_s8 (int8x8_t a, int8x8_t b)Compare signed greater than or equal

Description

Compare signed Greater than or Equal (vector). This instruction compares each vector element in the first source SIMD&FP register with the corresponding vector element in the second source SIMD&FP register and if the first signed integer value is greater than or equal to the second signed integer value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction

CMGE Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64
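
As an illustration of how the comparison mask is typically consumed, the sketch below builds a lane-wise signed maximum from `vcge_s8` and a bitwise select. It is a minimal example, not a recommended implementation (`vmax_s8` does this directly); `cge_s8x8` and `max_s8x8` are hypothetical helper names, and the scalar branch is an assumed model for builds without NEON.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: mask[e] = 0xFF where a[e] >= b[e] (signed), else 0x00. */
static void cge_s8x8(const int8_t a[8], const int8_t b[8], uint8_t mask[8])
{
#if defined(__ARM_NEON)
    vst1_u8(mask, vcge_s8(vld1_s8(a), vld1_s8(b)));
#else
    /* Assumed portable model of the two-operand signed comparison. */
    for (size_t e = 0; e < 8; ++e)
        mask[e] = (a[e] >= b[e]) ? 0xFF : 0x00;
#endif
}

/* Hypothetical helper: lane-wise signed maximum built from the mask. */
static void max_s8x8(const int8_t a[8], const int8_t b[8], int8_t out[8])
{
#if defined(__ARM_NEON)
    int8x8_t va = vld1_s8(a);
    int8x8_t vb = vld1_s8(b);
    uint8x8_t ge = vcge_s8(va, vb);       /* 0xFF where a >= b */
    vst1_s8(out, vbsl_s8(ge, va, vb));    /* a where mask is set, b elsewhere */
#else
    uint8_t ge[8];

    cge_s8x8(a, b, ge);
    for (size_t e = 0; e < 8; ++e)
        out[e] = ge[e] ? a[e] : b[e];
#endif
}
```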

uint8x16_t vcgeq_s8 (int8x16_t a, int8x16_t b)Compare signed greater than or equal

Description

A64 Instruction

CMGE Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vcge_s16 (int16x4_t a, int16x4_t b)Compare signed greater than or equal

Description

A64 Instruction

CMGE Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vcgeq_s16 (int16x8_t a, int16x8_t b)Compare signed greater than or equal

Description

A64 Instruction

CMGE Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vcge_s32 (int32x2_t a, int32x2_t b)Compare signed greater than or equal

Description

A64 Instruction

CMGE Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vcgeq_s32 (int32x4_t a, int32x4_t b)Compare signed greater than or equal

Description

A64 Instruction

CMGE Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vcge_u8 (uint8x8_t a, uint8x8_t b)Compare unsigned higher or same

Description

Compare unsigned Higher or Same (vector). This instruction compares each vector element in the first source SIMD&FP register with the corresponding vector element in the second source SIMD&FP register and if the first unsigned integer value is greater than or equal to the second unsigned integer value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction

CMHS Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64
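
A common use of the unsigned comparison is a range check, since two "higher or same" tests bracket an interval. The sketch below flags ASCII digits in eight bytes; it is an illustrative example only, `is_digit_mask` is a hypothetical helper name, and the scalar branch is an assumed model for builds without NEON.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: mask[e] = 0xFF where '0' <= in[e] <= '9', else 0x00. */
static void is_digit_mask(const uint8_t in[8], uint8_t mask[8])
{
#if defined(__ARM_NEON)
    uint8x8_t x = vld1_u8(in);
    uint8x8_t ge_lo = vcge_u8(x, vdup_n_u8('0'));   /* x >= '0' */
    uint8x8_t le_hi = vcge_u8(vdup_n_u8('9'), x);   /* '9' >= x */
    vst1_u8(mask, vand_u8(ge_lo, le_hi));           /* both tests must pass */
#else
    /* Assumed portable model of the two unsigned comparisons. */
    for (size_t e = 0; e < 8; ++e)
        mask[e] = (in[e] >= '0' && '9' >= in[e]) ? 0xFF : 0x00;
#endif
}
```

The unsigned form matters here: a byte such as 0xB0 is 176 when read as unsigned and therefore fails the upper bound, whereas a signed comparison would read it as a negative value.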

uint8x16_t vcgeq_u8 (uint8x16_t a, uint8x16_t b)Compare unsigned higher or same

Description

A64 Instruction

CMHS Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vcge_u16 (uint16x4_t a, uint16x4_t b)Compare unsigned higher or same

Description

A64 Instruction

CMHS Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vcgeq_u16 (uint16x8_t a, uint16x8_t b)Compare unsigned higher or same

Description

A64 Instruction

CMHS Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vcge_u32 (uint32x2_t a, uint32x2_t b)Compare unsigned higher or same

Description

A64 Instruction

CMHS Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vcgeq_u32 (uint32x4_t a, uint32x4_t b)Compare unsigned higher or same

Description

A64 Instruction

CMHS Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vcge_f32 (float32x2_t a, float32x2_t b)Floating-point compare greater than or equal

Description

Floating-point Compare Greater than or Equal (vector). This instruction reads each floating-point value in the first source SIMD&FP register and if the value is greater than or equal to the corresponding floating-point value in the second source SIMD&FP register sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction

FCMGE Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vcgeq_f32 (float32x4_t a, float32x4_t b)Floating-point compare greater than or equal

Description

A64 Instruction

FCMGE Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64
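
For the floating-point forms, a lane that holds a NaN in either operand fails the test, and negative zero compares equal to positive zero. The sketch below illustrates this, assuming `__ARM_NEON` marks NEON availability. The name `cge_f32x4` and the scalar fallback, which relies on C's `>=` being false for unordered operands, are illustrative additions.

```c
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: lane-wise a >= b on four float lanes.
   Lanes with a NaN operand yield 0 because the comparison is unordered. */
static void cge_f32x4(const float a[4], const float b[4], uint32_t mask[4])
{
#if defined(__ARM_NEON)
    vst1q_u32(mask, vcgeq_f32(vld1q_f32(a), vld1q_f32(b)));
#else
    /* Scalar model, for hosts without NEON. */
    for (int e = 0; e < 4; ++e)
        mask[e] = (a[e] >= b[e]) ? UINT32_MAX : 0u;
#endif
}
```

Because the result type is `uint32x4_t`, the mask can be combined with integer bitwise operations without a reinterpret.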

uint64x1_t vcge_s64 (int64x1_t a, int64x1_t b)Compare signed greater than or equal

Description

A64 Instruction

CMGE Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vcgeq_s64 (int64x2_t a, int64x2_t b)Compare signed greater than or equal

Description

A64 Instruction

CMGE Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x1_t vcge_u64 (uint64x1_t a, uint64x1_t b)Compare unsigned higher or same

Description

A64 Instruction

CMHS Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vcgeq_u64 (uint64x2_t a, uint64x2_t b)Compare unsigned higher or same

Description

A64 Instruction

CMHS Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64
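
The signed (`CMGE`) and unsigned (`CMHS`) 64-bit forms differ only in how each lane's bits are interpreted, so the same bit pattern can give different masks. A minimal sketch follows, assuming `__aarch64__` identifies an A64 target, since these forms are A64-only. The helper names and the scalar fallback are illustrative additions.

```c
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: signed a >= b on two 64-bit lanes. */
static void cge_s64x2(const int64_t a[2], const int64_t b[2], uint64_t mask[2])
{
#if defined(__aarch64__)
    vst1q_u64(mask, vcgeq_s64(vld1q_s64(a), vld1q_s64(b)));
#else
    for (int e = 0; e < 2; ++e)
        mask[e] = (a[e] >= b[e]) ? UINT64_MAX : 0u;
#endif
}

/* Hypothetical helper: unsigned a >= b ("higher or same") on two 64-bit lanes. */
static void cge_u64x2(const uint64_t a[2], const uint64_t b[2], uint64_t mask[2])
{
#if defined(__aarch64__)
    vst1q_u64(mask, vcgeq_u64(vld1q_u64(a), vld1q_u64(b)));
#else
    for (int e = 0; e < 2; ++e)
        mask[e] = (a[e] >= b[e]) ? UINT64_MAX : 0u;
#endif
}
```

With lane 0 holding all-ones bits, the signed form sees -1 (less than 0) while the unsigned form sees the largest 64-bit value (higher than 0).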

uint64x1_t vcge_f64 (float64x1_t a, float64x1_t b)Floating-point compare greater than or equal

Description

A64 Instruction

FCMGE Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vcgeq_f64 (float64x2_t a, float64x2_t b)Floating-point compare greater than or equal

Description

A64 Instruction

FCMGE Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vcged_s64 (int64_t a, int64_t b)Compare signed greater than or equal

Description

A64 Instruction

CMGE Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vcged_u64 (uint64_t a, uint64_t b)Compare unsigned higher or same

Description

A64 Instruction

CMHS Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64
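
The scalar `d` forms return a 64-bit mask (all ones or zero) rather than a C boolean, which makes them suitable for branch-free selection. A minimal sketch, assuming `__aarch64__` identifies an A64 target. The names `cge_u64` and `umax_u64` and the scalar fallback are illustrative additions.

```c
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: returns UINT64_MAX when a >= b (unsigned), else 0. */
static uint64_t cge_u64(uint64_t a, uint64_t b)
{
#if defined(__aarch64__)
    return vcged_u64(a, b);
#else
    return (a >= b) ? UINT64_MAX : 0u;
#endif
}

/* Hypothetical helper: branch-free unsigned maximum using the mask. */
static uint64_t umax_u64(uint64_t a, uint64_t b)
{
    uint64_t m = cge_u64(a, b);
    return (a & m) | (b & ~m);
}
```

Code that needs a 0/1 value can shift the mask right by 63 or test it against zero.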

uint32_t vcges_f32 (float32_t a, float32_t b)Floating-point compare greater than or equal

Description

A64 Instruction

FCMGE Sd,Sn,Sm

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vcged_f64 (float64_t a, float64_t b)Floating-point compare greater than or equal

Description

A64 Instruction

FCMGE Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64
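
For the scalar floating-point forms the width of the returned mask follows the operand width: `vcges_f32` yields a 32-bit mask and `vcged_f64` a 64-bit mask. A minimal sketch, assuming `__aarch64__` identifies an A64 target. The wrapper names and the scalar fallback are illustrative additions.

```c
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical wrapper: 32-bit mask for a single-precision compare. */
static uint32_t cge_f32(float a, float b)
{
#if defined(__aarch64__)
    return vcges_f32(a, b);
#else
    return (a >= b) ? UINT32_MAX : 0u;
#endif
}

/* Hypothetical wrapper: 64-bit mask for a double-precision compare. */
static uint64_t cge_f64(double a, double b)
{
#if defined(__aarch64__)
    return vcged_f64(a, b);
#else
    return (a >= b) ? UINT64_MAX : 0u;
#endif
}
```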

uint8x8_t vcgez_s8 (int8x8_t a)Compare signed greater than or equal to zero

Description

A64 Instruction

CMGE Vd.8B,Vn.8B,#0

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint8x16_t vcgezq_s8 (int8x16_t a)Compare signed greater than or equal to zero

Description

A64 Instruction

CMGE Vd.16B,Vn.16B,#0

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint16x4_t vcgez_s16 (int16x4_t a)Compare signed greater than or equal to zero

Description

A64 Instruction

CMGE Vd.4H,Vn.4H,#0

Argument Preparation

a → Vn.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint16x8_t vcgezq_s16 (int16x8_t a)Compare signed greater than or equal to zero

Description

A64 Instruction

CMGE Vd.8H,Vn.8H,#0

Argument Preparation

a → Vn.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x2_t vcgez_s32 (int32x2_t a)Compare signed greater than or equal to zero

Description

A64 Instruction

CMGE Vd.2S,Vn.2S,#0

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x4_t vcgezq_s32 (int32x4_t a)Compare signed greater than or equal to zero

Description

A64 Instruction

CMGE Vd.4S,Vn.4S,#0

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64
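
A common use of the compare-against-zero forms is to clear negative lanes by ANDing the input with the mask. A minimal sketch, assuming `__aarch64__` identifies an A64 target, since `vcgezq_s32` is A64-only. The name `relu_s32x4` and the scalar fallback are illustrative additions.

```c
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: keep non-negative lanes, replace negative lanes with 0. */
static void relu_s32x4(const int32_t x[4], int32_t out[4])
{
#if defined(__aarch64__)
    int32x4_t v = vld1q_s32(x);
    uint32x4_t keep = vcgezq_s32(v);                 /* all ones where x >= 0 */
    uint32x4_t bits = vandq_u32(vreinterpretq_u32_s32(v), keep);
    vst1q_s32(out, vreinterpretq_s32_u32(bits));
#else
    /* Scalar model, for hosts without A64 NEON. */
    for (int e = 0; e < 4; ++e) {
        uint32_t keep = (x[e] >= 0) ? UINT32_MAX : 0u;
        out[e] = (int32_t)((uint32_t)x[e] & keep);
    }
#endif
}
```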

uint64x1_t vcgez_s64 (int64x1_t a)Compare signed greater than or equal to zero

Description

A64 Instruction

CMGE Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vcgezq_s64 (int64x2_t a)Compare signed greater than or equal to zero

Description

A64 Instruction

CMGE Vd.2D,Vn.2D,#0

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x2_t vcgez_f32 (float32x2_t a)Floating-point compare greater than or equal to zero

Description

A64 Instruction

FCMGE Vd.2S,Vn.2S,#0

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x4_t vcgezq_f32 (float32x4_t a)Floating-point compare greater than or equal to zero

Description

A64 Instruction

FCMGE Vd.4S,Vn.4S,#0

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64
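
In the floating-point compare-against-zero forms, negative zero passes the test and NaN fails it. The sketch below counts passing lanes by reducing the mask. It assumes `__aarch64__` identifies an A64 target. The name `count_nonneg_f32x4` and the scalar fallback are illustrative additions.

```c
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: number of lanes with x >= 0.0 (NaN lanes do not count). */
static int count_nonneg_f32x4(const float x[4])
{
#if defined(__aarch64__)
    uint32x4_t mask = vcgezq_f32(vld1q_f32(x));
    /* Shift each all-ones lane down to 1, then add across the vector. */
    return (int)vaddvq_u32(vshrq_n_u32(mask, 31));
#else
    int count = 0;
    for (int e = 0; e < 4; ++e)
        count += (x[e] >= 0.0f) ? 1 : 0;
    return count;
#endif
}
```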

uint64x1_t vcgez_f64 (float64x1_t a)Floating-point compare greater than or equal to zero

Description

A64 Instruction

FCMGE Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vcgezq_f64 (float64x2_t a)Floating-point compare greater than or equal to zero

Description

A64 Instruction

FCMGE Vd.2D,Vn.2D,#0

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vcgezd_s64 (int64_t a)Compare signed greater than or equal to zero

Description

A64 Instruction

CMGE Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32_t vcgezs_f32 (float32_t a)Floating-point compare greater than or equal to zero

Description

A64 Instruction

FCMGE Sd,Sn,#0

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vcgezd_f64 (float64_t a)Floating-point compare greater than or equal to zero

Description

A64 Instruction

FCMGE Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint8x8_t vcle_s8 (int8x8_t a, int8x8_t b)Compare signed greater than or equal

Description

A64 Instruction

CMGE Vd.8B,Vm.8B,Vn.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vcleq_s8 (int8x16_t a, int8x16_t b)Compare signed greater than or equal

Description

A64 Instruction

CMGE Vd.16B,Vm.16B,Vn.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vcle_s16 (int16x4_t a, int16x4_t b)Compare signed greater than or equal

Description

A64 Instruction

CMGE Vd.4H,Vm.4H,Vn.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vcleq_s16 (int16x8_t a, int16x8_t b)Compare signed greater than or equal

Description

A64 Instruction

CMGE Vd.8H,Vm.8H,Vn.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vcle_s32 (int32x2_t a, int32x2_t b)Compare signed greater than or equal

Description

A64 Instruction

CMGE Vd.2S,Vm.2S,Vn.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vcleq_s32 (int32x4_t a, int32x4_t b)Compare signed greater than or equal

Description

A64 Instruction

CMGE Vd.4S,Vm.4S,Vn.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64
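
The `vcle` intrinsics have no dedicated instruction: `a <= b` is evaluated as `b >= a`, which is why the A64 instruction lists the source registers as `Vm, Vn`. The sketch below checks that equivalence, assuming `__ARM_NEON` marks NEON availability. The helper names and the scalar fallback are illustrative additions.

```c
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: lane-wise a <= b via the vcle intrinsic. */
static void cle_s32x4(const int32_t a[4], const int32_t b[4], uint32_t mask[4])
{
#if defined(__ARM_NEON)
    vst1q_u32(mask, vcleq_s32(vld1q_s32(a), vld1q_s32(b)));
#else
    for (int e = 0; e < 4; ++e)
        mask[e] = (a[e] <= b[e]) ? UINT32_MAX : 0u;
#endif
}

/* Hypothetical helper: the same test written as b >= a with operands swapped. */
static void cge_swapped_s32x4(const int32_t a[4], const int32_t b[4], uint32_t mask[4])
{
#if defined(__ARM_NEON)
    vst1q_u32(mask, vcgeq_s32(vld1q_s32(b), vld1q_s32(a)));
#else
    for (int e = 0; e < 4; ++e)
        mask[e] = (b[e] >= a[e]) ? UINT32_MAX : 0u;
#endif
}
```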

uint8x8_t vcle_u8 (uint8x8_t a, uint8x8_t b)Compare unsigned higher or same

Description

A64 Instruction

CMHS Vd.8B,Vm.8B,Vn.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vcleq_u8 (uint8x16_t a, uint8x16_t b)Compare unsigned higher or same

Description

A64 Instruction

CMHS Vd.16B,Vm.16B,Vn.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vcle_u16 (uint16x4_t a, uint16x4_t b)Compare unsigned higher or same

Description

A64 Instruction

CMHS Vd.4H,Vm.4H,Vn.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vcleq_u16 (uint16x8_t a, uint16x8_t b)Compare unsigned higher or same

Description

A64 Instruction

CMHS Vd.8H,Vm.8H,Vn.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vcle_u32 (uint32x2_t a, uint32x2_t b)Compare unsigned higher or same

Description

A64 Instruction

CMHS Vd.2S,Vm.2S,Vn.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vcleq_u32 (uint32x4_t a, uint32x4_t b)Compare unsigned higher or same

Description

A64 Instruction

CMHS Vd.4S,Vm.4S,Vn.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64
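
Combining a `vcge` mask with a `vcle` mask gives an inclusive range test per lane. A minimal sketch, assuming `__ARM_NEON` marks NEON availability. The name `in_range_u32x4`, its `lo`/`hi` parameters, and the scalar fallback are illustrative additions.

```c
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: mask lane is all ones when lo <= x <= hi (unsigned). */
static void in_range_u32x4(const uint32_t x[4], uint32_t lo, uint32_t hi,
                           uint32_t mask[4])
{
#if defined(__ARM_NEON)
    uint32x4_t v = vld1q_u32(x);
    uint32x4_t ge_lo = vcgeq_u32(v, vdupq_n_u32(lo)); /* x >= lo */
    uint32x4_t le_hi = vcleq_u32(v, vdupq_n_u32(hi)); /* x <= hi */
    vst1q_u32(mask, vandq_u32(ge_lo, le_hi));
#else
    /* Scalar model, for hosts without NEON. */
    for (int e = 0; e < 4; ++e)
        mask[e] = (x[e] >= lo && x[e] <= hi) ? UINT32_MAX : 0u;
#endif
}
```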

uint32x2_t vcle_f32 (float32x2_t a, float32x2_t b)Floating-point compare greater than or equal

Description

A64 Instruction

FCMGE Vd.2S,Vm.2S,Vn.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vcleq_f32 (float32x4_t a, float32x4_t b)Floating-point compare greater than or equal

Description

A64 Instruction

FCMGE Vd.4S,Vm.4S,Vn.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64
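
The floating-point forms differ from the integer ones mainly in how unordered inputs behave. The sketch below is a portable scalar model of a four-lane `a <= b` compare, evaluated as `b >= a` to mirror the swapped operands in `FCMGE Vd.4S,Vm.4S,Vn.4S`. The function name `model_cle_f32x4` is hypothetical, and the model ignores FPCR settings such as flush-to-zero; on Arm targets the `vcleq_f32` intrinsic would normally be used instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model: out[e] is all ones when b[e] >= a[e].
 * If either input lane is NaN the comparison is false, so the lane is zero. */
void model_cle_f32x4(const float a[4], const float b[4], uint32_t out[4])
{
    for (size_t e = 0; e < 4; ++e)
        out[e] = (b[e] >= a[e]) ? UINT32_MAX : 0u;
}
```

Note that a NaN lane yields zero for both `a <= b` and `a > b`, so negating one mask does not in general give the other.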

uint64x1_t vcle_s64 (int64x1_t a, int64x1_t b)Compare signed greater than or equal

Description

A64 Instruction

CMGE Dd,Dm,Dn

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vcleq_s64 (int64x2_t a, int64x2_t b)Compare signed greater than or equal

Description

A64 Instruction

CMGE Vd.2D,Vm.2D,Vn.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x1_t vcle_u64 (uint64x1_t a, uint64x1_t b)Compare unsigned higher or same

Description

A64 Instruction

CMHS Dd,Dm,Dn

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vcleq_u64 (uint64x2_t a, uint64x2_t b)Compare unsigned higher or same

Description

A64 Instruction

CMHS Vd.2D,Vm.2D,Vn.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x1_t vcle_f64 (float64x1_t a, float64x1_t b)Floating-point compare greater than or equal

Description

A64 Instruction

FCMGE Dd,Dm,Dn

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vcleq_f64 (float64x2_t a, float64x2_t b)Floating-point compare greater than or equal

Description

A64 Instruction

FCMGE Vd.2D,Vm.2D,Vn.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vcled_s64 (int64_t a, int64_t b)Compare signed greater than or equal

Description

A64 Instruction

CMGE Dd,Dm,Dn

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vcled_u64 (uint64_t a, uint64_t b)Compare unsigned higher or same

Description

A64 Instruction

CMHS Dd,Dm,Dn

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64
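
The scalar forms return a single 64-bit mask rather than a vector. The sketch below is a portable model of the two integer variants, written to show that the same bit pattern can give different results under signed (`CMGE`) and unsigned (`CMHS`) interpretation. The names `model_cled_s64` and `model_cled_u64` are hypothetical; on A64 the `vcled_s64` and `vcled_u64` intrinsics would normally be called directly.

```c
#include <stdint.h>

/* Hypothetical model of the signed scalar form: all ones when a <= b. */
uint64_t model_cled_s64(int64_t a, int64_t b)
{
    /* Evaluated as b >= a, matching the swapped operands of CMGE Dd,Dm,Dn. */
    return (b >= a) ? UINT64_MAX : 0u;
}

/* Hypothetical model of the unsigned scalar form: all ones when a <= b. */
uint64_t model_cled_u64(uint64_t a, uint64_t b)
{
    /* Evaluated as b >= a, matching the swapped operands of CMHS Dd,Dm,Dn. */
    return (b >= a) ? UINT64_MAX : 0u;
}
```

For example, the bit pattern 0xFFFFFFFFFFFFFFFF is -1 when read as signed, so it compares less than or equal to 1, whereas read as unsigned it is the largest representable value and the same comparison fails.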

uint32_t vcles_f32 (float32_t a, float32_t b)Floating-point compare greater than or equal

Description

A64 Instruction

FCMGE Sd,Sm,Sn

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vcled_f64 (float64_t a, float64_t b)Floating-point compare greater than or equal

Description

A64 Instruction

FCMGE Dd,Dm,Dn

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint8x8_t vclez_s8 (int8x8_t a)Compare signed less than or equal to zero

Description

Compare signed Less than or Equal to zero (vector). This instruction reads each vector element in the source SIMD&FP register and if the signed integer value is less than or equal to zero sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction

CMLE Vd.8B,Vn.8B,#0

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64
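
A minimal sketch of the compare-against-zero form follows. The wrapper name `cmp_lez_s8x8` and the fallback loop are assumptions for illustration; `vld1_s8`, `vclez_s8` and `vst1_u8` are the intrinsics involved. Since this entry lists A64 as the only supported architecture, the intrinsic path is guarded so that the same file still builds elsewhere.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[e] = 0xFF if a[e] <= 0 (signed), else 0x00. */
void cmp_lez_s8x8(const int8_t a[8], uint8_t out[8])
{
#if defined(__aarch64__) && defined(__ARM_NEON)
    /* Single-operand compare against an implicit zero (CMLE ..., #0). */
    vst1_u8(out, vclez_s8(vld1_s8(a)));
#else
    /* Portable model of the CompareOp_LE case: element <= 0. */
    for (size_t e = 0; e < 8; ++e)
        out[e] = (a[e] <= 0) ? UINT8_MAX : 0u;
#endif
}
```

The result type is unsigned even though the input is signed, because each lane holds a mask rather than a numeric value.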

uint8x16_t vclezq_s8 (int8x16_t a)Compare signed less than or equal to zero

Description

A64 Instruction

CMLE Vd.16B,Vn.16B,#0

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint16x4_t vclez_s16 (int16x4_t a)Compare signed less than or equal to zero

Description

A64 Instruction

CMLE Vd.4H,Vn.4H,#0

Argument Preparation

a → Vn.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint16x8_t vclezq_s16 (int16x8_t a)Compare signed less than or equal to zero

Description

A64 Instruction

CMLE Vd.8H,Vn.8H,#0

Argument Preparation

a → Vn.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x2_t vclez_s32 (int32x2_t a)Compare signed less than or equal to zero

Description

A64 Instruction

CMLE Vd.2S,Vn.2S,#0

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x4_t vclezq_s32 (int32x4_t a)Compare signed less than or equal to zero

Description

A64 Instruction

CMLE Vd.4S,Vn.4S,#0

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x1_t vclez_s64 (int64x1_t a)Compare signed less than or equal to zero

Description

A64 Instruction

CMLE Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vclezq_s64 (int64x2_t a)Compare signed less than or equal to zero

Description

A64 Instruction

CMLE Vd.2D,Vn.2D,#0

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x2_t vclez_f32 (float32x2_t a)Floating-point compare less than or equal to zero

Description

A64 Instruction

FCMLE Vd.2S,Vn.2S,#0

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x4_t vclezq_f32 (float32x4_t a)Floating-point compare less than or equal to zero

Description

Floating-point Compare Less than or Equal to zero (vector). This instruction reads each floating-point value in the source SIMD&FP register and if the value is less than or equal to zero sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction

FCMLE Vd.4S,Vn.4S,#0

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64
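
The `CompareOp_LE` case above is expressed as `FPCompareGE(zero, element, FPCR)`, that is, `0 >= element`. The sketch below is a portable scalar model of that case for four lanes; the name `model_clez_f32x4` is hypothetical, and the model does not account for FPCR settings. On A64 the `vclezq_f32` intrinsic would normally be used.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model: out[e] is all ones when 0 >= a[e].
 * Negative zero compares equal to zero, and NaN never passes. */
void model_clez_f32x4(const float a[4], uint32_t out[4])
{
    for (size_t e = 0; e < 4; ++e)
        out[e] = (0.0f >= a[e]) ? UINT32_MAX : 0u;
}
```

Both +0.0 and -0.0 produce an all-ones lane, so the mask cannot be used to distinguish the sign of a zero input.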

uint64x1_t vclez_f64 (float64x1_t a)Floating-point compare less than or equal to zero

Description

A64 Instruction

FCMLE Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vclezq_f64 (float64x2_t a)Floating-point compare less than or equal to zero

Description

A64 Instruction

FCMLE Vd.2D,Vn.2D,#0

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vclezd_s64 (int64_t a)Compare signed less than or equal to zero

Description

A64 Instruction

CMLE Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32_t vclezs_f32 (float32_t a)Floating-point compare less than or equal to zero

Description

A64 Instruction

FCMLE Sd,Sn,#0

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vclezd_f64 (float64_t a)Floating-point compare less than or equal to zero

Description

A64 Instruction

FCMLE Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint8x8_t vcgt_s8 (int8x8_t a, int8x8_t b)Compare signed greater than

Description

Compare signed Greater than (vector). This instruction compares each vector element in the first source SIMD&FP register with the corresponding vector element in the second source SIMD&FP register and if the first signed integer value is greater than the second signed integer value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction

CMGT Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64
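
One common use of a compare mask is counting how many lanes satisfy the condition. The sketch below does this for the two-operand signed compare `a > b` on eight lanes. The function name `count_gt_s8x8` and the portable fallback are assumptions for illustration; `vld1_s8`, `vcgt_s8` and `vst1_u8` are the intrinsics used on Arm builds.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: number of lanes e for which a[e] > b[e] (signed). */
unsigned count_gt_s8x8(const int8_t a[8], const int8_t b[8])
{
    uint8_t mask[8];
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Each lane becomes 0xFF where a > b and 0x00 otherwise. */
    vst1_u8(mask, vcgt_s8(vld1_s8(a), vld1_s8(b)));
#else
    /* Portable model of the same per-lane result. */
    for (size_t e = 0; e < 8; ++e)
        mask[e] = (a[e] > b[e]) ? UINT8_MAX : 0u;
#endif
    unsigned count = 0;
    for (size_t e = 0; e < 8; ++e)
        count += mask[e] & 1u;   /* all-ones lane contributes 1 */
    return count;
}
```

The final reduction is kept scalar for clarity; a vectorised reduction is possible but is outside the scope of this sketch.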

uint8x16_t vcgtq_s8 (int8x16_t a, int8x16_t b)Compare signed greater than

Description

A64 Instruction

CMGT Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vcgt_s16 (int16x4_t a, int16x4_t b)Compare signed greater than

Description

A64 Instruction

CMGT Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vcgtq_s16 (int16x8_t a, int16x8_t b)Compare signed greater than

Description

A64 Instruction

CMGT Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vcgt_s32 (int32x2_t a, int32x2_t b)Compare signed greater than

Description

A64 Instruction

CMGT Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vcgtq_s32 (int32x4_t a, int32x4_t b)Compare signed greater than

Description

A64 Instruction

CMGT Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vcgt_u8 (uint8x8_t a, uint8x8_t b)Compare unsigned higher

Description

Compare unsigned Higher (vector). This instruction compares each vector element in the first source SIMD&FP register with the corresponding vector element in the second source SIMD&FP register and if the first unsigned integer value is greater than the second unsigned integer value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction

CMHI Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64
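
Because each result lane is all ones or all zeros, the mask can drive a bitwise select. The sketch below models this in portable C to compute a lane-wise unsigned maximum: the `cmp_eq = FALSE` branch of the operation above (`element1 > element2`) produces the mask, and a bitwise select then picks `a` or `b`. The names `model_cgt_u8x8` and `model_max_u8x8` are hypothetical; with intrinsics the same pattern would typically pair `vcgt_u8` with a bitwise-select intrinsic such as `vbsl_u8`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: mask[e] = 0xFF where a[e] > b[e] (unsigned), else 0x00. */
void model_cgt_u8x8(const uint8_t a[8], const uint8_t b[8], uint8_t mask[8])
{
    for (size_t e = 0; e < 8; ++e)
        mask[e] = (a[e] > b[e]) ? UINT8_MAX : 0u;
}

/* Hypothetical model: lane-wise unsigned maximum built from the mask. */
void model_max_u8x8(const uint8_t a[8], const uint8_t b[8], uint8_t out[8])
{
    uint8_t mask[8];
    model_cgt_u8x8(a, b, mask);
    for (size_t e = 0; e < 8; ++e) {
        /* Bitwise select: bits of a where the mask is set, bits of b elsewhere. */
        out[e] = (uint8_t)((a[e] & mask[e]) | (b[e] & (uint8_t)~mask[e]));
    }
}
```

A dedicated maximum instruction would be the usual choice for this particular operation; the select form is shown only because it generalises to arbitrary per-lane choices.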

uint8x16_t vcgtq_u8 (uint8x16_t a, uint8x16_t b)Compare unsigned higher

Description

A64 Instruction

CMHI Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vcgt_u16 (uint16x4_t a, uint16x4_t b)Compare unsigned higher

Description

A64 Instruction

CMHI Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vcgtq_u16 (uint16x8_t a, uint16x8_t b)Compare unsigned higher

Description

A64 Instruction

CMHI Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vcgt_u32 (uint32x2_t a, uint32x2_t b)Compare unsigned higher

Description

A64 Instruction

CMHI Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vcgtq_u32 (uint32x4_t a, uint32x4_t b)Compare unsigned higher

Description

A64 Instruction

CMHI Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vcgt_f32 (float32x2_t a, float32x2_t b)Floating-point compare greater than

Description

Floating-point Compare Greater than (vector). This instruction reads each floating-point value in the first source SIMD&FP register and if the value is greater than the corresponding floating-point value in the second source SIMD&FP register sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction

FCMGT Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64
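
The register form FCMGT Vd.2S,Vn.2S,Vm.2S takes both a and b, so lane e of the result depends on whether a[e] is greater than b[e]. The following is a minimal scalar sketch in portable C under that reading: the helper name model_vcgt_f32 is hypothetical (it is not part of arm_neon.h), and arrays stand in for the two float32x2_t lanes. It also shows the unordered case: a lane holding a NaN fails the test and produces zeros.

```c
#include <math.h>   /* NAN, for callers that exercise the unordered case */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of vcgt_f32; the name is illustrative only. */
static void model_vcgt_f32(const float a[2], const float b[2],
                           uint32_t result[2])
{
    for (size_t e = 0; e < 2; e++) {
        /* In C, a > b is false when either operand is NaN,
           so an unordered lane receives Zeros(). */
        result[e] = (a[e] > b[e]) ? UINT32_MAX : 0u;
    }
}
```

Note that the result type is an unsigned integer vector, not a floating-point one: the lanes carry masks rather than numeric values.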

uint32x4_t vcgtq_f32 (float32x4_t a, float32x4_t b)Floating-point compare greater than

Description

A64 Instruction

FCMGT Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vcgt_s64 (int64x1_t a, int64x1_t b)Compare signed greater than

Description

A64 Instruction

CMGT Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vcgtq_s64 (int64x2_t a, int64x2_t b)Compare signed greater than

Description

A64 Instruction

CMGT Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x1_t vcgt_u64 (uint64x1_t a, uint64x1_t b)Compare unsigned higher

Description

A64 Instruction

CMHI Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64
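
CMGT and CMHI differ only in how the lane bits are interpreted: CMGT treats them as signed integers, CMHI as unsigned. The following is a minimal scalar sketch of one 64-bit lane in portable C. The helper names model_cmgt_lane and model_cmhi_lane are hypothetical and are not part of arm_neon.h. The bit pattern 0xFFFFFFFFFFFFFFFF is -1 when signed and the largest value when unsigned, so the two tests disagree on it.

```c
#include <stdint.h>

/* Hypothetical scalar models of one 64-bit lane; names are illustrative only. */

/* Signed test, as performed per lane by vcgt_s64 (CMGT). */
static uint64_t model_cmgt_lane(int64_t a, int64_t b)
{
    return (a > b) ? UINT64_MAX : 0;
}

/* Unsigned test, as performed per lane by vcgt_u64 (CMHI). */
static uint64_t model_cmhi_lane(uint64_t a, uint64_t b)
{
    return (a > b) ? UINT64_MAX : 0;
}
```

With these models, model_cmgt_lane(-1, 0) yields all zeros while model_cmhi_lane(UINT64_MAX, 0) yields all ones, although both first arguments share the same bits.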

uint64x2_t vcgtq_u64 (uint64x2_t a, uint64x2_t b)Compare unsigned higher

Description

A64 Instruction

CMHI Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x1_t vcgt_f64 (float64x1_t a, float64x1_t b)Floating-point compare greater than

Description

A64 Instruction

FCMGT Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vcgtq_f64 (float64x2_t a, float64x2_t b)Floating-point compare greater than

Description

A64 Instruction

FCMGT Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vcgtd_s64 (int64_t a, int64_t b)Compare signed greater than

Description

A64 Instruction

CMGT Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vcgtd_u64 (uint64_t a, uint64_t b)Compare unsigned higher

Description

A64 Instruction

CMHI Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64
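
The scalar forms return the mask as a plain uint64_t, which makes it convenient for branch-free selection. The following is a minimal sketch in portable C, assuming a hypothetical helper model_vcgtd_u64 that mirrors the Operation pseudocode for a single lane; neither helper name is part of arm_neon.h, and the maximum computation is only an illustration of how such a mask might be used.

```c
#include <stdint.h>

/* Hypothetical scalar model of vcgtd_u64; the name is illustrative only. */
static uint64_t model_vcgtd_u64(uint64_t a, uint64_t b)
{
    return (a > b) ? UINT64_MAX : 0;
}

/* Branch-free maximum built from the mask:
   selects a where the mask is all ones, b where it is all zeros. */
static uint64_t max_u64_via_mask(uint64_t a, uint64_t b)
{
    uint64_t mask = model_vcgtd_u64(a, b);
    return (a & mask) | (b & ~mask);
}
```

Because the mask is either all ones or all zeros, exactly one of the two AND terms survives, so no conditional branch is needed to pick the result.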

uint32_t vcgts_f32 (float32_t a, float32_t b)Floating-point compare greater than

Description

A64 Instruction

FCMGT Sd,Sn,Sm

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vcgtd_f64 (float64_t a, float64_t b)Floating-point compare greater than

Description

A64 Instruction

FCMGT Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint8x8_t vcgtz_s8 (int8x8_t a)Compare signed greater than zero

Description

A64 Instruction

CMGT Vd.8B,Vn.8B,#0

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64
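
The compare-against-zero form takes a single operand and applies the CompareOp_GT branch of the pseudocode above, that is, element > 0 on signed lanes. The following is a minimal scalar sketch in portable C: the helper names model_vcgtz_s8 and count_positive_s8 are hypothetical (they are not part of arm_neon.h), and an array stands in for the eight int8x8_t lanes. The counting helper is only one illustrative use of the resulting mask.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of vcgtz_s8; the name is illustrative only. */
static void model_vcgtz_s8(const int8_t a[8], uint8_t result[8])
{
    for (size_t e = 0; e < 8; e++) {
        /* Ones() when the signed lane is strictly positive, Zeros() otherwise. */
        result[e] = (a[e] > 0) ? UINT8_MAX : 0;
    }
}

/* Counts strictly positive lanes: each passing lane contributes
   the low bit of its all-ones mask. */
static unsigned count_positive_s8(const int8_t a[8])
{
    uint8_t mask[8];
    unsigned count = 0;

    model_vcgtz_s8(a, mask);
    for (size_t e = 0; e < 8; e++) {
        count += mask[e] & 1u;
    }
    return count;
}
```

Zero itself does not pass the test, so a lane holding 0 produces an all-zeros mask.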

uint8x16_t vcgtzq_s8 (int8x16_t a)Compare signed greater than zero

Description

A64 Instruction

CMGT Vd.16B,Vn.16B,#0

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint16x4_t vcgtz_s16 (int16x4_t a)Compare signed greater than zero

Description

A64 Instruction

CMGT Vd.4H,Vn.4H,#0

Argument Preparation

a → Vn.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint16x8_t vcgtzq_s16 (int16x8_t a)Compare signed greater than zero

Description

A64 Instruction

CMGT Vd.8H,Vn.8H,#0

Argument Preparation

a → Vn.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x2_t vcgtz_s32 (int32x2_t a)Compare signed greater than zero

Description

A64 Instruction

CMGT Vd.2S,Vn.2S,#0

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x4_t vcgtzq_s32 (int32x4_t a)Compare signed greater than zero

Description

A64 Instruction

CMGT Vd.4S,Vn.4S,#0

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x1_t vcgtz_s64 (int64x1_t a)Compare signed greater than zero

Description

A64 Instruction

CMGT Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vcgtzq_s64 (int64x2_t a)Compare signed greater than zero

Description

A64 Instruction

CMGT Vd.2D,Vn.2D,#0

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x2_t vcgtz_f32 (float32x2_t a)Floating-point compare greater than zero

Description

A64 Instruction

FCMGT Vd.2S,Vn.2S,#0

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64
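
For the floating-point compare against zero, the edge cases are worth spelling out: neither +0.0 nor -0.0 is greater than zero, and a NaN lane fails the test because the comparison is unordered. The following is a minimal scalar sketch in portable C, assuming IEEE 754 behavior of the host compiler: the helper name model_vcgtz_f32 is hypothetical (it is not part of arm_neon.h), and an array stands in for the two float32x2_t lanes.

```c
#include <math.h>   /* NAN and INFINITY, for callers probing the edge cases */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of vcgtz_f32; the name is illustrative only. */
static void model_vcgtz_f32(const float a[2], uint32_t result[2])
{
    for (size_t e = 0; e < 2; e++) {
        /* False for +0.0, -0.0, negative values, and NaN. */
        result[e] = (a[e] > 0.0f) ? UINT32_MAX : 0u;
    }
}
```

Positive infinity passes the test like any other positive value.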

uint32x4_t vcgtzq_f32 (float32x4_t a)Floating-point compare greater than zero

Description

A64 Instruction

FCMGT Vd.4S,Vn.4S,#0

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x1_t vcgtz_f64 (float64x1_t a)Floating-point compare greater than zero

Description

A64 Instruction

FCMGT Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vcgtzq_f64 (float64x2_t a)Floating-point compare greater than zero

Description

A64 Instruction

FCMGT Vd.2D,Vn.2D,#0

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vcgtzd_s64 (int64_t a)Compare signed greater than zero

Description

A64 Instruction

CMGT Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32_t vcgtzs_f32 (float32_t a)Floating-point compare greater than zero

Description

A64 Instruction

FCMGT Sd,Sn,#0

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vcgtzd_f64 (float64_t a)Floating-point compare greater than zero

Description

A64 Instruction

FCMGT Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint8x8_t vclt_s8 (int8x8_t a, int8x8_t b)Compare signed greater than

Description

A64 Instruction

CMGT Vd.8B,Vm.8B,Vn.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64
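
There is no separate less-than instruction here: the listing CMGT Vd.8B,Vm.8B,Vn.8B, with a mapped to Vn and b mapped to Vm, shows that vclt_s8 is a greater-than compare with the source operands exchanged. The following is a minimal scalar sketch of that relationship in portable C: the helper names model_vcgt_s8 and model_vclt_s8 are hypothetical (they are not part of arm_neon.h), and arrays stand in for the eight int8x8_t lanes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a signed greater-than compare on 8 lanes. */
static void model_vcgt_s8(const int8_t a[8], const int8_t b[8],
                          uint8_t result[8])
{
    for (size_t e = 0; e < 8; e++) {
        result[e] = (a[e] > b[e]) ? UINT8_MAX : 0;
    }
}

/* Hypothetical scalar model of vclt_s8: a < b is evaluated as b > a,
   mirroring the swapped Vm/Vn operands in the instruction listing. */
static void model_vclt_s8(const int8_t a[8], const int8_t b[8],
                          uint8_t result[8])
{
    model_vcgt_s8(b, a, result);
}
```

Equal lanes fail both the greater-than and the less-than test, so they produce zeros in either direction.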

uint8x16_t vcltq_s8 (int8x16_t a, int8x16_t b)Compare signed greater than

Description

A64 Instruction

CMGT Vd.16B,Vm.16B,Vn.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vclt_s16 (int16x4_t a, int16x4_t b)Compare signed greater than

Description

A64 Instruction

CMGT Vd.4H,Vm.4H,Vn.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vcltq_s16 (int16x8_t a, int16x8_t b)Compare signed greater than

Description

A64 Instruction

CMGT Vd.8H,Vm.8H,Vn.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vclt_s32 (int32x2_t a, int32x2_t b)Compare signed greater than

Description

A64 Instruction

CMGT Vd.2S,Vm.2S,Vn.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vcltq_s32 (int32x4_t a, int32x4_t b)Compare signed greater than

Description

A64 Instruction

CMGT Vd.4S,Vm.4S,Vn.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vclt_u8 (uint8x8_t a, uint8x8_t b)Compare unsigned higher

Description

A64 Instruction

CMHI Vd.8B,Vm.8B,Vn.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vcltq_u8 (uint8x16_t a, uint8x16_t b)Compare unsigned higher

Description

A64 Instruction

CMHI Vd.16B,Vm.16B,Vn.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vclt_u16 (uint16x4_t a, uint16x4_t b)Compare unsigned higher

Description

A64 Instruction

CMHI Vd.4H,Vm.4H,Vn.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vcltq_u16 (uint16x8_t a, uint16x8_t b)Compare unsigned higher

Description

A64 Instruction

CMHI Vd.8H,Vm.8H,Vn.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vclt_u32 (uint32x2_t a, uint32x2_t b)Compare unsigned higher

Description

A64 Instruction

CMHI Vd.2S,Vm.2S,Vn.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vcltq_u32 (uint32x4_t a, uint32x4_t b)Compare unsigned higher

Description

A64 Instruction

CMHI Vd.4S,Vm.4S,Vn.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64
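
Example (illustrative sketch): because every result lane is all ones or all zeros, the mask can drive a bitwise select. The sketch below pairs vcltq_u32 with vbslq_u32 to take the smaller of two lanes. The helper name select_min_u32x4 is hypothetical, and the scalar branch is an assumed model for builds without NEON.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] = min(a[i], b[i]) for four unsigned lanes. */
void select_min_u32x4(const uint32_t a[4], const uint32_t b[4], uint32_t out[4])
{
    assert(a && b && out);
#if defined(__ARM_NEON)
    uint32x4_t va = vld1q_u32(a);
    uint32x4_t vb = vld1q_u32(b);
    uint32x4_t lt = vcltq_u32(va, vb);      /* all ones where a < b */
    vst1q_u32(out, vbslq_u32(lt, va, vb));  /* mask bit set: take a, else b */
#else
    /* Assumed scalar model of the same mask-and-select pattern. */
    for (int i = 0; i < 4; ++i) {
        uint32_t lt = (a[i] < b[i]) ? UINT32_MAX : 0u;
        out[i] = (a[i] & lt) | (b[i] & ~lt);
    }
#endif
}
```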

uint32x2_t vclt_f32 (float32x2_t a, float32x2_t b)Floating-point compare greater than

Description

A64 Instruction

FCMGT Vd.2S,Vm.2S,Vn.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64
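
Example (illustrative sketch): a floating-point less-than mask for two lanes. An ordered comparison fails when either input is NaN, so such lanes come back as zero. The helper name lt_mask_f32x2 is hypothetical, and the scalar branch is an assumed model for builds without NEON.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] = 0xFFFFFFFF if a[i] < b[i], else 0. */
void lt_mask_f32x2(const float a[2], const float b[2], uint32_t out[2])
{
    assert(a && b && out);
#if defined(__ARM_NEON)
    /* Emitted as FCMGT with the operands swapped (b greater than a). */
    vst1_u32(out, vclt_f32(vld1_f32(a), vld1_f32(b)));
#else
    /* Assumed scalar model; C's < is also false when an operand is NaN. */
    for (int i = 0; i < 2; ++i)
        out[i] = (a[i] < b[i]) ? UINT32_MAX : 0u;
#endif
}
```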

uint32x4_t vcltq_f32 (float32x4_t a, float32x4_t b)Floating-point compare greater than

Description

A64 Instruction

FCMGT Vd.4S,Vm.4S,Vn.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vclt_s64 (int64x1_t a, int64x1_t b)Compare signed greater than

Description

A64 Instruction

CMGT Dd,Dm,Dn

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vcltq_s64 (int64x2_t a, int64x2_t b)Compare signed greater than

Description

A64 Instruction

CMGT Vd.2D,Vm.2D,Vn.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64
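
Example (illustrative sketch): the signed 64-bit form treats a lane holding -1 as less than 1, whereas the unsigned forms would see the same bit pattern as the largest value. The helper name lt_mask_s64x2 is hypothetical. The intrinsic is used only on AArch64 builds, matching the A64-only support listed above, and the scalar branch is an assumed model for other targets.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] = all ones if a[i] < b[i] (signed), else 0. */
void lt_mask_s64x2(const int64_t a[2], const int64_t b[2], uint64_t out[2])
{
    assert(a && b && out);
#if defined(__aarch64__)
    /* Emitted as CMGT with the operands swapped (b greater than a). */
    vst1q_u64(out, vcltq_s64(vld1q_s64(a), vld1q_s64(b)));
#else
    /* Assumed scalar model for non-AArch64 targets. */
    for (int i = 0; i < 2; ++i)
        out[i] = (a[i] < b[i]) ? UINT64_MAX : 0u;
#endif
}
```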

uint64x1_t vclt_u64 (uint64x1_t a, uint64x1_t b)Compare unsigned higher

Description

A64 Instruction

CMHI Dd,Dm,Dn

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vcltq_u64 (uint64x2_t a, uint64x2_t b)Compare unsigned higher

Description

A64 Instruction

CMHI Vd.2D,Vm.2D,Vn.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x1_t vclt_f64 (float64x1_t a, float64x1_t b)Floating-point compare greater than

Description

A64 Instruction

FCMGT Dd,Dm,Dn

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vcltq_f64 (float64x2_t a, float64x2_t b)Floating-point compare greater than

Description

A64 Instruction

FCMGT Vd.2D,Vm.2D,Vn.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vcltd_s64 (int64_t a, int64_t b)Compare signed greater than

Description

A64 Instruction

CMGT Dd,Dm,Dn

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vcltd_u64 (uint64_t a, uint64_t b)Compare unsigned higher

Description

A64 Instruction

CMHI Dd,Dm,Dn

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    test_passed = if cmp_eq then element1 >= element2 else element1 > element2;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64
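
Example (illustrative sketch): the scalar form returns a 64-bit mask (all ones or zero), so masking with 1 gives a carry bit. The sketch uses it to detect wrap-around in a two-word addition. The names lt_mask_u64 and add_u128 are hypothetical, and the scalar branch is an assumed model for non-AArch64 builds.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: all ones if a < b (unsigned), else zero. */
static uint64_t lt_mask_u64(uint64_t a, uint64_t b)
{
#if defined(__aarch64__)
    return vcltd_u64(a, b);
#else
    return (a < b) ? UINT64_MAX : 0u;  /* assumed scalar model */
#endif
}

/* Adds two 128-bit values stored as {low, high} word pairs. */
void add_u128(const uint64_t a[2], const uint64_t b[2], uint64_t out[2])
{
    assert(a && b && out);
    uint64_t low = a[0] + b[0];
    /* A wrapped sum is lower than either addend, which signals a carry. */
    uint64_t carry = lt_mask_u64(low, a[0]) & 1u;
    uint64_t high = a[1] + b[1] + carry;
    out[0] = low;
    out[1] = high;
}
```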

uint32_t vclts_f32 (float32_t a, float32_t b)Floating-point compare greater than

Description

A64 Instruction

FCMGT Sd,Sm,Sn

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vcltd_f64 (float64_t a, float64_t b)Floating-point compare greater than

Description

A64 Instruction

FCMGT Dd,Dm,Dn

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint8x8_t vcltz_s8 (int8x8_t a)Compare signed less than zero

Description

Compare signed Less than zero (vector). This instruction reads each vector element in the source SIMD&FP register and if the signed integer value is less than zero sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction

CMLT Vd.8B,Vn.8B,#0

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64
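
Example (illustrative sketch): a less-than-zero mask for eight signed bytes, which amounts to replicating each lane's sign bit across the lane. The helper name ltz_mask_s8 is hypothetical. The intrinsic is used only on AArch64 builds, matching the A64-only support listed above, and the scalar branch is an assumed model for other targets.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] = 0xFF if a[i] < 0, else 0x00. */
void ltz_mask_s8(const int8_t a[8], uint8_t out[8])
{
    assert(a && out);
#if defined(__aarch64__)
    vst1_u8(out, vcltz_s8(vld1_s8(a)));
#else
    /* Assumed scalar model for non-AArch64 targets. */
    for (int i = 0; i < 8; ++i)
        out[i] = (a[i] < 0) ? 0xFF : 0x00;
#endif
}
```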

uint8x16_t vcltzq_s8 (int8x16_t a)Compare signed less than zero

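Description

Compare signed Less than zero (vector). This instruction reads each vector element in the source SIMD&FP register and if the signed integer value is less than zero sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction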

CMLT Vd.16B,Vn.16B,#0

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint16x4_t vcltz_s16 (int16x4_t a)Compare signed less than zero

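Description

Compare signed Less than zero (vector). This instruction reads each vector element in the source SIMD&FP register and if the signed integer value is less than zero sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction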

CMLT Vd.4H,Vn.4H,#0

Argument Preparation

a → Vn.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint16x8_t vcltzq_s16 (int16x8_t a)Compare signed less than zero

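Description

Compare signed Less than zero (vector). This instruction reads each vector element in the source SIMD&FP register and if the signed integer value is less than zero sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction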

CMLT Vd.8H,Vn.8H,#0

Argument Preparation

a → Vn.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x2_t vcltz_s32 (int32x2_t a)Compare signed less than zero

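Description

Compare signed Less than zero (vector). This instruction reads each vector element in the source SIMD&FP register and if the signed integer value is less than zero sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction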

CMLT Vd.2S,Vn.2S,#0

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x4_t vcltzq_s32 (int32x4_t a)Compare signed less than zero

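Description

Compare signed Less than zero (vector). This instruction reads each vector element in the source SIMD&FP register and if the signed integer value is less than zero sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction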

CMLT Vd.4S,Vn.4S,#0

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x1_t vcltz_s64 (int64x1_t a)Compare signed less than zero

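Description

Compare signed Less than zero (vector). This instruction reads each vector element in the source SIMD&FP register and if the signed integer value is less than zero sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction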

CMLT Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vcltzq_s64 (int64x2_t a)Compare signed less than zero

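Description

Compare signed Less than zero (vector). This instruction reads each vector element in the source SIMD&FP register and if the signed integer value is less than zero sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction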

CMLT Vd.2D,Vn.2D,#0

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x2_t vcltz_f32 (float32x2_t a)Floating-point compare less than zero

Description

Floating-point Compare Less than zero (vector). This instruction reads each floating-point value in the source SIMD&FP register and if the value is less than zero sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction

FCMLT Vd.2S,Vn.2S,#0

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64
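
Example (illustrative sketch): a floating-point less-than-zero mask for two lanes. Negative zero compares equal to zero and NaN fails every ordered comparison, so both produce a zero lane. The helper name ltz_mask_f32x2 is hypothetical. The intrinsic is used only on AArch64 builds, and the scalar branch is an assumed model for other targets.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] = 0xFFFFFFFF if a[i] < 0.0, else 0. */
void ltz_mask_f32x2(const float a[2], uint32_t out[2])
{
    assert(a && out);
#if defined(__aarch64__)
    vst1_u32(out, vcltz_f32(vld1_f32(a)));
#else
    /* Assumed scalar model; -0.0f and NaN both give a zero lane. */
    for (int i = 0; i < 2; ++i)
        out[i] = (a[i] < 0.0f) ? UINT32_MAX : 0u;
#endif
}
```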

uint32x4_t vcltzq_f32 (float32x4_t a)Floating-point compare less than zero

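Description

Floating-point Compare Less than zero (vector). This instruction reads each floating-point value in the source SIMD&FP register and if the value is less than zero sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction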

FCMLT Vd.4S,Vn.4S,#0

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x1_t vcltz_f64 (float64x1_t a)Floating-point compare less than zero

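Description

Floating-point Compare Less than zero (vector). This instruction reads each floating-point value in the source SIMD&FP register and if the value is less than zero sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction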

FCMLT Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vcltzq_f64 (float64x2_t a)Floating-point compare less than zero

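Description

Floating-point Compare Less than zero (vector). This instruction reads each floating-point value in the source SIMD&FP register and if the value is less than zero sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction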

FCMLT Vd.2D,Vn.2D,#0

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vcltzd_s64 (int64_t a)Compare signed less than zero

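Description

Compare signed Less than zero (vector). This instruction reads each vector element in the source SIMD&FP register and if the signed integer value is less than zero sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction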

CMLT Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean test_passed;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    case comparison of
        when CompareOp_GT test_passed = element > 0;
        when CompareOp_GE test_passed = element >= 0;
        when CompareOp_EQ test_passed = element == 0;
        when CompareOp_LE test_passed = element <= 0;
        when CompareOp_LT test_passed = element < 0;
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32_t vcltzs_f32 (float32_t a)Floating-point compare less than zero

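Description

Floating-point Compare Less than zero (vector). This instruction reads each floating-point value in the source SIMD&FP register and if the value is less than zero sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction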

FCMLT Sd,Sn,#0

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vcltzd_f64 (float64_t a)Floating-point compare less than zero

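Description

Floating-point Compare Less than zero (vector). This instruction reads each floating-point value in the source SIMD&FP register and if the value is less than zero sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction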

FCMLT Dd,Dn,#0

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) zero = FPZero('0');
bits(esize) element;
boolean test_passed;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    case comparison of
        when CompareOp_GT test_passed = FPCompareGT(element, zero, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element, zero, FPCR);
        when CompareOp_EQ test_passed = FPCompareEQ(element, zero, FPCR);
        when CompareOp_LE test_passed = FPCompareGE(zero, element, FPCR);
        when CompareOp_LT test_passed = FPCompareGT(zero, element, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x2_t vcage_f32 (float32x2_t a, float32x2_t b)Floating-point absolute compare greater than or equal

Description

Floating-point Absolute Compare Greater than or Equal (vector). This instruction compares the absolute value of each floating-point value in the first source SIMD&FP register with the absolute value of the corresponding floating-point value in the second source SIMD&FP register and if the first value is greater than or equal to the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction

FACGE Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64
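
Example (illustrative sketch): an absolute-value comparison mask for two lanes, where signs are ignored on both inputs before the greater-than-or-equal test. The names abs_f32 and abs_ge_mask_f32x2 are hypothetical, and the scalar branch is an assumed model for builds without NEON.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical scalar stand-in for FPAbs, used only by the fallback path. */
static float abs_f32(float x)
{
    return (x < 0.0f) ? -x : x;
}

/* Hypothetical helper: out[i] = 0xFFFFFFFF if |a[i]| >= |b[i]|, else 0. */
void abs_ge_mask_f32x2(const float a[2], const float b[2], uint32_t out[2])
{
    assert(a && b && out);
    (void)abs_f32;  /* keeps the helper referenced when the NEON path is built */
#if defined(__ARM_NEON)
    vst1_u32(out, vcage_f32(vld1_f32(a), vld1_f32(b)));
#else
    /* Assumed scalar model for targets without NEON. */
    for (int i = 0; i < 2; ++i)
        out[i] = (abs_f32(a[i]) >= abs_f32(b[i])) ? UINT32_MAX : 0u;
#endif
}
```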

uint32x4_t vcageq_f32 (float32x4_t a, float32x4_t b)Floating-point absolute compare greater than or equal

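Description

Floating-point Absolute Compare Greater than or Equal (vector). This instruction compares the absolute value of each floating-point value in the first source SIMD&FP register with the absolute value of the corresponding floating-point value in the second source SIMD&FP register and if the first value is greater than or equal to the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction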

FACGE Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vcage_f64 (float64x1_t a, float64x1_t b)Floating-point absolute compare greater than or equal

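Description

Floating-point Absolute Compare Greater than or Equal (vector). This instruction compares the absolute value of each floating-point value in the first source SIMD&FP register with the absolute value of the corresponding floating-point value in the second source SIMD&FP register and if the first value is greater than or equal to the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction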

FACGE Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vcageq_f64 (float64x2_t a, float64x2_t b)Floating-point absolute compare greater than or equal

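Description

Floating-point Absolute Compare Greater than or Equal (vector). This instruction compares the absolute value of each floating-point value in the first source SIMD&FP register with the absolute value of the corresponding floating-point value in the second source SIMD&FP register and if the first value is greater than or equal to the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction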

FACGE Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32_t vcages_f32 (float32_t a, float32_t b)Floating-point absolute compare greater than or equal

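Description

Floating-point Absolute Compare Greater than or Equal (vector). This instruction compares the absolute value of each floating-point value in the first source SIMD&FP register with the absolute value of the corresponding floating-point value in the second source SIMD&FP register and if the first value is greater than or equal to the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction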

FACGE Sd,Sn,Sm

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vcaged_f64 (float64_t a, float64_t b)Floating-point absolute compare greater than or equal

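Description

Floating-point Absolute Compare Greater than or Equal (vector). This instruction compares the absolute value of each floating-point value in the first source SIMD&FP register with the absolute value of the corresponding floating-point value in the second source SIMD&FP register and if the first value is greater than or equal to the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction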

FACGE Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x2_t vcale_f32 (float32x2_t a, float32x2_t b)Floating-point absolute compare greater than or equal

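Description

Floating-point Absolute Compare Greater than or Equal (vector). This instruction compares the absolute value of each floating-point value in the first source SIMD&FP register with the absolute value of the corresponding floating-point value in the second source SIMD&FP register and if the first value is greater than or equal to the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction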

FACGE Vd.2S,Vm.2S,Vn.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64
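
The instruction line for this intrinsic lists Vm before Vn, which is how a "less than or equal" test is obtained from FACGE. The sketch below models that operand swap in portable C for two lanes; `model_facge_2s`, `model_vcale_f32` and `model_vcale_f32_example` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Hypothetical 2-lane model of FACGE:
   out[i] = all ones if |n[i]| >= |m[i]|, otherwise zero. */
void model_facge_2s(const float n[2], const float m[2], uint32_t out[2])
{
    for (int i = 0; i < 2; ++i)
        out[i] = (fabsf(n[i]) >= fabsf(m[i])) ? UINT32_MAX : 0;
}

/* Hypothetical model of vcale_f32: |a| <= |b|, obtained by passing b as
   the first FACGE source and a as the second. */
void model_vcale_f32(const float a[2], const float b[2], uint32_t out[2])
{
    model_facge_2s(b, a, out);
}

/* Small usage check of the model. */
void model_vcale_f32_example(void)
{
    const float a[2] = { 1.0f, -5.0f };
    const float b[2] = { -2.0f, 3.0f };
    uint32_t r[2];

    model_vcale_f32(a, b, r);
    assert(r[0] == UINT32_MAX); /* |1| <= |-2| */
    assert(r[1] == 0);          /* |-5| > |3| */
}
```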

uint32x4_t vcaleq_f32 (float32x4_t a, float32x4_t b)Floating-point absolute compare greater than or equal

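Description

Floating-point Absolute Compare Greater than or Equal (vector). This instruction compares the absolute value of each vector element in the first source SIMD&FP register with the absolute value of the corresponding vector element in the second source SIMD&FP register and if the first value is greater than or equal to the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero. The intrinsic passes b as the first source operand and a as the second, so each result element is all ones where |a| <= |b|.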

A64 Instruction

FACGE Vd.4S,Vm.4S,Vn.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vcale_f64 (float64x1_t a, float64x1_t b)Floating-point absolute compare greater than or equal

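Description

Floating-point Absolute Compare Greater than or Equal (vector). This instruction compares the absolute value of each vector element in the first source SIMD&FP register with the absolute value of the corresponding vector element in the second source SIMD&FP register and if the first value is greater than or equal to the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero. The intrinsic passes b as the first source operand and a as the second, so each result element is all ones where |a| <= |b|.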

A64 Instruction

FACGE Dd,Dm,Dn

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vcaleq_f64 (float64x2_t a, float64x2_t b)Floating-point absolute compare greater than or equal

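Description

Floating-point Absolute Compare Greater than or Equal (vector). This instruction compares the absolute value of each vector element in the first source SIMD&FP register with the absolute value of the corresponding vector element in the second source SIMD&FP register and if the first value is greater than or equal to the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero. The intrinsic passes b as the first source operand and a as the second, so each result element is all ones where |a| <= |b|.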

A64 Instruction

FACGE Vd.2D,Vm.2D,Vn.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32_t vcales_f32 (float32_t a, float32_t b)Floating-point absolute compare greater than or equal

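Description

Floating-point Absolute Compare Greater than or Equal (vector). This instruction compares the absolute value of each vector element in the first source SIMD&FP register with the absolute value of the corresponding vector element in the second source SIMD&FP register and if the first value is greater than or equal to the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero. The intrinsic passes b as the first source operand and a as the second, so the result is all ones where |a| <= |b|.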

A64 Instruction

FACGE Sd,Sm,Sn

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vcaled_f64 (float64_t a, float64_t b)Floating-point absolute compare greater than or equal

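Description

Floating-point Absolute Compare Greater than or Equal (vector). This instruction compares the absolute value of each vector element in the first source SIMD&FP register with the absolute value of the corresponding vector element in the second source SIMD&FP register and if the first value is greater than or equal to the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero. The intrinsic passes b as the first source operand and a as the second, so the result is all ones where |a| <= |b|.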

A64 Instruction

FACGE Dd,Dm,Dn

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x2_t vcagt_f32 (float32x2_t a, float32x2_t b)Floating-point absolute compare greater than

Description

Floating-point Absolute Compare Greater than (vector). This instruction compares the absolute value of each vector element in the first source SIMD&FP register with the absolute value of the corresponding vector element in the second source SIMD&FP register and if the first value is greater than the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction

FACGT Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64
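
Because each result element is either all ones or all zeros, the output can serve directly as a per-lane bit mask. The sketch below is a portable C model, assuming two lanes as in vcagt_f32; it pairs the comparison with a bitwise select that keeps the input of larger magnitude. All function names are hypothetical, and the select helper is an illustrative use of the mask rather than something this entry specifies.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 2-lane model of vcagt_f32:
   mask[i] = all ones if |a[i]| > |b[i]|, otherwise zero. */
void model_vcagt_f32(const float a[2], const float b[2], uint32_t mask[2])
{
    for (int i = 0; i < 2; ++i)
        mask[i] = (fabsf(a[i]) > fabsf(b[i])) ? UINT32_MAX : 0;
}

/* Hypothetical helper: per lane, keep the bits of a where the mask is set
   and the bits of b elsewhere (a bitwise select). */
void model_select_f32(const uint32_t mask[2], const float a[2],
                      const float b[2], float out[2])
{
    for (int i = 0; i < 2; ++i) {
        uint32_t ua, ub, ur;
        memcpy(&ua, &a[i], sizeof ua);
        memcpy(&ub, &b[i], sizeof ub);
        ur = (ua & mask[i]) | (ub & ~mask[i]);
        memcpy(&out[i], &ur, sizeof ur);
    }
}

/* Small usage check of the model. */
void model_vcagt_f32_example(void)
{
    const float a[2] = { -8.0f, 2.0f };
    const float b[2] = { 4.0f, -2.0f };
    uint32_t mask[2];
    float out[2];

    model_vcagt_f32(a, b, mask);
    assert(mask[0] == UINT32_MAX); /* |-8| > |4| */
    assert(mask[1] == 0);          /* equal magnitudes are not "greater than" */

    model_select_f32(mask, a, b, out);
    assert(out[0] == -8.0f);       /* lane 0 taken from a */
    assert(out[1] == -2.0f);       /* lane 1 taken from b */
}
```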

uint32x4_t vcagtq_f32 (float32x4_t a, float32x4_t b)Floating-point absolute compare greater than

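Description

Floating-point Absolute Compare Greater than (vector). This instruction compares the absolute value of each vector element in the first source SIMD&FP register with the absolute value of the corresponding vector element in the second source SIMD&FP register and if the first value is greater than the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.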

A64 Instruction

FACGT Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vcagt_f64 (float64x1_t a, float64x1_t b)Floating-point absolute compare greater than

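Description

Floating-point Absolute Compare Greater than (vector). This instruction compares the absolute value of each vector element in the first source SIMD&FP register with the absolute value of the corresponding vector element in the second source SIMD&FP register and if the first value is greater than the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.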

A64 Instruction

FACGT Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vcagtq_f64 (float64x2_t a, float64x2_t b)Floating-point absolute compare greater than

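Description

Floating-point Absolute Compare Greater than (vector). This instruction compares the absolute value of each vector element in the first source SIMD&FP register with the absolute value of the corresponding vector element in the second source SIMD&FP register and if the first value is greater than the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.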

A64 Instruction

FACGT Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32_t vcagts_f32 (float32_t a, float32_t b)Floating-point absolute compare greater than

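Description

Floating-point Absolute Compare Greater than (vector). This instruction compares the absolute value of each vector element in the first source SIMD&FP register with the absolute value of the corresponding vector element in the second source SIMD&FP register and if the first value is greater than the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.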

A64 Instruction

FACGT Sd,Sn,Sm

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vcagtd_f64 (float64_t a, float64_t b)Floating-point absolute compare greater than

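Description

Floating-point Absolute Compare Greater than (vector). This instruction compares the absolute value of each vector element in the first source SIMD&FP register with the absolute value of the corresponding vector element in the second source SIMD&FP register and if the first value is greater than the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.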

A64 Instruction

FACGT Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32x2_t vcalt_f32 (float32x2_t a, float32x2_t b)Floating-point absolute compare greater than

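Description

Floating-point Absolute Compare Greater than (vector). This instruction compares the absolute value of each vector element in the first source SIMD&FP register with the absolute value of the corresponding vector element in the second source SIMD&FP register and if the first value is greater than the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero. The intrinsic passes b as the first source operand and a as the second, so each result element is all ones where |a| < |b|.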

A64 Instruction

FACGT Vd.2S,Vm.2S,Vn.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vcaltq_f32 (float32x4_t a, float32x4_t b)Floating-point absolute compare greater than

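Description

Floating-point Absolute Compare Greater than (vector). This instruction compares the absolute value of each vector element in the first source SIMD&FP register with the absolute value of the corresponding vector element in the second source SIMD&FP register and if the first value is greater than the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero. The intrinsic passes b as the first source operand and a as the second, so each result element is all ones where |a| < |b|.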

A64 Instruction

FACGT Vd.4S,Vm.4S,Vn.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vcalt_f64 (float64x1_t a, float64x1_t b)Floating-point absolute compare greater than

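Description

Floating-point Absolute Compare Greater than (vector). This instruction compares the absolute value of each vector element in the first source SIMD&FP register with the absolute value of the corresponding vector element in the second source SIMD&FP register and if the first value is greater than the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero. The intrinsic passes b as the first source operand and a as the second, so each result element is all ones where |a| < |b|.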

A64 Instruction

FACGT Dd,Dm,Dn

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vcaltq_f64 (float64x2_t a, float64x2_t b)Floating-point absolute compare greater than

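Description

Floating-point Absolute Compare Greater than (vector). This instruction compares the absolute value of each vector element in the first source SIMD&FP register with the absolute value of the corresponding vector element in the second source SIMD&FP register and if the first value is greater than the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero. The intrinsic passes b as the first source operand and a as the second, so each result element is all ones where |a| < |b|.

A64 Instruction

FACGT Vd.2D,Vm.2D,Vn.2D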

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint32_t vcalts_f32 (float32_t a, float32_t b)Floating-point absolute compare greater than

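Description

Floating-point Absolute Compare Greater than (vector). This instruction compares the absolute value of each vector element in the first source SIMD&FP register with the absolute value of the corresponding vector element in the second source SIMD&FP register and if the first value is greater than the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero. The intrinsic passes b as the first source operand and a as the second, so the result is all ones where |a| < |b|.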

A64 Instruction

FACGT Sd,Sm,Sn

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64
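
The swapped operand order in the instruction line (Sm before Sn) is what turns FACGT into a "less than" test. A minimal scalar sketch in portable C follows; `model_facgt_s`, `model_vcalts_f32` and `model_vcalts_f32_example` are hypothetical names. It also shows why the result cannot be obtained by inverting a greater-than-or-equal mask: with a NaN input both comparisons fail.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Hypothetical scalar model of FACGT: all ones if |n| > |m|, otherwise zero. */
uint32_t model_facgt_s(float n, float m)
{
    return (fabsf(n) > fabsf(m)) ? UINT32_MAX : 0;
}

/* Hypothetical model of vcalts_f32: |a| < |b|, obtained by passing b as
   the first FACGT source and a as the second. */
uint32_t model_vcalts_f32(float a, float b)
{
    return model_facgt_s(b, a);
}

/* Small usage check of the model. */
void model_vcalts_f32_example(void)
{
    assert(model_vcalts_f32(1.0f, -2.0f) == UINT32_MAX); /* |1| < |-2| */
    assert(model_vcalts_f32(3.0f, -3.0f) == 0);          /* equal magnitudes */
    assert(model_vcalts_f32(-5.0f, 2.0f) == 0);          /* |-5| > |2| */
    assert(model_vcalts_f32(NAN, 1.0f) == 0);            /* unordered: test fails */
    assert(model_facgt_s(NAN, 1.0f) == 0);               /* and so does the reverse */
}
```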

uint64_t vcaltd_f64 (float64_t a, float64_t b)Floating-point absolute compare greater than

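Description

Floating-point Absolute Compare Greater than (vector). This instruction compares the absolute value of each vector element in the first source SIMD&FP register with the absolute value of the corresponding vector element in the second source SIMD&FP register and if the first value is greater than the second value sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero. The intrinsic passes b as the first source operand and a as the second, so the result is all ones where |a| < |b|.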

A64 Instruction

FACGT Dd,Dm,Dn

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if abs then
        element1 = FPAbs(element1);
        element2 = FPAbs(element2);
    case cmp of
        when CompareOp_EQ test_passed = FPCompareEQ(element1, element2, FPCR);
        when CompareOp_GE test_passed = FPCompareGE(element1, element2, FPCR);
        when CompareOp_GT test_passed = FPCompareGT(element1, element2, FPCR);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint8x8_t vtst_s8 (int8x8_t a, int8x8_t b)Compare bitwise test bits nonzero

Description

Compare bitwise Test bits nonzero (vector). This instruction reads each vector element in the first source SIMD&FP register, performs an AND with the corresponding vector element in the second source SIMD&FP register, and if the result is not zero, sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.

A64 Instruction

CMTST Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64
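
The test described above passes whenever the two elements share at least one set bit, regardless of the signedness of the element type. The following portable C sketch models the eight-lane form; `model_vtst_s8` and `model_vtst_s8_example` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 8-lane model of vtst_s8:
   out[i] = 0xFF if (a[i] AND b[i]) is nonzero, otherwise 0x00. */
void model_vtst_s8(const int8_t a[8], const int8_t b[8], uint8_t out[8])
{
    for (int i = 0; i < 8; ++i)
        out[i] = ((uint8_t)a[i] & (uint8_t)b[i]) ? 0xFF : 0x00;
}

/* Small usage check of the model. */
void model_vtst_s8_example(void)
{
    const int8_t a[8] = { 0x01, 0x02, 0x0F, -1,  0, 0x40, 0x55, -128 };
    const int8_t b[8] = { 0x01, 0x01,  -16,  0, -1, 0x40, 0x2A, -128 };
    const uint8_t expected[8] = { 0xFF, 0x00, 0x00, 0x00,
                                  0x00, 0xFF, 0x00, 0xFF };
    uint8_t out[8];

    model_vtst_s8(a, b, out);
    for (int i = 0; i < 8; ++i)
        assert(out[i] == expected[i]);
}
```

Lane 2 pairs 0x0F with -16 (bit pattern 0xF0) and lane 6 pairs 0x55 with 0x2A; both pairs have no bits in common, so those lanes are zero even though neither input is zero.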

uint8x16_t vtstq_s8 (int8x16_t a, int8x16_t b)Compare bitwise test bits nonzero

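Description

Compare bitwise Test bits nonzero (vector). This instruction reads each vector element in the first source SIMD&FP register, performs an AND with the corresponding vector element in the second source SIMD&FP register, and if the result is not zero, sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.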

A64 Instruction

CMTST Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vtst_s16 (int16x4_t a, int16x4_t b)Compare bitwise test bits nonzero

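Description

Compare bitwise Test bits nonzero (vector). This instruction reads each vector element in the first source SIMD&FP register, performs an AND with the corresponding vector element in the second source SIMD&FP register, and if the result is not zero, sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.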

A64 Instruction

CMTST Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vtstq_s16 (int16x8_t a, int16x8_t b)Compare bitwise test bits nonzero

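Description

Compare bitwise Test bits nonzero (vector). This instruction reads each vector element in the first source SIMD&FP register, performs an AND with the corresponding vector element in the second source SIMD&FP register, and if the result is not zero, sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.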

A64 Instruction

CMTST Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vtst_s32 (int32x2_t a, int32x2_t b)Compare bitwise test bits nonzero

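Description

Compare bitwise Test bits nonzero (vector). This instruction reads each vector element in the first source SIMD&FP register, performs an AND with the corresponding vector element in the second source SIMD&FP register, and if the result is not zero, sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.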

A64 Instruction

CMTST Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vtstq_s32 (int32x4_t a, int32x4_t b)Compare bitwise test bits nonzero

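Description

Compare bitwise Test bits nonzero (vector). This instruction reads each vector element in the first source SIMD&FP register, performs an AND with the corresponding vector element in the second source SIMD&FP register, and if the result is not zero, sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.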

A64 Instruction

CMTST Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vtst_u8 (uint8x8_t a, uint8x8_t b)Compare bitwise test bits nonzero

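Description

Compare bitwise Test bits nonzero (vector). This instruction reads each vector element in the first source SIMD&FP register, performs an AND with the corresponding vector element in the second source SIMD&FP register, and if the result is not zero, sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.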

A64 Instruction

CMTST Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vtstq_u8 (uint8x16_t a, uint8x16_t b)Compare bitwise test bits nonzero

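Description

Compare bitwise Test bits nonzero (vector). This instruction reads each vector element in the first source SIMD&FP register, performs an AND with the corresponding vector element in the second source SIMD&FP register, and if the result is not zero, sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.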

A64 Instruction

CMTST Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vtst_u16 (uint16x4_t a, uint16x4_t b)Compare bitwise test bits nonzero

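Description

Compare bitwise Test bits nonzero (vector). This instruction reads each vector element in the first source SIMD&FP register, performs an AND with the corresponding vector element in the second source SIMD&FP register, and if the result is not zero, sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.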

A64 Instruction

CMTST Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vtstq_u16 (uint16x8_t a, uint16x8_t b)Compare bitwise test bits nonzero

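Description

Compare bitwise Test bits nonzero (vector). This instruction reads each vector element in the first source SIMD&FP register, performs an AND with the corresponding vector element in the second source SIMD&FP register, and if the result is not zero, sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.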

A64 Instruction

CMTST Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vtst_u32 (uint32x2_t a, uint32x2_t b)Compare bitwise test bits nonzero

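Description

Compare bitwise Test bits nonzero (vector). This instruction reads each vector element in the first source SIMD&FP register, performs an AND with the corresponding vector element in the second source SIMD&FP register, and if the result is not zero, sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.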

A64 Instruction

CMTST Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vtstq_u32 (uint32x4_t a, uint32x4_t b)Compare bitwise test bits nonzero

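Description

Compare bitwise Test bits nonzero (vector). This instruction reads each vector element in the first source SIMD&FP register, performs an AND with the corresponding vector element in the second source SIMD&FP register, and if the result is not zero, sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.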

A64 Instruction

CMTST Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64
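
One common use of an all-ones/all-zeros result is to count or filter lanes that carry a given flag bit. The sketch below is a portable C model assuming four 32-bit lanes as in vtstq_u32; `model_vtstq_u32`, `model_count_flagged` and `model_count_flagged_example` are hypothetical names, and the counting helper is an illustrative use of the mask rather than part of the intrinsic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4-lane model of vtstq_u32:
   out[i] = all ones if (a[i] AND b[i]) is nonzero, otherwise zero. */
void model_vtstq_u32(const uint32_t a[4], const uint32_t b[4], uint32_t out[4])
{
    for (int i = 0; i < 4; ++i)
        out[i] = (a[i] & b[i]) ? UINT32_MAX : 0;
}

/* Hypothetical helper: count the lanes of values that have any bit of
   flag set. Each all-ones lane contributes exactly 1 to the count. */
unsigned model_count_flagged(const uint32_t values[4], uint32_t flag)
{
    const uint32_t flags[4] = { flag, flag, flag, flag };
    uint32_t mask[4];
    unsigned count = 0;

    model_vtstq_u32(values, flags, mask);
    for (int i = 0; i < 4; ++i)
        count += mask[i] & 1u;
    return count;
}

/* Small usage check of the model. */
void model_count_flagged_example(void)
{
    const uint32_t values[4] = { 0x5, 0x2, 0x0, 0xC };

    assert(model_count_flagged(values, 0x4) == 2);  /* lanes 0 and 3 */
    assert(model_count_flagged(values, 0x3) == 2);  /* lanes 0 and 1 */
    assert(model_count_flagged(values, 0x10) == 0); /* no lane has bit 4 */
}
```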

uint8x8_t vtst_p8 (poly8x8_t a, poly8x8_t b)Compare bitwise test bits nonzero

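Description

Compare bitwise Test bits nonzero (vector). This instruction reads each vector element in the first source SIMD&FP register, performs an AND with the corresponding vector element in the second source SIMD&FP register, and if the result is not zero, sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.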

A64 Instruction

CMTST Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vtstq_p8 (poly8x16_t a, poly8x16_t b)Compare bitwise test bits nonzero

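Description

Compare bitwise Test bits nonzero (vector). This instruction reads each vector element in the first source SIMD&FP register, performs an AND with the corresponding vector element in the second source SIMD&FP register, and if the result is not zero, sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.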

A64 Instruction

CMTST Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vtst_s64 (int64x1_t a, int64x1_t b)Compare bitwise test bits nonzero

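Description

Compare bitwise Test bits nonzero (vector). This instruction reads each vector element in the first source SIMD&FP register, performs an AND with the corresponding vector element in the second source SIMD&FP register, and if the result is not zero, sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.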

A64 Instruction

CMTST Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vtstq_s64 (int64x2_t a, int64x2_t b)Compare bitwise test bits nonzero

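Description

Compare bitwise Test bits nonzero (vector). This instruction reads each vector element in the first source SIMD&FP register, performs an AND with the corresponding vector element in the second source SIMD&FP register, and if the result is not zero, sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.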

A64 Instruction

CMTST Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x1_t vtst_u64 (uint64x1_t a, uint64x1_t b)Compare bitwise test bits nonzero

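Description

Compare bitwise Test bits nonzero (vector). This instruction reads each vector element in the first source SIMD&FP register, performs an AND with the corresponding vector element in the second source SIMD&FP register, and if the result is not zero, sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.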

A64 Instruction

CMTST Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x2_t vtstq_u64 (uint64x2_t a, uint64x2_t b)Compare bitwise test bits nonzero

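Description

Compare bitwise Test bits nonzero (vector). This instruction reads each vector element in the first source SIMD&FP register, performs an AND with the corresponding vector element in the second source SIMD&FP register, and if the result is not zero, sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.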

A64 Instruction

CMTST Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64x1_t vtst_p64 (poly64x1_t a, poly64x1_t b)Compare bitwise test bits nonzero

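Description

Compare bitwise Test bits nonzero (vector). This instruction reads each vector element in the first source SIMD&FP register, performs an AND with the corresponding vector element in the second source SIMD&FP register, and if the result is not zero, sets every bit of the corresponding vector element in the destination SIMD&FP register to one, otherwise sets every bit of the corresponding vector element in the destination SIMD&FP register to zero.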

A64 Instruction

CMTST Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A32/A64

uint64x2_t vtstq_p64 (poly64x2_t a, poly64x2_t b)Compare bitwise test bits nonzero

Description

A64 Instruction

CMTST Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A32/A64

uint64_t vtstd_s64 (int64_t a, int64_t b)Compare bitwise test bits nonzero

Description

A64 Instruction

CMTST Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64

uint64_t vtstd_u64 (uint64_t a, uint64_t b)Compare bitwise test bits nonzero

Description

A64 Instruction

CMTST Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
boolean test_passed;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if and_test then
        test_passed = !IsZero(element1 AND element2);
    else
        test_passed = (element1 == element2);
    Elem[result, e, esize] = if test_passed then Ones() else Zeros();

V[d] = result;

Supported architectures

A64
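
The operation above takes the and_test branch for CMTST, so a single 64-bit element can be modelled in portable C as below. This is a minimal sketch, not the intrinsic itself: the names model_vtstd_u64 and demo_vtstd_u64 are hypothetical, and on an AArch64 target the real call would be vtstd_u64(a, b) from <arm_neon.h>.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of CMTST on one 64-bit element:
 * all ones if (a AND b) is nonzero, otherwise all zeros. */
uint64_t model_vtstd_u64(uint64_t a, uint64_t b)
{
    return (a & b) != 0 ? UINT64_MAX : 0;
}

/* Small self-check showing the two possible results. */
void demo_vtstd_u64(void)
{
    assert(model_vtstd_u64(0x0F, 0xF0) == 0);          /* no common bits */
    assert(model_vtstd_u64(0x0F, 0x01) == UINT64_MAX); /* bit 0 is shared */
}
```

The all-ones or all-zeros result is meant to be used as a mask, for example as the selector of a bitwise select.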

int8x8_t vabd_s8 (int8x8_t a, int8x8_t b)Signed absolute difference

Description

Signed Absolute Difference. This instruction subtracts the elements of the vector of the second source SIMD&FP register from the corresponding elements of the first source SIMD&FP register, places the absolute values of the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

SABD Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64
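
The SABD operation above can be sketched as a scalar loop in portable C. The helper names model_vabd_s8 and demo_vabd_s8 are hypothetical, and arrays stand in for the int8x8_t vectors, with index 0 as lane 0. The point worth noticing is the truncation to esize bits: the absolute difference is computed at full precision, but only the low 8 bits are kept, so a difference above 127 reads back as a negative int8_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of SABD on 8 lanes of int8_t. */
void model_vabd_s8(const int8_t a[8], const int8_t b[8], int8_t out[8])
{
    for (int e = 0; e < 8; e++) {
        int diff = (int)a[e] - (int)b[e];            /* exact, range -255..255 */
        unsigned absdiff = (unsigned)(diff < 0 ? -diff : diff);
        uint8_t low = (uint8_t)(absdiff & 0xFFu);    /* <esize-1:0> in the pseudocode */
        /* Reinterpret the 8-bit pattern as a signed lane. */
        out[e] = (int8_t)(low < 128 ? low : low - 256);
    }
}

/* Small self-check, including the wrap-around case. */
void demo_vabd_s8(void)
{
    const int8_t a[8] = {10, -10, 127, -128, 0, 5, -5, 100};
    const int8_t b[8] = {3, 10, -128, 127, 0, -5, 5, 100};
    int8_t out[8];
    model_vabd_s8(a, b, out);
    assert(out[0] == 7 && out[1] == 20);
    assert(out[2] == -1); /* |127 - (-128)| = 255, stored as 0xFF */
}
```

When the full range of the difference matters, the widening form (vabdl_s8) avoids this wrap.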

int8x16_t vabdq_s8 (int8x16_t a, int8x16_t b)Signed absolute difference

Description

A64 Instruction

SABD Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vabd_s16 (int16x4_t a, int16x4_t b)Signed absolute difference

Description

A64 Instruction

SABD Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vabdq_s16 (int16x8_t a, int16x8_t b)Signed absolute difference

Description

A64 Instruction

SABD Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vabd_s32 (int32x2_t a, int32x2_t b)Signed absolute difference

Description

A64 Instruction

SABD Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vabdq_s32 (int32x4_t a, int32x4_t b)Signed absolute difference

Description

A64 Instruction

SABD Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vabd_u8 (uint8x8_t a, uint8x8_t b)Unsigned absolute difference

Description

Unsigned Absolute Difference (vector). This instruction subtracts the elements of the vector of the second source SIMD&FP register from the corresponding elements of the first source SIMD&FP register, places the absolute values of the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

UABD Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64
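
For the unsigned form, the absolute difference of two 8-bit lanes always fits in 8 bits, so no truncation effect arises. A minimal scalar sketch in portable C follows; model_vabd_u8 and demo_vabd_u8 are hypothetical names, and arrays stand in for the uint8x8_t vectors.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of UABD on 8 lanes of uint8_t. */
void model_vabd_u8(const uint8_t a[8], const uint8_t b[8], uint8_t out[8])
{
    for (int e = 0; e < 8; e++) {
        /* Subtract the smaller from the larger so the result is never negative. */
        out[e] = (uint8_t)(a[e] > b[e] ? a[e] - b[e] : b[e] - a[e]);
    }
}

/* Small self-check covering both operand orders and the extremes. */
void demo_vabd_u8(void)
{
    const uint8_t a[8] = {0, 255, 10, 3, 7, 7, 200, 1};
    const uint8_t b[8] = {255, 0, 3, 10, 7, 8, 100, 0};
    uint8_t out[8];
    model_vabd_u8(a, b, out);
    assert(out[0] == 255 && out[1] == 255);
    assert(out[2] == 7 && out[3] == 7);
}
```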

uint8x16_t vabdq_u8 (uint8x16_t a, uint8x16_t b)Unsigned absolute difference

Description

A64 Instruction

UABD Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vabd_u16 (uint16x4_t a, uint16x4_t b)Unsigned absolute difference

Description

A64 Instruction

UABD Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vabdq_u16 (uint16x8_t a, uint16x8_t b)Unsigned absolute difference

Description

A64 Instruction

UABD Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vabd_u32 (uint32x2_t a, uint32x2_t b)Unsigned absolute difference

Description

A64 Instruction

UABD Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vabdq_u32 (uint32x4_t a, uint32x4_t b)Unsigned absolute difference

Description

A64 Instruction

UABD Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vabd_f32 (float32x2_t a, float32x2_t b)Floating-point absolute difference

Description

Floating-point Absolute Difference (vector). This instruction subtracts the floating-point values in the elements of the second source SIMD&FP register from the corresponding floating-point values in the elements of the first source SIMD&FP register, places the absolute value of each result in a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

FABD Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) diff;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    diff = FPSub(element1, element2, FPCR);
    Elem[result, e, esize] = if abs then FPAbs(diff) else diff;

V[d] = result;

Supported architectures

v7/A32/A64
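
The FABD operation above is a subtraction followed by clearing the sign bit. The sketch below models one 32-bit lane in portable C. It assumes that float is IEEE 754 binary32 and that host subtraction stands in for FPSub under default FPCR settings (round to nearest, no flush-to-zero); the names model_vabds_f32 and demo_vabds_f32 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of FABD on one float lane. */
float model_vabds_f32(float a, float b)
{
    float diff = a - b;                  /* stands in for FPSub(element1, element2, FPCR) */
    uint32_t bits;
    memcpy(&bits, &diff, sizeof bits);   /* view the binary32 encoding */
    bits &= 0x7FFFFFFFu;                 /* FPAbs: clear the sign bit */
    memcpy(&diff, &bits, sizeof diff);
    return diff;
}

/* Small self-check; the operands are chosen so the subtraction is exact. */
void demo_vabds_f32(void)
{
    assert(model_vabds_f32(1.5f, 4.0f) == 2.5f);
    assert(model_vabds_f32(4.0f, 1.5f) == 2.5f);
}
```

Clearing the sign bit rather than negating conditionally also turns a difference of -0.0 into +0.0.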

float32x4_t vabdq_f32 (float32x4_t a, float32x4_t b)Floating-point absolute difference

Description

A64 Instruction

FABD Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) diff;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    diff = FPSub(element1, element2, FPCR);
    Elem[result, e, esize] = if abs then FPAbs(diff) else diff;

V[d] = result;

Supported architectures

v7/A32/A64

float64x1_t vabd_f64 (float64x1_t a, float64x1_t b)Floating-point absolute difference

Description

A64 Instruction

FABD Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) diff;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    diff = FPSub(element1, element2, FPCR);
    Elem[result, e, esize] = if abs then FPAbs(diff) else diff;

V[d] = result;

Supported architectures

A64

float64x2_t vabdq_f64 (float64x2_t a, float64x2_t b)Floating-point absolute difference

Description

A64 Instruction

FABD Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) diff;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    diff = FPSub(element1, element2, FPCR);
    Elem[result, e, esize] = if abs then FPAbs(diff) else diff;

V[d] = result;

Supported architectures

A64

float32_t vabds_f32 (float32_t a, float32_t b)Floating-point absolute difference

Description

A64 Instruction

FABD Sd,Sn,Sm

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) diff;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    diff = FPSub(element1, element2, FPCR);
    Elem[result, e, esize] = if abs then FPAbs(diff) else diff;

V[d] = result;

Supported architectures

A64

float64_t vabdd_f64 (float64_t a, float64_t b)Floating-point absolute difference

Description

A64 Instruction

FABD Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) diff;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    diff = FPSub(element1, element2, FPCR);
    Elem[result, e, esize] = if abs then FPAbs(diff) else diff;

V[d] = result;

Supported architectures

A64

int16x8_t vabdl_s8 (int8x8_t a, int8x8_t b)Signed absolute difference long

Description

Signed Absolute Difference Long. This instruction subtracts the vector elements in the lower or upper half of the second source SIMD&FP register from the corresponding vector elements of the first source SIMD&FP register, places the absolute value of the results into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements. SABDL takes its source elements from the lower half of each source register; SABDL2 takes them from the upper half.

A64 Instruction

SABDL Vd.8H,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64
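
Because the destination elements are twice as wide as the sources, the widening form can hold every possible difference. A minimal scalar sketch in portable C follows; model_vabdl_s8 and demo_vabdl_s8 are hypothetical names, and arrays stand in for the int8x8_t inputs and the int16x8_t result.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of SABDL: 8 lanes of int8_t in, 8 lanes of int16_t out. */
void model_vabdl_s8(const int8_t a[8], const int8_t b[8], int16_t out[8])
{
    for (int e = 0; e < 8; e++) {
        int diff = (int)a[e] - (int)b[e];             /* range -255..255 */
        out[e] = (int16_t)(diff < 0 ? -diff : diff);  /* at most 255, always fits */
    }
}

/* Small self-check using the largest possible difference. */
void demo_vabdl_s8(void)
{
    const int8_t a[8] = {127, -128, 0, 1, -1, 50, -50, 9};
    const int8_t b[8] = {-128, 127, 0, -1, 1, -50, 50, 9};
    int16_t out[8];
    model_vabdl_s8(a, b, out);
    assert(out[0] == 255 && out[1] == 255);
}
```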

int32x4_t vabdl_s16 (int16x4_t a, int16x4_t b)Signed absolute difference long

Description

A64 Instruction

SABDL Vd.4S,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vabdl_s32 (int32x2_t a, int32x2_t b)Signed absolute difference long

Description

A64 Instruction

SABDL Vd.2D,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vabdl_u8 (uint8x8_t a, uint8x8_t b)Unsigned absolute difference long

Description

Unsigned Absolute Difference Long. This instruction subtracts the vector elements in the lower or upper half of the second source SIMD&FP register from the corresponding vector elements of the first source SIMD&FP register, places the absolute value of the result into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements. All the values in this instruction are unsigned integer values.

A64 Instruction

UABDL Vd.8H,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64
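
The statement that all values are unsigned matters for bit patterns with the top bit set. In the sketch below (portable C, hypothetical names model_vabdl_u8 and demo_vabdl_u8, arrays standing in for the vectors), the bytes 0x80 and 0x7F differ by 1; the same bit patterns read as signed values would be -128 and 127.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of UABDL: 8 lanes of uint8_t in, 8 lanes of uint16_t out. */
void model_vabdl_u8(const uint8_t a[8], const uint8_t b[8], uint16_t out[8])
{
    for (int e = 0; e < 8; e++) {
        /* Both operands are treated as unsigned integers in 0..255. */
        out[e] = (uint16_t)(a[e] > b[e] ? a[e] - b[e] : b[e] - a[e]);
    }
}

/* Small self-check around the 0x7F/0x80 boundary. */
void demo_vabdl_u8(void)
{
    const uint8_t a[8] = {0x80, 0x7F, 0xFF, 0x00, 1, 2, 3, 4};
    const uint8_t b[8] = {0x7F, 0x80, 0x00, 0xFF, 1, 4, 3, 2};
    uint16_t out[8];
    model_vabdl_u8(a, b, out);
    assert(out[0] == 1 && out[1] == 1);
    assert(out[2] == 255 && out[3] == 255);
}
```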

uint32x4_t vabdl_u16 (uint16x4_t a, uint16x4_t b)Unsigned absolute difference long

Description

A64 Instruction

UABDL Vd.4S,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vabdl_u32 (uint32x2_t a, uint32x2_t b)Unsigned absolute difference long

Description

A64 Instruction

UABDL Vd.2D,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vabdl_high_s8 (int8x16_t a, int8x16_t b)Signed absolute difference long

Description

A64 Instruction

SABDL2 Vd.8H,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

A64
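
The _high form reads 16-lane inputs but, through Vpart[n, part], uses only the upper eight lanes of each. The scalar sketch below models that in portable C. It assumes array index 0 is lane 0, so the upper half is indices 8 to 15; model_vabdl_high_s8 and demo_vabdl_high_s8 are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of SABDL2: upper 8 of 16 int8_t lanes in, 8 int16_t lanes out. */
void model_vabdl_high_s8(const int8_t a[16], const int8_t b[16], int16_t out[8])
{
    for (int e = 0; e < 8; e++) {
        /* Lanes 0..7 are ignored; only lanes 8..15 contribute. */
        int diff = (int)a[8 + e] - (int)b[8 + e];
        out[e] = (int16_t)(diff < 0 ? -diff : diff);
    }
}

/* Small self-check: the lower halves differ by 200 but must not affect the result. */
void demo_vabdl_high_s8(void)
{
    int8_t a[16], b[16];
    int16_t out[8];
    for (int i = 0; i < 16; i++) {
        a[i] = (int8_t)(i < 8 ? 100 : i - 8);
        b[i] = (int8_t)(i < 8 ? -100 : 8 - i);
    }
    model_vabdl_high_s8(a, b, out);
    for (int e = 0; e < 8; e++)
        assert(out[e] == 2 * e);
}
```

Pairing the low-half form on lanes 0 to 7 with this form on lanes 8 to 15 covers a full 128-bit input without separate extraction steps.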

int32x4_t vabdl_high_s16 (int16x8_t a, int16x8_t b)Signed absolute difference long

Description

A64 Instruction

SABDL2 Vd.4S,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

A64

int64x2_t vabdl_high_s32 (int32x4_t a, int32x4_t b)Signed absolute difference long

Description

A64 Instruction

SABDL2 Vd.2D,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

A64

uint16x8_t vabdl_high_u8 (uint8x16_t a, uint8x16_t b)Unsigned absolute difference long

Description

A64 Instruction

UABDL2 Vd.8H,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

A64

uint32x4_t vabdl_high_u16 (uint16x8_t a, uint16x8_t b)Unsigned absolute difference long

Description

A64 Instruction

UABDL2 Vd.4S,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

A64

uint64x2_t vabdl_high_u32 (uint32x4_t a, uint32x4_t b)Unsigned absolute difference long

Description

A64 Instruction

UABDL2 Vd.2D,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

A64

int8x8_t vaba_s8 (int8x8_t a, int8x8_t b, int8x8_t c)Signed absolute difference and accumulate

Description

Signed Absolute difference and Accumulate. This instruction subtracts the elements of the vector of the second source SIMD&FP register from the corresponding elements of the first source SIMD&FP register, and accumulates the absolute values of the results into the elements of the vector of the destination SIMD&FP register.

A64 Instruction

SABA Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8B 

b → Vn.8B 

c → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64
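
For the accumulate form, the argument mapping above means that a is the running accumulator while b and c are the two values being compared. A minimal scalar sketch in portable C follows; model_vaba_s8 and demo_vaba_s8 are hypothetical names, and arrays stand in for the int8x8_t vectors. As in the pseudocode, the addition is done on esize-bit patterns, so the sum wraps modulo 256.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of SABA on 8 lanes: out = a + |b - c|, modulo 2^8. */
void model_vaba_s8(const int8_t a[8], const int8_t b[8], const int8_t c[8],
                   int8_t out[8])
{
    for (int e = 0; e < 8; e++) {
        int diff = (int)b[e] - (int)c[e];
        int absdiff = diff < 0 ? -diff : diff;           /* 0..255 */
        /* Add on the 8-bit patterns and keep the low 8 bits. */
        unsigned sum = ((unsigned)(uint8_t)a[e] + (unsigned)absdiff) & 0xFFu;
        out[e] = (int8_t)(sum < 128 ? (int)sum : (int)sum - 256);
    }
}

/* Small self-check, including a lane whose sum wraps. */
void demo_vaba_s8(void)
{
    const int8_t a[8] = {1, 100, -1, -10, 0, 0, 0, 0};
    const int8_t b[8] = {10, 50, 0, 5, 127, 0, 0, 0};
    const int8_t c[8] = {3, -50, 1, 5, -128, 0, 0, 0};
    int8_t out[8];
    model_vaba_s8(a, b, c, out);
    assert(out[0] == 8);   /* 1 + |10 - 3| */
    assert(out[1] == -56); /* 100 + 100 = 200, stored as 0xC8 */
}
```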

int8x16_t vabaq_s8 (int8x16_t a, int8x16_t b, int8x16_t c)Signed absolute difference and accumulate

Description

A64 Instruction

SABA Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vd.16B 

b → Vn.16B 

c → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vaba_s16 (int16x4_t a, int16x4_t b, int16x4_t c)Signed absolute difference and accumulate

Description

A64 Instruction

SABA Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vd.4H 

b → Vn.4H 

c → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vabaq_s16 (int16x8_t a, int16x8_t b, int16x8_t c)Signed absolute difference and accumulate

Description

A64 Instruction

SABA Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vd.8H 

b → Vn.8H 

c → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vaba_s32 (int32x2_t a, int32x2_t b, int32x2_t c)Signed absolute difference and accumulate

Description

A64 Instruction

SABA Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vd.2S 

b → Vn.2S 

c → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vabaq_s32 (int32x4_t a, int32x4_t b, int32x4_t c)Signed absolute difference and accumulate

Description

A64 Instruction

SABA Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vd.4S 

b → Vn.4S 

c → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vaba_u8 (uint8x8_t a, uint8x8_t b, uint8x8_t c)Unsigned absolute difference and accumulate

Description

Unsigned Absolute difference and Accumulate. This instruction subtracts the elements of the vector of the second source SIMD&FP register from the corresponding elements of the first source SIMD&FP register, and accumulates the absolute values of the results into the elements of the vector of the destination SIMD&FP register.

A64 Instruction

UABA Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8B 

b → Vn.8B 

c → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vabaq_u8 (uint8x16_t a, uint8x16_t b, uint8x16_t c)Unsigned absolute difference and accumulate

Description

A64 Instruction

UABA Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vd.16B 

b → Vn.16B 

c → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vaba_u16 (uint16x4_t a, uint16x4_t b, uint16x4_t c)Unsigned absolute difference and accumulate

Description

A64 Instruction

UABA Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vd.4H 

b → Vn.4H 

c → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vabaq_u16 (uint16x8_t a, uint16x8_t b, uint16x8_t c)Unsigned absolute difference and accumulate

Description

A64 Instruction

UABA Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vd.8H 

b → Vn.8H 

c → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vaba_u32 (uint32x2_t a, uint32x2_t b, uint32x2_t c)Unsigned absolute difference and accumulate

Description

A64 Instruction

UABA Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vd.2S 

b → Vn.2S 

c → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vabaq_u32 (uint32x4_t a, uint32x4_t b, uint32x4_t c)Unsigned absolute difference and accumulate

Description

A64 Instruction

UABA Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vd.4S 

b → Vn.4S 

c → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
bits(esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<esize-1:0>;
    Elem[result, e, esize] = Elem[result, e, esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vabal_s8 (int16x8_t a, int8x8_t b, int8x8_t c)Signed absolute difference and accumulate long

Description

Signed Absolute difference and Accumulate Long. This instruction subtracts the vector elements in the lower or upper half of the second source SIMD&FP register from the corresponding vector elements of the first source SIMD&FP register, and accumulates the absolute values of the results into the vector elements of the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements. The SABAL form used by this intrinsic reads the lower half of each source register; SABAL2 reads the upper half.
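
Because the destination lanes are twice as wide, every individual absolute difference is exact: the largest 8-bit signed case, |-128 - 127| = 255, fits comfortably in 16 bits. The sketch below is a portable C model of the lower-half form on eight lanes. The name sabal_s8_model is hypothetical and not part of arm_neon.h. Only the running accumulator can still wrap, which the model reproduces with explicit modular arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of vabal_s8 (SABAL Vd.8H,Vn.8B,Vm.8B). */
static void sabal_s8_model(int16_t acc[8], const int8_t b[8],
                           const int8_t c[8])
{
    assert(acc != NULL && b != NULL && c != NULL);
    for (int e = 0; e < 8; ++e) {
        int diff = (int)b[e] - (int)c[e];        /* exact, in [-255, 255] */
        int absdiff = diff < 0 ? -diff : diff;
        /* Add modulo 2^16 on the bit pattern of the accumulator lane. */
        uint16_t sum = (uint16_t)((uint16_t)acc[e] + (uint16_t)absdiff);
        /* Reinterpret the 16-bit pattern as a signed value without
         * relying on an out-of-range conversion. */
        acc[e] = (int16_t)(sum >= 0x8000u ? (int)sum - 0x10000 : (int)sum);
    }
}
```

A lane that already holds 32767 and accumulates a difference of 1 wraps to -32768, so long accumulation chains still need a wider final reduction.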

A64 Instruction

SABAL Vd.8H,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8H 

b → Vn.8B 

c → Vm.8B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vabal_s16 (int32x4_t a, int16x4_t b, int16x4_t c)Signed absolute difference and accumulate long

Description

A64 Instruction

SABAL Vd.4S,Vn.4H,Vm.4H

Argument Preparation

a → Vd.4S 

b → Vn.4H 

c → Vm.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vabal_s32 (int64x2_t a, int32x2_t b, int32x2_t c)Signed absolute difference and accumulate long

Description

A64 Instruction

SABAL Vd.2D,Vn.2S,Vm.2S

Argument Preparation

a → Vd.2D 

b → Vn.2S 

c → Vm.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vabal_u8 (uint16x8_t a, uint8x8_t b, uint8x8_t c)Unsigned absolute difference and accumulate long

Description

Unsigned Absolute difference and Accumulate Long. This instruction subtracts the vector elements in the lower or upper half of the second source SIMD&FP register from the corresponding vector elements of the first source SIMD&FP register, and accumulates the absolute values of the results into the vector elements of the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements. All the values in this instruction are unsigned integer values. The UABAL form used by this intrinsic reads the lower half of each source register; UABAL2 reads the upper half.
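
A typical use of the widening form is a sum of absolute differences (SAD) over two 16-byte blocks: the lower halves are accumulated with UABAL, the upper halves with UABAL2, and the eight 16-bit lanes are then added together. The portable C sketch below models this with plain arrays viewed as 128-bit registers. The names uabal_part_model and sad16_model are hypothetical, and the part argument mirrors Vpart[n, part] in the Operation pseudocode.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of UABAL (part = 0) and UABAL2 (part = 1)
 * with 8-bit source elements and 16-bit accumulator lanes. */
static void uabal_part_model(uint16_t acc[8], const uint8_t n[16],
                             const uint8_t m[16], int part)
{
    assert(part == 0 || part == 1);
    const uint8_t *b = n + 8 * part;   /* Vpart[n, part] */
    const uint8_t *c = m + 8 * part;   /* Vpart[m, part] */
    for (int e = 0; e < 8; ++e) {
        int diff = (int)b[e] - (int)c[e];
        acc[e] = (uint16_t)(acc[e] + (diff < 0 ? -diff : diff));
    }
}

/* Sum of absolute differences of two 16-byte blocks. */
static uint32_t sad16_model(const uint8_t x[16], const uint8_t y[16])
{
    uint16_t acc[8] = {0};
    uabal_part_model(acc, x, y, 0);    /* lower halves */
    uabal_part_model(acc, x, y, 1);    /* upper halves */
    uint32_t sum = 0;
    for (int e = 0; e < 8; ++e)
        sum += acc[e];                 /* horizontal add of the lanes */
    return sum;
}
```

Each 16-bit lane receives at most two differences of 255 here, so no lane can overflow for a single pair of blocks.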

A64 Instruction

UABAL Vd.8H,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8H 

b → Vn.8B 

c → Vm.8B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vabal_u16 (uint32x4_t a, uint16x4_t b, uint16x4_t c)Unsigned absolute difference and accumulate long

Description

A64 Instruction

UABAL Vd.4S,Vn.4H,Vm.4H

Argument Preparation

a → Vd.4S 

b → Vn.4H 

c → Vm.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vabal_u32 (uint64x2_t a, uint32x2_t b, uint32x2_t c)Unsigned absolute difference and accumulate long

Description

A64 Instruction

UABAL Vd.2D,Vn.2S,Vm.2S

Argument Preparation

a → Vd.2D 

b → Vn.2S 

c → Vm.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vabal_high_s8 (int16x8_t a, int8x16_t b, int8x16_t c)Signed absolute difference and accumulate long

Description

A64 Instruction

SABAL2 Vd.8H,Vn.16B,Vm.16B

Argument Preparation

a → Vd.8H 

b → Vn.16B 

c → Vm.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

A64

int32x4_t vabal_high_s16 (int32x4_t a, int16x8_t b, int16x8_t c)Signed absolute difference and accumulate long

Description

A64 Instruction

SABAL2 Vd.4S,Vn.8H,Vm.8H

Argument Preparation

a → Vd.4S 

b → Vn.8H 

c → Vm.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

A64

int64x2_t vabal_high_s32 (int64x2_t a, int32x4_t b, int32x4_t c)Signed absolute difference and accumulate long

Description

A64 Instruction

SABAL2 Vd.2D,Vn.4S,Vm.4S

Argument Preparation

a → Vd.2D 

b → Vn.4S 

c → Vm.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

A64

uint16x8_t vabal_high_u8 (uint16x8_t a, uint8x16_t b, uint8x16_t c)Unsigned absolute difference and accumulate long

Description

A64 Instruction

UABAL2 Vd.8H,Vn.16B,Vm.16B

Argument Preparation

a → Vd.8H 

b → Vn.16B 

c → Vm.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

A64

uint32x4_t vabal_high_u16 (uint32x4_t a, uint16x8_t b, uint16x8_t c)Unsigned absolute difference and accumulate long

Description

A64 Instruction

UABAL2 Vd.4S,Vn.8H,Vm.8H

Argument Preparation

a → Vd.4S 

b → Vn.8H 

c → Vm.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

A64

uint64x2_t vabal_high_u32 (uint64x2_t a, uint32x4_t b, uint32x4_t c)Unsigned absolute difference and accumulate long

Description

A64 Instruction

UABAL2 Vd.2D,Vn.4S,Vm.4S

Argument Preparation

a → Vd.2D 

b → Vn.4S 

c → Vm.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) absdiff;

result = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    absdiff = Abs(element1-element2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + absdiff;
V[d] = result;

Supported architectures

A64

int8x8_t vmax_s8 (int8x8_t a, int8x8_t b)Signed maximum

Description

Signed Maximum (vector). This instruction compares corresponding elements in the vectors in the two source SIMD&FP registers, places the larger of each pair of signed integer values into a vector, and writes the vector to the destination SIMD&FP register.
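
One common use of a signed lane-wise maximum is clamping negative values to zero by taking the maximum with an all-zero vector. The portable C sketch below models the eight 8-bit lanes with plain arrays. The name smax_s8x8_model is hypothetical and not part of arm_neon.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of vmax_s8 (SMAX Vd.8B,Vn.8B,Vm.8B). */
static void smax_s8x8_model(int8_t r[8], const int8_t a[8],
                            const int8_t b[8])
{
    assert(r != NULL && a != NULL && b != NULL);
    for (int e = 0; e < 8; ++e) {
        /* The comparison is on signed values, so -1 (bit pattern 0xFF)
         * is smaller than 1. */
        r[e] = a[e] > b[e] ? a[e] : b[e];
    }
}
```

Calling the model with b set to all zeros leaves non-negative lanes unchanged and replaces negative lanes with 0.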

A64 Instruction

SMAX Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int8x16_t vmaxq_s8 (int8x16_t a, int8x16_t b)Signed maximum

Description

A64 Instruction

SMAX Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vmax_s16 (int16x4_t a, int16x4_t b)Signed maximum

Description

A64 Instruction

SMAX Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vmaxq_s16 (int16x8_t a, int16x8_t b)Signed maximum

Description

A64 Instruction

SMAX Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vmax_s32 (int32x2_t a, int32x2_t b)Signed maximum

Description

A64 Instruction

SMAX Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vmaxq_s32 (int32x4_t a, int32x4_t b)Signed maximum

Description

A64 Instruction

SMAX Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vmax_u8 (uint8x8_t a, uint8x8_t b)Unsigned maximum

Description

Unsigned Maximum (vector). This instruction compares corresponding elements in the vectors in the two source SIMD&FP registers, places the larger of each pair of unsigned integer values into a vector, and writes the vector to the destination SIMD&FP register.
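
The signed and unsigned maximum differ only in how each element's bits are converted to an integer before comparison, which is what Int(..., unsigned) expresses in the Operation pseudocode. The portable C sketch below makes that explicit by working on raw 8-bit patterns with a flag. The names int_of_bits8 and max_8x8_model are hypothetical and not part of arm_neon.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Int(bits, unsigned) for an 8-bit element. */
static int int_of_bits8(uint8_t bits, bool is_unsigned)
{
    return (is_unsigned || bits < 0x80) ? (int)bits : (int)bits - 256;
}

/* Hypothetical model of UMAX (is_unsigned = true) and SMAX
 * (is_unsigned = false) on eight 8-bit lanes given as bit patterns. */
static void max_8x8_model(uint8_t r[8], const uint8_t a[8],
                          const uint8_t b[8], bool is_unsigned)
{
    assert(r != NULL && a != NULL && b != NULL);
    for (int e = 0; e < 8; ++e) {
        int x = int_of_bits8(a[e], is_unsigned);
        int y = int_of_bits8(b[e], is_unsigned);
        int m = x > y ? x : y;
        r[e] = (uint8_t)m;             /* maxmin<esize-1:0> */
    }
}
```

For the bit patterns 0xFF and 0x01 the unsigned form returns 0xFF (255 is larger than 1), while the signed form returns 0x01 (-1 is smaller than 1).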

A64 Instruction

UMAX Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vmaxq_u8 (uint8x16_t a, uint8x16_t b)Unsigned maximum

Description

A64 Instruction

UMAX Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vmax_u16 (uint16x4_t a, uint16x4_t b)Unsigned maximum

Description

A64 Instruction

UMAX Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vmaxq_u16 (uint16x8_t a, uint16x8_t b)Unsigned maximum

Description

A64 Instruction

UMAX Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vmax_u32 (uint32x2_t a, uint32x2_t b)Unsigned maximum

Description

A64 Instruction

UMAX Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vmaxq_u32 (uint32x4_t a, uint32x4_t b)Unsigned maximum

Description

A64 Instruction

UMAX Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vmax_f32 (float32x2_t a, float32x2_t b)Floating-point maximum

Description

Floating-point Maximum (vector). This instruction compares corresponding vector elements in the two source SIMD&FP registers, places the larger of each pair of floating-point values into a vector, and writes the vector to the destination SIMD&FP register.
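
FMAX differs from the C library function fmaxf in two corner cases: a NaN in either operand produces a NaN result (fmaxf returns the other operand), and +0.0 is treated as larger than -0.0. The portable C sketch below models these rules for two lanes. The names fmax_lane_model and fmax_f32x2_model are hypothetical, and the sketch assumes the default FPCR behaviour (FPCR.AH = 0) while ignoring NaN payloads, flush-to-zero and exception flags.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical model of FPMax for one single-precision lane. */
static float fmax_lane_model(float x, float y)
{
    if (isnan(x) || isnan(y))
        return NAN;                    /* any NaN input gives a NaN result */
    if (x == 0.0f && y == 0.0f)        /* result is -0.0 only if both are */
        return (signbit(x) && signbit(y)) ? -0.0f : 0.0f;
    return x > y ? x : y;
}

/* Hypothetical scalar model of vmax_f32 (FMAX Vd.2S,Vn.2S,Vm.2S). */
static void fmax_f32x2_model(float r[2], const float a[2], const float b[2])
{
    assert(r != NULL && a != NULL && b != NULL);
    for (int e = 0; e < 2; ++e)
        r[e] = fmax_lane_model(a[e], b[e]);
}
```

Code that needs the fmaxf-style rule, where a single NaN operand is ignored, would look at the FMAXNM family of instructions instead.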

A64 Instruction

FMAX Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMin(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMax(element1, element2, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64

float32x4_t vmaxq_f32 (float32x4_t a, float32x4_t b)Floating-point maximum

Description

A64 Instruction

FMAX Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMin(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMax(element1, element2, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64

float64x1_t vmax_f64 (float64x1_t a, float64x1_t b)Floating-point maximum

Description

A64 Instruction

FMAX Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMin(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMax(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vmaxq_f64 (float64x2_t a, float64x2_t b)Floating-point maximum

Description

A64 Instruction

FMAX Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMin(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMax(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

int8x8_t vmin_s8 (int8x8_t a, int8x8_t b)Signed minimum

Description

Signed Minimum (vector). This instruction compares corresponding elements in the vectors in the two source SIMD&FP registers, places the smaller of each pair of signed integer values into a vector, and writes the vector to the destination SIMD&FP register.
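
Signed minimum is often paired with signed maximum to clamp lanes to a range, computed per lane as min(max(x, lo), hi). The portable C sketch below models that pattern on eight 8-bit lanes. The names smin_lane, smax_lane and clamp_s8x8_model are hypothetical and not part of arm_neon.h, and the scalar bounds lo and hi stand in for vectors with the same value in every lane.

```c
#include <assert.h>
#include <stdint.h>

/* One lane of SMIN and SMAX. */
static int8_t smin_lane(int8_t a, int8_t b) { return a < b ? a : b; }
static int8_t smax_lane(int8_t a, int8_t b) { return a > b ? a : b; }

/* Hypothetical clamp of eight signed 8-bit lanes to [lo, hi]. */
static void clamp_s8x8_model(int8_t r[8], const int8_t x[8],
                             int8_t lo, int8_t hi)
{
    assert(lo <= hi);
    for (int e = 0; e < 8; ++e) {
        /* Raise values below lo first, then cap values above hi. */
        r[e] = smin_lane(smax_lane(x[e], lo), hi);
    }
}
```

With lo = -10 and hi = 10, lanes outside the range are replaced by the nearer bound and lanes inside it pass through unchanged.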

A64 Instruction

SMIN Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int8x16_t vminq_s8 (int8x16_t a, int8x16_t b)Signed minimum

Description

A64 Instruction

SMIN Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vmin_s16 (int16x4_t a, int16x4_t b)Signed minimum

Description

A64 Instruction

SMIN Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vminq_s16 (int16x8_t a, int16x8_t b)Signed minimum

Description

A64 Instruction

SMIN Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vmin_s32 (int32x2_t a, int32x2_t b)Signed minimum

Description

A64 Instruction

SMIN Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vminq_s32 (int32x4_t a, int32x4_t b)Signed minimum

Description

A64 Instruction

SMIN Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vmin_u8 (uint8x8_t a, uint8x8_t b)Unsigned minimum

Description

Unsigned Minimum (vector). This instruction compares corresponding vector elements in the two source SIMD&FP registers, places the smaller of each pair of unsigned integer values into a vector, and writes the vector to the destination SIMD&FP register.
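
Unsigned minimum and maximum together give an absolute difference that never leaves the element width, because max(a, b) - min(a, b) cannot underflow. The portable C sketch below shows the identity on eight 8-bit lanes. The name uabd_via_minmax_model is hypothetical and not part of arm_neon.h; it is only an illustration of the arithmetic, not a claim about how any library implements it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: |a - b| per lane from unsigned max and min. */
static void uabd_via_minmax_model(uint8_t r[8], const uint8_t a[8],
                                  const uint8_t b[8])
{
    assert(r != NULL && a != NULL && b != NULL);
    for (int e = 0; e < 8; ++e) {
        uint8_t hi = a[e] > b[e] ? a[e] : b[e];   /* UMAX lane */
        uint8_t lo = a[e] < b[e] ? a[e] : b[e];   /* UMIN lane */
        r[e] = (uint8_t)(hi - lo);                /* hi >= lo, no wrap */
    }
}
```

The result is the same whichever operand is larger, so 0 and 255 give 255 in either order.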

A64 Instruction

UMIN Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vminq_u8 (uint8x16_t a, uint8x16_t b)Unsigned minimum

Description

A64 Instruction

UMIN Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vmin_u16 (uint16x4_t a, uint16x4_t b)Unsigned minimum

Description

A64 Instruction

UMIN Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vminq_u16 (uint16x8_t a, uint16x8_t b)Unsigned minimum

Description

A64 Instruction

UMIN Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vmin_u32 (uint32x2_t a, uint32x2_t b)Unsigned minimum

Description

A64 Instruction

UMIN Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vminq_u32 (uint32x4_t a, uint32x4_t b)Unsigned minimum

Description

A64 Instruction

UMIN Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vmin_f32 (float32x2_t a, float32x2_t b)Floating-point minimum

Description

Floating-point minimum (vector). This instruction compares corresponding elements in the vectors in the two source SIMD&FP registers, places the smaller of each of the two floating-point values into a vector, and writes the vector to the destination SIMD&FP register.
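
As a rough guide to what "smaller" means for floating-point lanes, the following C sketch models a single FMIN lane. It is a simplification under stated assumptions: default FPCR settings, no exception flags, and NaN payloads not preserved. The name `fmin_lane_model` is hypothetical.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical scalar model of one FMIN lane (simplified). */
static float fmin_lane_model(float x, float y)
{
    if (isnan(x) || isnan(y))
        return NAN;                 /* any NaN operand gives a NaN result */
    if (x == 0.0f && y == 0.0f)     /* -0.0 is treated as smaller than +0.0 */
        return (signbit(x) || signbit(y)) ? -0.0f : 0.0f;
    return (x < y) ? x : y;
}

/* Small self-check that doubles as a usage example. */
static void fmin_lane_model_selfcheck(void)
{
    assert(fmin_lane_model(1.5f, -2.0f) == -2.0f);
    assert(isnan(fmin_lane_model(1.0f, NAN)));
    assert(signbit(fmin_lane_model(0.0f, -0.0f)));
}
```

The NaN behaviour is the main difference from the "minimum number" intrinsics (`vminnm_f32` and related), which return the numeric operand when the other is a quiet NaN.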

A64 Instruction

FMIN Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMin(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMax(element1, element2, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64

float32x4_t vminq_f32 (float32x4_t a, float32x4_t b)Floating-point minimum

Description

A64 Instruction

FMIN Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMin(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMax(element1, element2, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64

float64x1_t vmin_f64 (float64x1_t a, float64x1_t b)Floating-point minimum

Description

A64 Instruction

FMIN Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMin(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMax(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vminq_f64 (float64x2_t a, float64x2_t b)Floating-point minimum

Description

A64 Instruction

FMIN Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMin(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMax(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32x2_t vmaxnm_f32 (float32x2_t a, float32x2_t b)Floating-point maximum number

Description

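Floating-point Maximum Number (vector). This instruction compares corresponding vector elements in the two source SIMD&FP registers, places the larger of each pair of floating-point values into a vector, and writes the vector to the destination SIMD&FP register. NaNs are handled according to the IEEE 754-2008 standard: if one element is numeric and the other is a quiet NaN, the result is the numeric value.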
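
The difference between "maximum number" and a plain maximum lies in NaN handling, which the following C sketch tries to make concrete for a single lane. It assumes every NaN is a quiet NaN and ignores FPCR modes and exception flags. The name `fmaxnm_lane_model` is hypothetical.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical scalar model of one FMAXNM lane, quiet NaNs only. */
static float fmaxnm_lane_model(float x, float y)
{
    if (isnan(x) && isnan(y))
        return NAN;                 /* nothing numeric to return */
    if (isnan(x))
        return y;                   /* quiet NaN loses to a number */
    if (isnan(y))
        return x;
    if (x == 0.0f && y == 0.0f)     /* +0.0 is treated as larger than -0.0 */
        return (signbit(x) && signbit(y)) ? -0.0f : 0.0f;
    return (x > y) ? x : y;
}

/* Small self-check that doubles as a usage example. */
static void fmaxnm_lane_model_selfcheck(void)
{
    assert(fmaxnm_lane_model(NAN, 3.0f) == 3.0f);
    assert(fmaxnm_lane_model(-1.0f, NAN) == -1.0f);
    assert(isnan(fmaxnm_lane_model(NAN, NAN)));
    assert(!signbit(fmaxnm_lane_model(-0.0f, 0.0f)));
}
```

This makes the intrinsic a reasonable fit for data where NaN marks a missing sample, since one missing operand does not poison the lane.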

A64 Instruction

FMAXNM Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR);

V[d] = result;

Supported architectures

A32/A64

float32x4_t vmaxnmq_f32 (float32x4_t a, float32x4_t b)Floating-point maximum number

Description

A64 Instruction

FMAXNM Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR);

V[d] = result;

Supported architectures

A32/A64

float64x1_t vmaxnm_f64 (float64x1_t a, float64x1_t b)Floating-point maximum number

Description

A64 Instruction

FMAXNM Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vmaxnmq_f64 (float64x2_t a, float64x2_t b)Floating-point maximum number

Description

A64 Instruction

FMAXNM Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32x2_t vminnm_f32 (float32x2_t a, float32x2_t b)Floating-point minimum number

Description

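Floating-point Minimum Number (vector). This instruction compares corresponding vector elements in the two source SIMD&FP registers, places the smaller of each pair of floating-point values into a vector, and writes the vector to the destination SIMD&FP register. NaNs are handled according to the IEEE 754-2008 standard: if one element is numeric and the other is a quiet NaN, the result is the numeric value.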
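
One way to picture the "minimum number" behaviour is a reduction that skips NaN entries. The C sketch below is a scalar illustration only: it assumes quiet NaNs, ignores FPCR modes, and uses hypothetical names (`fminnm_lane_model`, `min_ignoring_nan`). A vectorised version would apply `vminnmq_f32` to four lanes at a time.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical scalar model of one FMINNM lane, quiet NaNs only. */
static float fminnm_lane_model(float x, float y)
{
    if (isnan(x))
        return y;                   /* also covers the case where both are NaN */
    if (isnan(y))
        return x;
    if (x == 0.0f && y == 0.0f)     /* -0.0 is treated as smaller than +0.0 */
        return (signbit(x) || signbit(y)) ? -0.0f : 0.0f;
    return (x < y) ? x : y;
}

/* Minimum of the non-NaN samples; NaN when there are none. */
static float min_ignoring_nan(const float *samples, size_t n)
{
    float best = NAN;               /* NaN acts as "no value yet" */
    for (size_t i = 0; i < n; ++i)
        best = fminnm_lane_model(best, samples[i]);
    return best;
}

/* Small self-check that doubles as a usage example. */
static void min_ignoring_nan_selfcheck(void)
{
    const float mixed[4] = {3.0f, NAN, -1.5f, 7.0f};
    const float empty[2] = {NAN, NAN};
    assert(min_ignoring_nan(mixed, 4) == -1.5f);
    assert(isnan(min_ignoring_nan(empty, 2)));
}
```

Starting the accumulator at NaN works here only because a NaN operand is replaced by the numeric one; the same loop built on the plain minimum would return NaN for every input.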

A64 Instruction

FMINNM Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR);

V[d] = result;

Supported architectures

A32/A64

float32x4_t vminnmq_f32 (float32x4_t a, float32x4_t b)Floating-point minimum number

Description

A64 Instruction

FMINNM Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR);

V[d] = result;

Supported architectures

A32/A64

float64x1_t vminnm_f64 (float64x1_t a, float64x1_t b)Floating-point minimum number

Description

A64 Instruction

FMINNM Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vminnmq_f64 (float64x2_t a, float64x2_t b)Floating-point minimum number

Description

A64 Instruction

FMINNM Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

int8x8_t vshl_s8 (int8x8_t a, int8x8_t b)Signed shift left

Description

Signed Shift Left (register). This instruction takes each signed integer value in the vector of the first source SIMD&FP register, shifts each value by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift. If the shift value is negative, it is a truncating right shift.
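
Since the shift amount is itself a signed value, a negative amount turns the operation into a right shift, as the `<< shift` step in the Operation pseudocode implies. The C sketch below models one 8-bit lane under that reading. The name `sshl_lane_s8` is hypothetical, and the code avoids implementation-defined shifts of negative values on purpose.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one SSHL lane on 8-bit elements. */
static int8_t sshl_lane_s8(int8_t value, int8_t shift)
{
    if (shift >= 8)
        return 0;                           /* every bit is shifted out */
    if (shift >= 0) {
        /* Left shift, keeping only the low 8 bits (no saturation). */
        unsigned u = ((unsigned)(uint8_t)value << shift) & 0xFFu;
        return (u >= 128u) ? (int8_t)((int)u - 256) : (int8_t)u;
    }
    if (shift <= -8)
        return (value < 0) ? -1 : 0;        /* only the sign remains */
    /* Truncating right shift, written so negative values round toward
       minus infinity without relying on implementation-defined >>. */
    int n = -shift;
    return (int8_t)((value < 0) ? ~(~value >> n) : (value >> n));
}

/* Small self-check that doubles as a usage example. */
static void sshl_lane_s8_selfcheck(void)
{
    assert(sshl_lane_s8(3, 2) == 12);       /* plain left shift */
    assert(sshl_lane_s8(-16, -2) == -4);    /* negative amount shifts right */
    assert(sshl_lane_s8(64, 1) == -128);    /* overflow wraps, no saturation */
    assert(sshl_lane_s8(-7, -1) == -4);     /* truncation rounds down */
}
```

The wrap-around in the third check is what distinguishes this intrinsic from the saturating `vqshl_s8` family.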

A64 Instruction

SSHL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int8x16_t vshlq_s8 (int8x16_t a, int8x16_t b)Signed shift left

Description

A64 Instruction

SSHL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vshl_s16 (int16x4_t a, int16x4_t b)Signed shift left

Description

A64 Instruction

SSHL Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vshlq_s16 (int16x8_t a, int16x8_t b)Signed shift left

Description

A64 Instruction

SSHL Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vshl_s32 (int32x2_t a, int32x2_t b)Signed shift left

Description

A64 Instruction

SSHL Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vshlq_s32 (int32x4_t a, int32x4_t b)Signed shift left

Description

A64 Instruction

SSHL Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vshl_s64 (int64x1_t a, int64x1_t b)Signed shift left

Description

A64 Instruction

SSHL Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vshlq_s64 (int64x2_t a, int64x2_t b)Signed shift left

Description

A64 Instruction

SSHL Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vshl_u8 (uint8x8_t a, int8x8_t b)Unsigned shift left

Description

Unsigned Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts each element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift. If the shift value is negative, it is a truncating right shift.
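
The phrase "least significant byte of the corresponding element" matters for lanes wider than 8 bits: the upper bytes of the shift element are ignored. The C sketch below models one 32-bit lane to illustrate this. The name `ushl_lane_u32` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one USHL lane on 32-bit elements. */
static uint32_t ushl_lane_u32(uint32_t value, int32_t shift_elem)
{
    /* Only the low byte of the shift element is used, read as signed. */
    int low = (int)((uint32_t)shift_elem & 0xFFu);
    int shift = (low >= 128) ? low - 256 : low;

    if (shift >= 32 || shift <= -32)
        return 0;                           /* every bit is shifted out */
    return (shift >= 0) ? (value << shift) : (value >> -shift);
}

/* Small self-check that doubles as a usage example. */
static void ushl_lane_u32_selfcheck(void)
{
    assert(ushl_lane_u32(1u, 4) == 16u);            /* left shift */
    assert(ushl_lane_u32(0x80000000u, -31) == 1u);  /* logical right shift */
    assert(ushl_lane_u32(1u, 0x100) == 1u);         /* low byte is 0: no shift */
    assert(ushl_lane_u32(0xF0u, 0x1FC) == 0x0Fu);   /* low byte 0xFC means -4 */
}
```

A caller that computes shift amounts in 32-bit arithmetic therefore needs to keep them within the signed-byte range, or the instruction will act on a different amount than intended.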

A64 Instruction

USHL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vshlq_u8 (uint8x16_t a, int8x16_t b)Unsigned shift left

Description

A64 Instruction

USHL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vshl_u16 (uint16x4_t a, int16x4_t b)Unsigned shift left

Description

A64 Instruction

USHL Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vshlq_u16 (uint16x8_t a, int16x8_t b)Unsigned shift left

Description

A64 Instruction

USHL Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vshl_u32 (uint32x2_t a, int32x2_t b)Unsigned shift left

Description

A64 Instruction

USHL Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vshlq_u32 (uint32x4_t a, int32x4_t b)Unsigned shift left

Description

A64 Instruction

USHL Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vshl_u64 (uint64x1_t a, int64x1_t b)Unsigned shift left

Description

A64 Instruction

USHL Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vshlq_u64 (uint64x2_t a, int64x2_t b)Unsigned shift left

Description

A64 Instruction

USHL Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64_t vshld_s64 (int64_t a, int64_t b)Signed shift left

Description

A64 Instruction

SSHL Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint64_t vshld_u64 (uint64_t a, int64_t b)Unsigned shift left

Description

A64 Instruction

USHL Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64
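
Note that the data operand is unsigned while the shift operand `b` stays signed. As a rough illustration of that asymmetry, here is a small portable C sketch of the behaviour the pseudocode describes when `unsigned` is true and `rounding` and `saturating` are false. `model_vshld_u64` is a hypothetical name, and the code is a host-side model rather than a use of `<arm_neon.h>`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vshld_u64 (USHL Dd,Dn,Dm). */
static uint64_t model_vshld_u64(uint64_t a, int64_t b)
{
    /* The shift amount is the low byte of b, read as a signed value. */
    int shift = (int)(b & 0xff);
    if (shift >= 128)
        shift -= 256;
    assert(shift >= -128 && shift <= 127);

    /* Shifting by the full width or more, in either direction, leaves 0. */
    if (shift >= 64 || shift <= -64)
        return 0;
    /* Positive: logical left shift. Negative: logical right shift. */
    return shift >= 0 ? a << shift : a >> -shift;
}
```

Unlike the signed form, a right shift here fills with zeros, so a large negative shift always yields 0.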

int8x8_t vqshl_s8 (int8x8_t a, int8x8_t b)Signed saturating shift left

Description

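Signed saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts each element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.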

A64 Instruction

SQSHL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
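
To make the saturating path of the pseudocode concrete, the sketch below models one 8-bit lane and then applies it to all eight lanes. It is a portable C approximation under stated assumptions, not the intrinsic itself: `model_sqshl_lane_s8` and `model_vqshl_s8` are hypothetical names, the vectors are plain arrays instead of `int8x8_t`, and a caller-supplied `bool` stands in for the sticky FPSR.QC bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one SQSHL lane with esize = 8.
   *qc stands in for FPSR.QC: it is set on saturation and never cleared. */
static int8_t model_sqshl_lane_s8(int8_t a, int8_t b, bool *qc)
{
    assert(qc != NULL);
    int shift = b;      /* for 8-bit lanes the whole element is the shift */
    int wide;           /* exact result before SatQ */

    if (shift >= 8) {   /* any non-zero input overflows 8 bits */
        if (a == 0)
            return 0;
        *qc = true;
        return a > 0 ? INT8_MAX : INT8_MIN;
    }
    if (shift >= 0)
        wide = a * (1 << shift);                 /* multiply avoids << on negatives */
    else if (shift <= -8)
        wide = a < 0 ? -1 : 0;                   /* only sign bits remain */
    else
        wide = a >= 0 ? (a >> -shift) : ~((~a) >> -shift);  /* floor shift */

    if (wide > INT8_MAX) { *qc = true; return INT8_MAX; }
    if (wide < INT8_MIN) { *qc = true; return INT8_MIN; }
    return (int8_t)wide;
}

/* Hypothetical model of vqshl_s8: the lane model applied to 8 lanes. */
static void model_vqshl_s8(const int8_t a[8], const int8_t b[8],
                           int8_t out[8], bool *qc)
{
    for (int e = 0; e < 8; e++)
        out[e] = model_sqshl_lane_s8(a[e], b[e], qc);
}
```

Each lane uses its own shift amount, so a single call can mix left shifts, right shifts, and saturating lanes; the flag only records that at least one lane saturated.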

int8x16_t vqshlq_s8 (int8x16_t a, int8x16_t b)Signed saturating shift left

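Description

Signed saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts each element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.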

A64 Instruction

SQSHL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vqshl_s16 (int16x4_t a, int16x4_t b)Signed saturating shift left

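Description

Signed saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts each element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.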

A64 Instruction

SQSHL Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
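
For elements wider than a byte, the pseudocode line `shift = SInt(Elem[operand2, e, esize]<7:0>)` means that only the low byte of each lane of `b` matters. The following portable C sketch of a single 16-bit lane illustrates this. `model_sqshl_lane_s16` is a hypothetical name, and the `bool` pointer is an assumed stand-in for FPSR.QC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one SQSHL lane with esize = 16. */
static int16_t model_sqshl_lane_s16(int16_t a, int16_t b, bool *qc)
{
    assert(qc != NULL);

    /* Keep only bits <7:0> of the lane and read them as a signed byte. */
    int shift = (int)((uint16_t)b & 0xffu);
    if (shift >= 128)
        shift -= 256;

    int32_t wide;       /* exact result before SatQ */
    if (shift >= 16) {  /* any non-zero input overflows 16 bits */
        if (a == 0)
            return 0;
        *qc = true;
        return a > 0 ? INT16_MAX : INT16_MIN;
    }
    if (shift >= 0)
        wide = (int32_t)a * ((int32_t)1 << shift);   /* fits in 32 bits */
    else if (shift <= -16)
        wide = a < 0 ? -1 : 0;
    else
        wide = a >= 0 ? (a >> -shift) : ~((~a) >> -shift);  /* floor shift */

    if (wide > INT16_MAX) { *qc = true; return INT16_MAX; }
    if (wide < INT16_MIN) { *qc = true; return INT16_MIN; }
    return (int16_t)wide;
}
```

In this model a lane value of `0x0100` in `b` shifts by zero and `0x00ff` shifts right by one, which is worth keeping in mind when the shift vector is computed rather than constant.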

int16x8_t vqshlq_s16 (int16x8_t a, int16x8_t b)Signed saturating shift left

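Description

Signed saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts each element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.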

A64 Instruction

SQSHL Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vqshl_s32 (int32x2_t a, int32x2_t b)Signed saturating shift left

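Description

Signed saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts each element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.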

A64 Instruction

SQSHL Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vqshlq_s32 (int32x4_t a, int32x4_t b)Signed saturating shift left

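Description

Signed saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts each element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.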

A64 Instruction

SQSHL Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vqshl_s64 (int64x1_t a, int64x1_t b)Signed saturating shift left

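Description

Signed saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts each element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.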

A64 Instruction

SQSHL Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vqshlq_s64 (int64x2_t a, int64x2_t b)Signed saturating shift left

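Description

Signed saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts each element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.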

A64 Instruction

SQSHL Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vqshl_u8 (uint8x8_t a, int8x8_t b)Unsigned saturating shift left

Description

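Unsigned saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts the element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.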

A64 Instruction

UQSHL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
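
In the unsigned form the data saturates at the unsigned maximum, while the per-lane shift amount is still a signed value. The sketch below models one 8-bit lane in portable C as an illustration of that reading of the pseudocode. `model_uqshl_lane_u8` is a hypothetical name, and the `bool` pointer is an assumed stand-in for FPSR.QC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one UQSHL lane with esize = 8. */
static uint8_t model_uqshl_lane_u8(uint8_t a, int8_t b, bool *qc)
{
    assert(qc != NULL);
    int shift = b;      /* signed shift amount, even though a is unsigned */

    if (shift <= -8)
        return 0;                       /* everything shifted out to the right */
    if (shift < 0)
        return (uint8_t)(a >> -shift);  /* logical right shift never saturates */
    if (shift >= 8) {                   /* any non-zero input overflows 8 bits */
        if (a == 0)
            return 0;
        *qc = true;
        return UINT8_MAX;
    }

    uint32_t wide = (uint32_t)a << shift;   /* exact result before SatQ */
    if (wide > UINT8_MAX) {
        *qc = true;
        return UINT8_MAX;
    }
    return (uint8_t)wide;
}
```

Only left shifts can saturate in this model, because a logical right shift of an unsigned value can never exceed the element range.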

uint8x16_t vqshlq_u8 (uint8x16_t a, int8x16_t b)Unsigned saturating shift left

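Description

Unsigned saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts the element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.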

A64 Instruction

UQSHL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vqshl_u16 (uint16x4_t a, int16x4_t b)Unsigned saturating shift left

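Description

Unsigned saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts the element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.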

A64 Instruction

UQSHL Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vqshlq_u16 (uint16x8_t a, int16x8_t b)Unsigned saturating shift left

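Description

Unsigned saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts the element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.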

A64 Instruction

UQSHL Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vqshl_u32 (uint32x2_t a, int32x2_t b)Unsigned saturating shift left

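Description

Unsigned saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts the element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.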

A64 Instruction

UQSHL Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vqshlq_u32 (uint32x4_t a, int32x4_t b)Unsigned saturating shift left

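Description

Unsigned saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts the element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.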

A64 Instruction

UQSHL Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vqshl_u64 (uint64x1_t a, int64x1_t b)Unsigned saturating shift left

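Description

Unsigned saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts the element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.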

A64 Instruction

UQSHL Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vqshlq_u64 (uint64x2_t a, int64x2_t b)Unsigned saturating shift left

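Description

Unsigned saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts the element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.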

A64 Instruction

UQSHL Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int8_t vqshlb_s8 (int8_t a, int8_t b)Signed saturating shift left

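Description

Signed saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts each element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.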

A64 Instruction

SQSHL Bd,Bn,Bm

Argument Preparation

a → Bn 

b → Bm

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

int16_t vqshlh_s16 (int16_t a, int16_t b)Signed saturating shift left

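Description

Signed saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts each element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.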

A64 Instruction

SQSHL Hd,Hn,Hm

Argument Preparation

a → Hn 

b → Hm

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

int32_t vqshls_s32 (int32_t a, int32_t b)Signed saturating shift left

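Description

Signed saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts each element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.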

A64 Instruction

SQSHL Sd,Sn,Sm

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

int64_t vqshld_s64 (int64_t a, int64_t b)Signed saturating shift left

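Description

Signed saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts each element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.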

A64 Instruction

SQSHL Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64
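
The pseudocode computes `element` as an unbounded integer before calling `SatQ`, which has no direct equivalent in C when the element is already 64 bits wide. One possible way to model the 64-bit scalar form is to test whether the input leaves enough headroom before shifting, as in the hedged sketch below. `model_vqshld_s64` is a hypothetical name, the `bool` pointer is an assumed stand-in for FPSR.QC, and the final cast assumes the usual two's complement conversion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of vqshld_s64 (SQSHL Dd,Dn,Dm). */
static int64_t model_vqshld_s64(int64_t a, int64_t b, bool *qc)
{
    assert(qc != NULL);
    int shift = (int)(b & 0xff);        /* low byte of b ... */
    if (shift >= 128)
        shift -= 256;                   /* ... read as a signed value */

    if (shift < 0) {                    /* right shifts cannot saturate */
        if (shift <= -64)
            return a < 0 ? -1 : 0;
        return a >= 0 ? (a >> -shift) : ~((~a) >> -shift);  /* floor shift */
    }
    if (a == 0 || shift == 0)
        return a;

    if (shift < 64) {
        /* a << shift fits exactly when lo <= a <= hi. */
        int64_t hi = INT64_MAX >> shift;
        int64_t lo = -hi - 1;
        if (a >= lo && a <= hi)
            return (int64_t)((uint64_t)a << shift);
    }
    *qc = true;                         /* overflow: clamp and record it */
    return a > 0 ? INT64_MAX : INT64_MIN;
}
```

The range test replaces the wide intermediate used for narrower elements: it decides up front whether the shifted value is representable, so the shift itself never overflows.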

uint8_t vqshlb_u8 (uint8_t a, int8_t b)Unsigned saturating shift left

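Description

Unsigned saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts the element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.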

A64 Instruction

UQSHL Bd,Bn,Bm

Argument Preparation

a → Bn 

b → Bm

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint16_t vqshlh_u16 (uint16_t a, int16_t b)Unsigned saturating shift left

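Description

Unsigned saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts the element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.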

A64 Instruction

UQSHL Hd,Hn,Hm

Argument Preparation

a → Hn 

b → Hm

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint32_t vqshls_u32 (uint32_t a, int32_t b)Unsigned saturating shift left

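Description

Unsigned saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts the element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.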

A64 Instruction

UQSHL Sd,Sn,Sm

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint64_t vqshld_u64 (uint64_t a, int64_t b)Unsigned saturating shift left

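Description

Unsigned saturating Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts the element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a right shift that discards the shifted-out bits without rounding. If any result saturates, the instruction sets the cumulative saturation bit FPSR.QC.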

A64 Instruction

UQSHL Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

int8x8_t vrshl_s8 (int8x8_t a, int8x8_t b)Signed rounding shift left

Description

Signed Rounding Shift Left (register). This instruction takes each signed integer value in the vector of the first source SIMD&FP register, shifts it by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a rounding right shift.

A64 Instruction

SRSHL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
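
The rounding step in the pseudocode adds `round_const = 1 << (-shift - 1)` before a right shift, which rounds halfway cases toward positive infinity. The portable C sketch below models one 8-bit lane to show the effect. `low_byte_as_s8` and `model_srshl_lane_s8` are hypothetical names, and the code is a host-side illustration rather than a use of `<arm_neon.h>`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reinterpret the low 8 bits of v as a signed byte. */
static int8_t low_byte_as_s8(uint32_t v)
{
    int x = (int)(v & 0xffu);
    return (int8_t)(x >= 128 ? x - 256 : x);
}

/* Hypothetical model of one SRSHL lane with esize = 8 (no saturation). */
static int8_t model_srshl_lane_s8(int8_t a, int8_t b)
{
    int shift = b;

    if (shift >= 0) {
        /* Left shift: round_const is 0 and the result wraps to 8 bits. */
        if (shift >= 8)
            return 0;
        return low_byte_as_s8((uint32_t)a << shift);
    }

    int n = -shift;                     /* right shift amount, 1..128 */
    assert(n >= 1 && n <= 128);
    if (n >= 8)
        return 0;                       /* a + 2^(n-1) lies in [0, 2^n) */

    int wide = a + (1 << (n - 1));      /* add round_const = 2^(n-1) */
    wide = wide >= 0 ? (wide >> n) : ~((~wide) >> n);   /* floor shift */
    return (int8_t)wide;
}
```

With this model, 5 shifted right by one gives 3 and -5 gives -2, so the tie in each case is resolved upward rather than away from zero.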

int8x16_t vrshlq_s8 (int8x16_t a, int8x16_t b)Signed rounding shift left

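Description

Signed Rounding Shift Left (register). This instruction takes each signed integer value in the vector of the first source SIMD&FP register, shifts it by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a rounding right shift.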

A64 Instruction

SRSHL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vrshl_s16 (int16x4_t a, int16x4_t b)Signed rounding shift left

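Description

Signed Rounding Shift Left (register). This instruction takes each signed integer value in the vector of the first source SIMD&FP register, shifts it by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a rounding right shift.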

A64 Instruction

SRSHL Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vrshlq_s16 (int16x8_t a, int16x8_t b)Signed rounding shift left

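Description

Signed Rounding Shift Left (register). This instruction takes each signed integer value in the vector of the first source SIMD&FP register, shifts it by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a rounding right shift.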

A64 Instruction

SRSHL Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vrshl_s32 (int32x2_t a, int32x2_t b)Signed rounding shift left

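Description

Signed Rounding Shift Left (register). This instruction takes each signed integer value in the vector of the first source SIMD&FP register, shifts it by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a rounding right shift.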

A64 Instruction

SRSHL Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vrshlq_s32 (int32x4_t a, int32x4_t b)Signed rounding shift left

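Description

Signed Rounding Shift Left (register). This instruction takes each signed integer value in the vector of the first source SIMD&FP register, shifts it by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a rounding right shift.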

A64 Instruction

SRSHL Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vrshl_s64 (int64x1_t a, int64x1_t b)Signed rounding shift left

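Description

Signed Rounding Shift Left (register). This instruction takes each signed integer value in the vector of the first source SIMD&FP register, shifts it by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a rounding right shift.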

A64 Instruction

SRSHL Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vrshlq_s64 (int64x2_t a, int64x2_t b)Signed rounding shift left

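Description

Signed Rounding Shift Left (register). This instruction takes each signed integer value in the vector of the first source SIMD&FP register, shifts it by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a rounding right shift.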

A64 Instruction

SRSHL Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vrshl_u8 (uint8x8_t a, int8x8_t b)Unsigned rounding shift left

Description

Unsigned Rounding Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts the vector element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a rounding right shift.

A64 Instruction

URSHL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
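
One consequence of the rounding constant is that an unsigned right shift by exactly the element width does not always produce zero: the added `2^(n-1)` can carry into the one bit that survives. The portable C sketch below models a single 8-bit lane to illustrate this reading of the pseudocode. `model_urshl_lane_u8` is a hypothetical name, and the code does not depend on `<arm_neon.h>`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one URSHL lane with esize = 8 (no saturation). */
static uint8_t model_urshl_lane_u8(uint8_t a, int8_t b)
{
    int shift = b;      /* signed shift amount, even though a is unsigned */

    if (shift >= 0) {
        /* Left shift: round_const is 0 and the result wraps to 8 bits. */
        if (shift >= 8)
            return 0;
        return (uint8_t)((uint32_t)a << shift);
    }

    int n = -shift;                     /* right shift amount, 1..128 */
    assert(n >= 1 && n <= 128);
    if (n > 8)
        return 0;                       /* a + 2^(n-1) is below 2^n */

    /* Add round_const = 2^(n-1), then shift; n == 8 can still yield 1. */
    return (uint8_t)(((uint32_t)a + (1u << (n - 1))) >> n);
}
```

In this model an input of 255 shifted right by 8 returns 1, whereas the non-rounding shift would return 0.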

uint8x16_t vrshlq_u8 (uint8x16_t a, int8x16_t b)Unsigned rounding shift left

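Description

Unsigned Rounding Shift Left (register). This instruction takes each element in the vector of the first source SIMD&FP register, shifts the vector element by a value from the least significant byte of the corresponding element of the second source SIMD&FP register, places the results in a vector, and writes the vector to the destination SIMD&FP register. If the shift value is positive, the operation is a left shift; otherwise it is a rounding right shift.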

A64 Instruction

URSHL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vrshl_u16 (uint16x4_t a, int16x4_t b)Unsigned rounding shift left

Description

A64 Instruction

URSHL Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vrshlq_u16 (uint16x8_t a, int16x8_t b)Unsigned rounding shift left

Description

A64 Instruction

URSHL Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vrshl_u32 (uint32x2_t a, int32x2_t b)Unsigned rounding shift left

Description

A64 Instruction

URSHL Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vrshlq_u32 (uint32x4_t a, int32x4_t b)Unsigned rounding shift left

Description

A64 Instruction

URSHL Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vrshl_u64 (uint64x1_t a, int64x1_t b)Unsigned rounding shift left

Description

A64 Instruction

URSHL Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vrshlq_u64 (uint64x2_t a, int64x2_t b)Unsigned rounding shift left

Description

A64 Instruction

URSHL Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64_t vrshld_s64 (int64_t a, int64_t b)Signed rounding shift left

Description

A64 Instruction

SRSHL Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64
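
The pseudocode above works on unbounded integers, which a 64-bit C type cannot do directly. The following is a minimal sketch of the scalar SRSHL behaviour in portable C, assuming the pseudocode's `unsigned` flag is false for SRSHL and `saturating` is false. The names `srshl_model_s64` and `to_s64` are hypothetical helpers, not intrinsics.

```c
#include <stdint.h>

/* Convert a 64-bit pattern to int64_t without relying on
 * implementation-defined narrowing (hypothetical helper). */
static int64_t to_s64(uint64_t r)
{
    if (r <= (uint64_t)INT64_MAX) {
        return (int64_t)r;
    }
    return (int64_t)(r - 0x8000000000000000ULL) + INT64_MIN;
}

/* Hypothetical scalar model of SRSHL Dd,Dn,Dm (rounding, non-saturating). */
int64_t srshl_model_s64(int64_t a, int64_t b)
{
    /* Only the least significant byte of b is used, read as signed. */
    int shift = (int)((uint64_t)b & 0xff);
    if (shift >= 128) {
        shift -= 256;
    }

    uint64_t ua = (uint64_t)a;        /* two's-complement bit pattern of a */

    if (shift >= 0) {
        /* Left shift: bits shifted past bit 63 are dropped, no saturation. */
        return to_s64(shift >= 64 ? 0 : ua << shift);
    }

    int n = -shift;                   /* right shift amount, 1..128 */
    if (n >= 64) {
        return 0;                     /* a + 2^(n-1) lies in [0, 2^n) */
    }

    /* floor(a / 2^n): logical shift, then fill the top n bits with the sign. */
    uint64_t q = ua >> n;
    if (a < 0) {
        q |= ~(UINT64_MAX >> n);
    }
    /* Adding bit n-1 of a is equivalent to adding 2^(n-1) before the shift. */
    return to_s64(q + ((ua >> (n - 1)) & 1));
}
```

Splitting the result into the floor quotient plus the rounding bit avoids the 65-bit intermediate that `a + round_const` would otherwise need.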

uint64_t vrshld_u64 (uint64_t a, int64_t b)Unsigned rounding shift left

Description

A64 Instruction

URSHL Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64
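
For the 64-bit unsigned case, the sum `element + round_const` in the pseudocode above can need 65 bits. The following is a minimal sketch in portable C of one way to model that without a wider type; the function name `urshl_model_u64` is hypothetical, and the sketch assumes `rounding` is true and `saturating` is false.

```c
#include <stdint.h>

/* Hypothetical scalar model of URSHL Dd,Dn,Dm (rounding, non-saturating). */
uint64_t urshl_model_u64(uint64_t a, int64_t b)
{
    /* Only the least significant byte of b is used, read as signed. */
    int shift = (int)((uint64_t)b & 0xff);
    if (shift >= 128) {
        shift -= 256;
    }

    if (shift >= 0) {
        /* Left shift: bits shifted past bit 63 are dropped. */
        return shift >= 64 ? 0 : a << shift;
    }

    int n = -shift;                   /* right shift amount, 1..128 */
    if (n > 64) {
        return 0;                     /* a + 2^(n-1) < 2^n */
    }
    if (n == 64) {
        return a >> 63;               /* (a + 2^63) >> 64 is 1 only if bit 63 is set */
    }
    /* (a + 2^(n-1)) >> n, written as quotient plus the rounding bit. */
    return (a >> n) + ((a >> (n - 1)) & 1);
}
```

Under these assumptions, a shift operand of 0x1ff behaves like -1 because only its low byte (0xff) is consulted.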

int8x8_t vqrshl_s8 (int8x8_t a, int8x8_t b)Signed saturating rounding shift left

Description

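Signed saturating Rounding Shift Left (register). This instruction takes each vector element in the first source SIMD&FP register, shifts it by a value from the least significant byte of the corresponding vector element of the second source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. The shift value is interpreted as a signed integer: if it is positive, the operation is a left shift; if it is negative, it is a rounding right shift. Results that overflow the element size are saturated, and when saturation occurs the cumulative saturation bit FPSR.QC is set.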

A64 Instruction

SQRSHL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
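
To make the saturating path of the pseudocode above concrete, the following is a minimal scalar sketch of one 8-bit SQRSHL lane in portable C. It assumes the pseudocode's `unsigned` flag is false for SQRSHL, and it models FPSR.QC as a caller-supplied sticky flag. The names `floor_shr` and `sqrshl_lane_s8` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* floor(x / 2^n) for 1 <= n <= 8, avoiding >> on negative values. */
static int floor_shr(int x, int n)
{
    int d = 1 << n;
    int q = x / d;                    /* truncates toward zero */
    if (x % d < 0) {
        q--;                          /* adjust to round toward minus infinity */
    }
    return q;
}

/* Hypothetical scalar model of one SQRSHL lane with esize = 8.
 * *qc stands in for FPSR.QC: it is set on saturation and never cleared. */
int8_t sqrshl_lane_s8(int8_t a, int8_t b, bool *qc)
{
    int shift = b;
    int v;

    if (shift >= 0) {
        if (shift > 8) {
            shift = 8;                /* larger counts saturate identically */
        }
        v = a * (1 << shift);         /* multiply instead of shifting a negative */
    } else {
        int n = -shift;               /* right shift amount, 1..128 */
        v = (n > 8) ? 0 : floor_shr(a + (1 << (n - 1)), n);
    }

    if (v > INT8_MAX) { *qc = true; return INT8_MAX; }
    if (v < INT8_MIN) { *qc = true; return INT8_MIN; }
    return (int8_t)v;
}
```

For instance, 64 shifted left by 1 would be 128, which does not fit in a signed byte, so the sketch returns 127 and sets the flag; -64 shifted left by 1 is exactly -128 and does not saturate.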

int8x16_t vqrshlq_s8 (int8x16_t a, int8x16_t b)Signed saturating rounding shift left

Description

A64 Instruction

SQRSHL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vqrshl_s16 (int16x4_t a, int16x4_t b)Signed saturating rounding shift left

Description

A64 Instruction

SQRSHL Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vqrshlq_s16 (int16x8_t a, int16x8_t b)Signed saturating rounding shift left

Description

A64 Instruction

SQRSHL Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vqrshl_s32 (int32x2_t a, int32x2_t b)Signed saturating rounding shift left

Description

A64 Instruction

SQRSHL Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vqrshlq_s32 (int32x4_t a, int32x4_t b)Signed saturating rounding shift left

Description

A64 Instruction

SQRSHL Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vqrshl_s64 (int64x1_t a, int64x1_t b)Signed saturating rounding shift left

Description

A64 Instruction

SQRSHL Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vqrshlq_s64 (int64x2_t a, int64x2_t b)Signed saturating rounding shift left

Description

A64 Instruction

SQRSHL Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vqrshl_u8 (uint8x8_t a, int8x8_t b)Unsigned saturating rounding shift left

Description

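Unsigned saturating Rounding Shift Left (register). This instruction takes each vector element of the first source SIMD&FP register, shifts the vector element by a value from the least significant byte of the corresponding vector element of the second source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register. The shift value is interpreted as a signed integer: if it is positive, the operation is a left shift; if it is negative, it is a rounding right shift. Results that overflow the element size are saturated, and when saturation occurs the cumulative saturation bit FPSR.QC is set.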

A64 Instruction

UQRSHL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
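
The per-element loop in the pseudocode above can be sketched over a whole 8-lane vector. The following is a minimal model in portable C using plain arrays in place of the `uint8x8_t` and `int8x8_t` types. The name `uqrshl_model_u8x8` and the `LANES` macro are hypothetical, and the returned flag stands in for FPSR.QC.

```c
#include <stdbool.h>
#include <stdint.h>

#define LANES 8

/* Hypothetical array model of UQRSHL Vd.8B,Vn.8B,Vm.8B.
 * Returns true if any lane saturated (the role FPSR.QC plays). */
bool uqrshl_model_u8x8(const uint8_t a[LANES], const int8_t b[LANES],
                       uint8_t out[LANES])
{
    bool qc = false;

    for (int e = 0; e < LANES; e++) {
        int shift = b[e];             /* per-lane signed shift count */
        unsigned v;

        if (shift >= 0) {
            if (shift > 8) {
                shift = 8;            /* larger counts saturate identically */
            }
            v = (unsigned)a[e] << shift;          /* at most 255 << 8 */
        } else {
            int n = -shift;           /* right shift amount, 1..128 */
            v = (n > 8) ? 0 : ((unsigned)a[e] + (1u << (n - 1))) >> n;
        }

        if (v > UINT8_MAX) {          /* SatQ: clamp and record saturation */
            v = UINT8_MAX;
            qc = true;
        }
        out[e] = (uint8_t)v;
    }
    return qc;
}
```

Each lane uses its own shift count, so a single call can mix left shifts, rounding right shifts, and saturating lanes.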

uint8x16_t vqrshlq_u8 (uint8x16_t a, int8x16_t b)Unsigned saturating rounding shift left

Description

A64 Instruction

UQRSHL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vqrshl_u16 (uint16x4_t a, int16x4_t b)Unsigned saturating rounding shift left

Description

A64 Instruction

UQRSHL Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vqrshlq_u16 (uint16x8_t a, int16x8_t b)Unsigned saturating rounding shift left

Description

A64 Instruction

UQRSHL Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vqrshl_u32 (uint32x2_t a, int32x2_t b)Unsigned saturating rounding shift left

Description

A64 Instruction

UQRSHL Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vqrshlq_u32 (uint32x4_t a, int32x4_t b)Unsigned saturating rounding shift left

Description

A64 Instruction

UQRSHL Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vqrshl_u64 (uint64x1_t a, int64x1_t b)Unsigned saturating rounding shift left

Description

A64 Instruction

UQRSHL Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vqrshlq_u64 (uint64x2_t a, int64x2_t b)Unsigned saturating rounding shift left

Description

A64 Instruction

UQRSHL Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int8_t vqrshlb_s8 (int8_t a, int8_t b)Signed saturating rounding shift left

Description

A64 Instruction

SQRSHL Bd,Bn,Bm

Argument Preparation

a → Bn 

b → Bm

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

int16_t vqrshlh_s16 (int16_t a, int16_t b)Signed saturating rounding shift left

Description

A64 Instruction

SQRSHL Hd,Hn,Hm

Argument Preparation

a → Hn 

b → Hm

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

int32_t vqrshls_s32 (int32_t a, int32_t b)Signed saturating rounding shift left

Description

A64 Instruction

SQRSHL Sd,Sn,Sm

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

int64_t vqrshld_s64 (int64_t a, int64_t b)Signed saturating rounding shift left

Description

A64 Instruction

SQRSHL Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint8_t vqrshlb_u8 (uint8_t a, int8_t b)Unsigned saturating rounding shift left

Description

A64 Instruction

UQRSHL Bd,Bn,Bm

Argument Preparation

a → Bn 

b → Bm

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint16_t vqrshlh_u16 (uint16_t a, int16_t b)Unsigned saturating rounding shift left

Description

A64 Instruction

UQRSHL Hd,Hn,Hm

Argument Preparation

a → Hn 

b → Hm

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint32_t vqrshls_u32 (uint32_t a, int32_t b)Unsigned saturating rounding shift left

Description

A64 Instruction

UQRSHL Sd,Sn,Sm

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint64_t vqrshld_u64 (uint64_t a, int64_t b)Unsigned saturating rounding shift left

Description

A64 Instruction

UQRSHL Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

int8x8_t vshr_n_s8 (int8x8_t a, const int n)Signed shift right

Description

Signed Shift Right (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each element by an immediate value, places the final result into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are truncated: the bits shifted out are discarded without rounding. For rounded results, see SRSHR.

A64 Instruction

SSHR Vd.8B,Vn.8B,#n

Argument Preparation

a → Vn.8B 

1 <= n <= 8

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
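
As a small illustration of the truncating behaviour described above, the following is a minimal scalar sketch of one 8-bit SSHR lane in portable C. It assumes `round` and `accumulate` are false and that the element is read as signed. The function name `sshr_lane_s8` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one SSHR lane with esize = 8.
 * The immediate n must satisfy 1 <= n <= 8. */
int8_t sshr_lane_s8(int8_t a, int n)
{
    assert(n >= 1 && n <= 8);

    /* An arithmetic shift right is floor(a / 2^n). Division is used here
     * because >> on a negative value is implementation-defined in C. */
    int d = 1 << n;
    int q = a / d;                    /* truncates toward zero */
    if (a % d < 0) {
        q--;                          /* adjust to round toward minus infinity */
    }
    return (int8_t)q;
}
```

Note that discarding the shifted-out bits rounds toward minus infinity, so under this model -7 shifted right by 1 gives -4 rather than -3.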

int8x16_t vshrq_n_s8 (int8x16_t a, const int n)Signed shift right

Description

A64 Instruction

SSHR Vd.16B,Vn.16B,#n

Argument Preparation

a → Vn.16B 

1 <= n <= 8

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vshr_n_s16 (int16x4_t a, const int n)Signed shift right

Description

A64 Instruction

SSHR Vd.4H,Vn.4H,#n

Argument Preparation

a → Vn.4H 

1 <= n <= 16

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vshrq_n_s16 (int16x8_t a, const int n)Signed shift right

Description

A64 Instruction

SSHR Vd.8H,Vn.8H,#n

Argument Preparation

a → Vn.8H 

1 <= n <= 16

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vshr_n_s32 (int32x2_t a, const int n)Signed shift right

Description

A64 Instruction

SSHR Vd.2S,Vn.2S,#n

Argument Preparation

a → Vn.2S 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vshrq_n_s32 (int32x4_t a, const int n)Signed shift right

Description

A64 Instruction

SSHR Vd.4S,Vn.4S,#n

Argument Preparation

a → Vn.4S 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vshr_n_s64 (int64x1_t a, const int n)Signed shift right

Description

A64 Instruction

SSHR Dd,Dn,#n

Argument Preparation

a → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vshrq_n_s64 (int64x2_t a, const int n)Signed shift right

Description

A64 Instruction

SSHR Vd.2D,Vn.2D,#n

Argument Preparation

a → Vn.2D 

1 <= n <= 64

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vshr_n_u8 (uint8x8_t a, const int n)Unsigned shift right

Description

Unsigned Shift Right (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each element by an immediate value, places the results into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are truncated. For rounded results, see URSHR.

A64 Instruction

USHR Vd.8B,Vn.8B,#n

Argument Preparation

a → Vn.8B 

1 <= n <= 8

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
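
For comparison with the signed form, a scalar sketch of one unsigned lane follows. It is an assumption-laden model in portable C, not the intrinsic itself: `ushr8_model` is a hypothetical name and covers only a single 8-bit lane of `vshr_n_u8`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of arm_neon.h): models one unsigned
   8-bit lane of USHR. Zeros are shifted in from the top and the low
   n bits are discarded. */
static inline uint8_t ushr8_model(uint8_t x, int n)
{
    assert(n >= 1 && n <= 8);   /* immediate range for 8-bit lanes */
    unsigned v = x;             /* widen so that n == 8 is well defined */
    return (uint8_t)(v >> n);
}
```

The bit pattern 0x80 shifted right by 7 gives 1 in this unsigned model, whereas the same pattern read as the signed value -128 would give -1 under an arithmetic shift.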

uint8x16_t vshrq_n_u8 (uint8x16_t a, const int n)Unsigned shift right

Description

A64 Instruction

USHR Vd.16B,Vn.16B,#n

Argument Preparation

a → Vn.16B 

1 <= n <= 8

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vshr_n_u16 (uint16x4_t a, const int n)Unsigned shift right

Description

A64 Instruction

USHR Vd.4H,Vn.4H,#n

Argument Preparation

a → Vn.4H 

1 <= n <= 16

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vshrq_n_u16 (uint16x8_t a, const int n)Unsigned shift right

Description

A64 Instruction

USHR Vd.8H,Vn.8H,#n

Argument Preparation

a → Vn.8H 

1 <= n <= 16

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vshr_n_u32 (uint32x2_t a, const int n)Unsigned shift right

Description

A64 Instruction

USHR Vd.2S,Vn.2S,#n

Argument Preparation

a → Vn.2S 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vshrq_n_u32 (uint32x4_t a, const int n)Unsigned shift right

Description

A64 Instruction

USHR Vd.4S,Vn.4S,#n

Argument Preparation

a → Vn.4S 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vshr_n_u64 (uint64x1_t a, const int n)Unsigned shift right

Description

A64 Instruction

USHR Dd,Dn,#n

Argument Preparation

a → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vshrq_n_u64 (uint64x2_t a, const int n)Unsigned shift right

Description

A64 Instruction

USHR Vd.2D,Vn.2D,#n

Argument Preparation

a → Vn.2D 

1 <= n <= 64

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64_t vshrd_n_s64 (int64_t a, const int n)Signed shift right

Description

A64 Instruction

SSHR Dd,Dn,#n

Argument Preparation

a → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint64_t vshrd_n_u64 (uint64_t a, const int n)Unsigned shift right

Description

A64 Instruction

USHR Dd,Dn,#n

Argument Preparation

a → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

A64

int8x8_t vshl_n_s8 (int8x8_t a, const int n)Shift left

Description

Shift Left (immediate). This instruction reads each element from the source vector, left shifts each element by an immediate value, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

SHL Vd.8B,Vn.8B,#n

Argument Preparation

a → Vn.8B 

0 <= n <= 7

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = LSL(Elem[operand, e, esize], shift);

V[d] = result;

Supported architectures

v7/A32/A64
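
The `LSL` step in the pseudocode above can be sketched for a single 8-bit lane in portable C. `shl8_model` is a hypothetical helper introduced only for illustration. It works on the raw bit pattern, so the same model applies to signed and unsigned lanes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of arm_neon.h): models one 8-bit lane
   of SHL. Bits shifted past the top of the lane are discarded and
   zeros are shifted in at the bottom. */
static inline uint8_t shl8_model(uint8_t x, int n)
{
    assert(n >= 0 && n <= 7);            /* immediate range for 8-bit lanes */
    return (uint8_t)((unsigned)x << n);  /* keep only the low 8 bits */
}
```

For example, `shl8_model(0x81, 1)` gives 0x02: the top bit is lost rather than saturated or carried into a neighbouring lane.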

int8x16_t vshlq_n_s8 (int8x16_t a, const int n)Shift left

Description

A64 Instruction

SHL Vd.16B,Vn.16B,#n

Argument Preparation

a → Vn.16B 

0 <= n <= 7

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = LSL(Elem[operand, e, esize], shift);

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vshl_n_s16 (int16x4_t a, const int n)Shift left

Description

A64 Instruction

SHL Vd.4H,Vn.4H,#n

Argument Preparation

a → Vn.4H 

0 <= n <= 15

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = LSL(Elem[operand, e, esize], shift);

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vshlq_n_s16 (int16x8_t a, const int n)Shift left

Description

A64 Instruction

SHL Vd.8H,Vn.8H,#n

Argument Preparation

a → Vn.8H 

0 <= n <= 15

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = LSL(Elem[operand, e, esize], shift);

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vshl_n_s32 (int32x2_t a, const int n)Shift left

Description

A64 Instruction

SHL Vd.2S,Vn.2S,#n

Argument Preparation

a → Vn.2S 

0 <= n <= 31

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = LSL(Elem[operand, e, esize], shift);

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vshlq_n_s32 (int32x4_t a, const int n)Shift left

Description

A64 Instruction

SHL Vd.4S,Vn.4S,#n

Argument Preparation

a → Vn.4S 

0 <= n <= 31

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = LSL(Elem[operand, e, esize], shift);

V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vshl_n_s64 (int64x1_t a, const int n)Shift left

Description

A64 Instruction

SHL Dd,Dn,#n

Argument Preparation

a → Dn 

0 <= n <= 63

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = LSL(Elem[operand, e, esize], shift);

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vshlq_n_s64 (int64x2_t a, const int n)Shift left

Description

A64 Instruction

SHL Vd.2D,Vn.2D,#n

Argument Preparation

a → Vn.2D 

0 <= n <= 63

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = LSL(Elem[operand, e, esize], shift);

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vshl_n_u8 (uint8x8_t a, const int n)Shift left

Description

A64 Instruction

SHL Vd.8B,Vn.8B,#n

Argument Preparation

a → Vn.8B 

0 <= n <= 7

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = LSL(Elem[operand, e, esize], shift);

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vshlq_n_u8 (uint8x16_t a, const int n)Shift left

Description

A64 Instruction

SHL Vd.16B,Vn.16B,#n

Argument Preparation

a → Vn.16B 

0 <= n <= 7

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = LSL(Elem[operand, e, esize], shift);

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vshl_n_u16 (uint16x4_t a, const int n)Shift left

Description

A64 Instruction

SHL Vd.4H,Vn.4H,#n

Argument Preparation

a → Vn.4H 

0 <= n <= 15

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = LSL(Elem[operand, e, esize], shift);

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vshlq_n_u16 (uint16x8_t a, const int n)Shift left

Description

A64 Instruction

SHL Vd.8H,Vn.8H,#n

Argument Preparation

a → Vn.8H 

0 <= n <= 15

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = LSL(Elem[operand, e, esize], shift);

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vshl_n_u32 (uint32x2_t a, const int n)Shift left

Description

A64 Instruction

SHL Vd.2S,Vn.2S,#n

Argument Preparation

a → Vn.2S 

0 <= n <= 31

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = LSL(Elem[operand, e, esize], shift);

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vshlq_n_u32 (uint32x4_t a, const int n)Shift left

Description

A64 Instruction

SHL Vd.4S,Vn.4S,#n

Argument Preparation

a → Vn.4S 

0 <= n <= 31

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = LSL(Elem[operand, e, esize], shift);

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vshl_n_u64 (uint64x1_t a, const int n)Shift left

Description

A64 Instruction

SHL Dd,Dn,#n

Argument Preparation

a → Dn 

0 <= n <= 63

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = LSL(Elem[operand, e, esize], shift);

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vshlq_n_u64 (uint64x2_t a, const int n)Shift left

Description

A64 Instruction

SHL Vd.2D,Vn.2D,#n

Argument Preparation

a → Vn.2D 

0 <= n <= 63

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = LSL(Elem[operand, e, esize], shift);

V[d] = result;

Supported architectures

v7/A32/A64

int64_t vshld_n_s64 (int64_t a, const int n)Shift left

Description

A64 Instruction

SHL Dd,Dn,#n

Argument Preparation

a → Dn 

0 <= n <= 63

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = LSL(Elem[operand, e, esize], shift);

V[d] = result;

Supported architectures

A64

uint64_t vshld_n_u64 (uint64_t a, const int n)Shift left

Description

A64 Instruction

SHL Dd,Dn,#n

Argument Preparation

a → Dn 

0 <= n <= 63

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = LSL(Elem[operand, e, esize], shift);

V[d] = result;

Supported architectures

A64

int8x8_t vrshr_n_s8 (int8x8_t a, const int n)Signed rounding shift right

Description

Signed Rounding Shift Right (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each element by an immediate value, places the results into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are rounded. For truncated results, see SSHR.

A64 Instruction

SRSHR Vd.8B,Vn.8B,#n

Argument Preparation

a → Vn.8B 

1 <= n <= 8

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
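
The effect of `round_const` in the pseudocode above can be sketched with a scalar model of one signed 8-bit lane. This is an illustration in portable C under stated assumptions, not the intrinsic: `srshr8_model` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of arm_neon.h): models one signed 8-bit
   lane of SRSHR. The rounding constant 2^(n-1) is added before the
   shift, so exact halves round towards positive infinity. */
static inline int8_t srshr8_model(int8_t x, int n)
{
    assert(n >= 1 && n <= 8);          /* immediate range for 8-bit lanes */
    int v = x + (1 << (n - 1));        /* wider arithmetic, cannot overflow */
    /* Right-shifting a negative int is implementation-defined in C,
       so negative values are shifted in complemented form. */
    return (int8_t)(v >= 0 ? (v >> n) : ~((~v) >> n));
}
```

With n = 1, the input -7 (that is, -3.5 after halving) gives -3 in this model, whereas a truncating shift would give -4.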

int8x16_t vrshrq_n_s8 (int8x16_t a, const int n)Signed rounding shift right

Description

A64 Instruction

SRSHR Vd.16B,Vn.16B,#n

Argument Preparation

a → Vn.16B 

1 <= n <= 8

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vrshr_n_s16 (int16x4_t a, const int n)Signed rounding shift right

Description

A64 Instruction

SRSHR Vd.4H,Vn.4H,#n

Argument Preparation

a → Vn.4H 

1 <= n <= 16

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vrshrq_n_s16 (int16x8_t a, const int n)Signed rounding shift right

Description

A64 Instruction

SRSHR Vd.8H,Vn.8H,#n

Argument Preparation

a → Vn.8H 

1 <= n <= 16

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vrshr_n_s32 (int32x2_t a, const int n)Signed rounding shift right

Description

A64 Instruction

SRSHR Vd.2S,Vn.2S,#n

Argument Preparation

a → Vn.2S 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vrshrq_n_s32 (int32x4_t a, const int n)Signed rounding shift right

Description

A64 Instruction

SRSHR Vd.4S,Vn.4S,#n

Argument Preparation

a → Vn.4S 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vrshr_n_s64 (int64x1_t a, const int n)Signed rounding shift right

Description

A64 Instruction

SRSHR Dd,Dn,#n

Argument Preparation

a → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vrshrq_n_s64 (int64x2_t a, const int n)Signed rounding shift right

Description

A64 Instruction

SRSHR Vd.2D,Vn.2D,#n

Argument Preparation

a → Vn.2D 

1 <= n <= 64

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vrshr_n_u8 (uint8x8_t a, const int n)Unsigned rounding shift right

Description

Unsigned Rounding Shift Right (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each element by an immediate value, places the results into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are rounded. For truncated results, see USHR.

A64 Instruction

URSHR Vd.8B,Vn.8B,#n

Argument Preparation

a → Vn.8B 

1 <= n <= 8

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
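
A scalar sketch of one unsigned 8-bit lane shows how the rounding constant changes the result. It is written in portable C, and `urshr8_model` is a hypothetical helper rather than anything provided by `arm_neon.h`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of arm_neon.h): models one unsigned
   8-bit lane of URSHR, computing (x + 2^(n-1)) >> n in wider
   arithmetic so the intermediate sum cannot overflow. */
static inline uint8_t urshr8_model(uint8_t x, int n)
{
    assert(n >= 1 && n <= 8);   /* immediate range for 8-bit lanes */
    unsigned v = (unsigned)x + (1u << (n - 1));
    return (uint8_t)(v >> n);
}
```

In this model `urshr8_model(255, 1)` gives 128 and `urshr8_model(255, 8)` gives 1, whereas the truncating form would give 127 and 0.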

uint8x16_t vrshrq_n_u8 (uint8x16_t a, const int n)Unsigned rounding shift right

Description

A64 Instruction

URSHR Vd.16B,Vn.16B,#n

Argument Preparation

a → Vn.16B 

1 <= n <= 8

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vrshr_n_u16 (uint16x4_t a, const int n)Unsigned rounding shift right

Description

A64 Instruction

URSHR Vd.4H,Vn.4H,#n

Argument Preparation

a → Vn.4H 

1 <= n <= 16

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vrshrq_n_u16 (uint16x8_t a, const int n)Unsigned rounding shift right

Description

A64 Instruction

URSHR Vd.8H,Vn.8H,#n

Argument Preparation

a → Vn.8H 

1 <= n <= 16

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vrshr_n_u32 (uint32x2_t a, const int n)Unsigned rounding shift right

Description

A64 Instruction

URSHR Vd.2S,Vn.2S,#n

Argument Preparation

a → Vn.2S 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vrshrq_n_u32 (uint32x4_t a, const int n)Unsigned rounding shift right

Description

A64 Instruction

URSHR Vd.4S,Vn.4S,#n

Argument Preparation

a → Vn.4S 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vrshr_n_u64 (uint64x1_t a, const int n)Unsigned rounding shift right

Description

A64 Instruction

URSHR Dd,Dn,#n

Argument Preparation

a → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vrshrq_n_u64 (uint64x2_t a, const int n)Unsigned rounding shift right

Description

A64 Instruction

URSHR Vd.2D,Vn.2D,#n

Argument Preparation

a → Vn.2D 

1 <= n <= 64

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64_t vrshrd_n_s64 (int64_t a, const int n)Signed rounding shift right

Description

A64 Instruction

SRSHR Dd,Dn,#n

Argument Preparation

a → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint64_t vrshrd_n_u64 (uint64_t a, const int n)Unsigned rounding shift right

Description

A64 Instruction

URSHR Dd,Dn,#n

Argument Preparation

a → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

A64
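
The pseudocode above works on unbounded integers, so adding `round_const` never overflows. A scalar model of the 64-bit case in C has to avoid both that overflow and a shift by 64, which is undefined in C. The sketch below is one way to do so. `urshr64_model` is a hypothetical helper, and the rewrite of the rounding step as "truncate, then add the highest discarded bit" is an equivalent formulation chosen for this example, not text from the reference.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of arm_neon.h): models URSHR on a
   64-bit scalar. (x + 2^(n-1)) >> n equals (x >> n) plus bit n-1 of x,
   which avoids needing more than 64 bits for the intermediate sum. */
static inline uint64_t urshr64_model(uint64_t x, int n)
{
    assert(n >= 1 && n <= 64);                       /* immediate range */
    uint64_t round_bit = (x >> (n - 1)) & 1u;        /* highest discarded bit */
    uint64_t truncated = (n == 64) ? 0 : (x >> n);   /* x >> 64 is undefined in C */
    return truncated + round_bit;
}
```

Under this model `urshr64_model(UINT64_MAX, 64)` gives 1, while `urshr64_model(UINT64_MAX, 1)` gives 0x8000000000000000.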

int8x8_t vsra_n_s8 (int8x8_t a, int8x8_t b, const int n)Signed shift right and accumulate

Description

Signed Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each element by an immediate value, and accumulates the results with the vector elements of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are truncated. For rounded results, see SRSRA.

A64 Instruction

SSRA Vd.8B,Vn.8B,#n

Argument Preparation

a → Vd.8B 

b → Vn.8B 

1 <= n <= 8

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
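
The vector form applies the same operation to every lane independently, with a as the accumulator (Vd) and b as the shifted source (Vn). The following sketch models a single 8-bit lane in portable C. The helper names are hypothetical, and the code assumes that `>>` on a negative value is an arithmetic shift, which ISO C leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reduce an int modulo 2^8 and reinterpret the low
   byte as a signed value. */
int8_t ssra_wrap_s8(int v)
{
    int low = v & 0xFF;
    return (int8_t)(low < 128 ? low : low - 256);
}

/* Hypothetical model of one SSRA lane: acc + (src >> n), truncating shift,
   wrap-around accumulate. */
int8_t model_ssra_lane_s8(int8_t acc, int8_t src, int n)
{
    assert(n >= 1 && n <= 8);    /* documented immediate range */
    int shifted = src >> n;      /* src is promoted to int before the shift */
    return ssra_wrap_s8(acc + shifted);
}
```

The accumulate does not saturate: under these assumptions `model_ssra_lane_s8(127, 64, 4)` wraps to -125.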

int8x16_t vsraq_n_s8 (int8x16_t a, int8x16_t b, const int n)Signed shift right and accumulate

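Description

Signed Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are truncated. For rounded results, see SRSRA.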

A64 Instruction

SSRA Vd.16B,Vn.16B,#n

Argument Preparation

a → Vd.16B 

b → Vn.16B 

1 <= n <= 8

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vsra_n_s16 (int16x4_t a, int16x4_t b, const int n)Signed shift right and accumulate

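Description

Signed Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are truncated. For rounded results, see SRSRA.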

A64 Instruction

SSRA Vd.4H,Vn.4H,#n

Argument Preparation

a → Vd.4H 

b → Vn.4H 

1 <= n <= 16

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vsraq_n_s16 (int16x8_t a, int16x8_t b, const int n)Signed shift right and accumulate

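Description

Signed Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are truncated. For rounded results, see SRSRA.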

A64 Instruction

SSRA Vd.8H,Vn.8H,#n

Argument Preparation

a → Vd.8H 

b → Vn.8H 

1 <= n <= 16

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vsra_n_s32 (int32x2_t a, int32x2_t b, const int n)Signed shift right and accumulate

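Description

Signed Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are truncated. For rounded results, see SRSRA.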

A64 Instruction

SSRA Vd.2S,Vn.2S,#n

Argument Preparation

a → Vd.2S 

b → Vn.2S 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vsraq_n_s32 (int32x4_t a, int32x4_t b, const int n)Signed shift right and accumulate

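Description

Signed Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are truncated. For rounded results, see SRSRA.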

A64 Instruction

SSRA Vd.4S,Vn.4S,#n

Argument Preparation

a → Vd.4S 

b → Vn.4S 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vsra_n_s64 (int64x1_t a, int64x1_t b, const int n)Signed shift right and accumulate

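Description

Signed Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are truncated. For rounded results, see SRSRA.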

A64 Instruction

SSRA Dd,Dn,#n

Argument Preparation

a → Dd 

b → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vsraq_n_s64 (int64x2_t a, int64x2_t b, const int n)Signed shift right and accumulate

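Description

Signed Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are truncated. For rounded results, see SRSRA.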

A64 Instruction

SSRA Vd.2D,Vn.2D,#n

Argument Preparation

a → Vd.2D 

b → Vn.2D 

1 <= n <= 64

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vsra_n_u8 (uint8x8_t a, uint8x8_t b, const int n)Unsigned shift right and accumulate

Description

Unsigned Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are truncated. For rounded results, see URSRA.

A64 Instruction

USRA Vd.8B,Vn.8B,#n

Argument Preparation

a → Vd.8B 

b → Vn.8B 

1 <= n <= 8

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
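
As with the signed form, each lane is processed independently, with a as the accumulator (Vd) and b as the shifted source (Vn). The sketch below models one 8-bit lane in portable C. `model_usra_lane_u8` is a hypothetical name used for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one USRA lane: acc + (src >> n), truncating shift,
   accumulate modulo 2^8. */
uint8_t model_usra_lane_u8(uint8_t acc, uint8_t src, int n)
{
    assert(n >= 1 && n <= 8);    /* documented immediate range */
    /* src is promoted to int, so a shift by 8 is well defined and gives 0. */
    int shifted = src >> n;
    return (uint8_t)(acc + shifted);
}
```

A shift by the full element width contributes nothing to the accumulator, and the accumulate wraps rather than saturates: `model_usra_lane_u8(250, 255, 4)` gives 9.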

uint8x16_t vsraq_n_u8 (uint8x16_t a, uint8x16_t b, const int n)Unsigned shift right and accumulate

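Description

Unsigned Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are truncated. For rounded results, see URSRA.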

A64 Instruction

USRA Vd.16B,Vn.16B,#n

Argument Preparation

a → Vd.16B 

b → Vn.16B 

1 <= n <= 8

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vsra_n_u16 (uint16x4_t a, uint16x4_t b, const int n)Unsigned shift right and accumulate

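Description

Unsigned Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are truncated. For rounded results, see URSRA.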

A64 Instruction

USRA Vd.4H,Vn.4H,#n

Argument Preparation

a → Vd.4H 

b → Vn.4H 

1 <= n <= 16

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vsraq_n_u16 (uint16x8_t a, uint16x8_t b, const int n)Unsigned shift right and accumulate

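Description

Unsigned Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are truncated. For rounded results, see URSRA.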

A64 Instruction

USRA Vd.8H,Vn.8H,#n

Argument Preparation

a → Vd.8H 

b → Vn.8H 

1 <= n <= 16

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vsra_n_u32 (uint32x2_t a, uint32x2_t b, const int n)Unsigned shift right and accumulate

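Description

Unsigned Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are truncated. For rounded results, see URSRA.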

A64 Instruction

USRA Vd.2S,Vn.2S,#n

Argument Preparation

a → Vd.2S 

b → Vn.2S 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vsraq_n_u32 (uint32x4_t a, uint32x4_t b, const int n)Unsigned shift right and accumulate

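Description

Unsigned Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are truncated. For rounded results, see URSRA.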

A64 Instruction

USRA Vd.4S,Vn.4S,#n

Argument Preparation

a → Vd.4S 

b → Vn.4S 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vsra_n_u64 (uint64x1_t a, uint64x1_t b, const int n)Unsigned shift right and accumulate

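Description

Unsigned Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are truncated. For rounded results, see URSRA.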

A64 Instruction

USRA Dd,Dn,#n

Argument Preparation

a → Dd 

b → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vsraq_n_u64 (uint64x2_t a, uint64x2_t b, const int n)Unsigned shift right and accumulate

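Description

Unsigned Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are truncated. For rounded results, see URSRA.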

A64 Instruction

USRA Vd.2D,Vn.2D,#n

Argument Preparation

a → Vd.2D 

b → Vn.2D 

1 <= n <= 64

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64_t vsrad_n_s64 (int64_t a, int64_t b, const int n)Signed shift right and accumulate

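Description

Signed Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are truncated. For rounded results, see SRSRA.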

A64 Instruction

SSRA Dd,Dn,#n

Argument Preparation

a → Dd 

b → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

A64
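
The 64-bit scalar form accepts n = 64, a shift count that plain C does not allow on a 64-bit operand. The following portable sketch shows one way to model it. `model_ssra_s64` is a hypothetical name, and the code assumes that `>>` on a negative value is an arithmetic shift.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of SSRA Dd,Dn,#n (not part of arm_neon.h).
   Assumes >> on a negative int64_t is an arithmetic shift. */
int64_t model_ssra_s64(int64_t acc, int64_t src, int n)
{
    assert(n >= 1 && n <= 64);   /* documented immediate range */

    /* Shifting a 64-bit value by 64 is undefined in C, so n == 64 is
       handled separately: only the sign of src survives. */
    int64_t shifted = (n == 64) ? (src < 0 ? -1 : 0) : (src >> n);

    /* Accumulate modulo 2^64 in unsigned arithmetic, then copy the bit
       pattern back into a signed value. */
    uint64_t sum = (uint64_t)acc + (uint64_t)shifted;
    int64_t result;
    memcpy(&result, &sum, sizeof result);
    return result;
}
```

With n = 64 the shifted term is either 0 or -1, so `model_ssra_s64(10, -1, 64)` gives 9.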

uint64_t vsrad_n_u64 (uint64_t a, uint64_t b, const int n)Unsigned shift right and accumulate

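Description

Unsigned Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are truncated. For rounded results, see URSRA.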

A64 Instruction

USRA Dd,Dn,#n

Argument Preparation

a → Dd 

b → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

A64

int8x8_t vrsra_n_s8 (int8x8_t a, int8x8_t b, const int n)Signed rounding shift right and accumulate

Description

Signed Rounding Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are rounded. For truncated results, see SSRA.

A64 Instruction

SRSRA Vd.8B,Vn.8B,#n

Argument Preparation

a → Vd.8B 

b → Vn.8B 

1 <= n <= 8

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
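
The only difference from the truncating form is the rounding constant 1 << (n - 1) added before the shift. The sketch below models one 8-bit lane in portable C, with a as the accumulator (Vd) and b as the shifted source (Vn). The helper names are hypothetical, and the code assumes that `>>` on a negative value is an arithmetic shift.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reduce an int modulo 2^8 and reinterpret the low
   byte as a signed value. */
int8_t srsra_wrap_s8(int v)
{
    int low = v & 0xFF;
    return (int8_t)(low < 128 ? low : low - 256);
}

/* Hypothetical model of one SRSRA lane:
   acc + ((src + 2^(n-1)) >> n), wrap-around accumulate. */
int8_t model_srsra_lane_s8(int8_t acc, int8_t src, int n)
{
    assert(n >= 1 && n <= 8);    /* documented immediate range */
    /* The add is done in int, so the rounding constant cannot overflow. */
    int rounded = (src + (1 << (n - 1))) >> n;
    return srsra_wrap_s8(acc + rounded);
}
```

Under these assumptions `model_srsra_lane_s8(0, -5, 1)` gives -2, and with n = 8 every 8-bit source rounds to 0, leaving the accumulator unchanged.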

int8x16_t vrsraq_n_s8 (int8x16_t a, int8x16_t b, const int n)Signed rounding shift right and accumulate

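Description

Signed Rounding Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are rounded. For truncated results, see SSRA.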

A64 Instruction

SRSRA Vd.16B,Vn.16B,#n

Argument Preparation

a → Vd.16B 

b → Vn.16B 

1 <= n <= 8

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vrsra_n_s16 (int16x4_t a, int16x4_t b, const int n)Signed rounding shift right and accumulate

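Description

Signed Rounding Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are rounded. For truncated results, see SSRA.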

A64 Instruction

SRSRA Vd.4H,Vn.4H,#n

Argument Preparation

a → Vd.4H 

b → Vn.4H 

1 <= n <= 16

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vrsraq_n_s16 (int16x8_t a, int16x8_t b, const int n)Signed rounding shift right and accumulate

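Description

Signed Rounding Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are rounded. For truncated results, see SSRA.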

A64 Instruction

SRSRA Vd.8H,Vn.8H,#n

Argument Preparation

a → Vd.8H 

b → Vn.8H 

1 <= n <= 16

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vrsra_n_s32 (int32x2_t a, int32x2_t b, const int n)Signed rounding shift right and accumulate

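Description

Signed Rounding Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are rounded. For truncated results, see SSRA.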

A64 Instruction

SRSRA Vd.2S,Vn.2S,#n

Argument Preparation

a → Vd.2S 

b → Vn.2S 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vrsraq_n_s32 (int32x4_t a, int32x4_t b, const int n)Signed rounding shift right and accumulate

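Description

Signed Rounding Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are rounded. For truncated results, see SSRA.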

A64 Instruction

SRSRA Vd.4S,Vn.4S,#n

Argument Preparation

a → Vd.4S 

b → Vn.4S 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vrsra_n_s64 (int64x1_t a, int64x1_t b, const int n)Signed rounding shift right and accumulate

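Description

Signed Rounding Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are rounded. For truncated results, see SSRA.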

A64 Instruction

SRSRA Dd,Dn,#n

Argument Preparation

a → Dd 

b → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vrsraq_n_s64 (int64x2_t a, int64x2_t b, const int n)Signed rounding shift right and accumulate

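Description

Signed Rounding Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are rounded. For truncated results, see SSRA.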

A64 Instruction

SRSRA Vd.2D,Vn.2D,#n

Argument Preparation

a → Vd.2D 

b → Vn.2D 

1 <= n <= 64

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vrsra_n_u8 (uint8x8_t a, uint8x8_t b, const int n)Unsigned rounding shift right and accumulate

Description

Unsigned Rounding Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are rounded. For truncated results, see USRA.

A64 Instruction

URSRA Vd.8B,Vn.8B,#n

Argument Preparation

a → Vd.8B 

b → Vn.8B 

1 <= n <= 8

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
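
For unsigned lanes the rounding add can carry past the top bit, so a shift by the full element width can still contribute 1. The sketch below models one 8-bit lane in portable C, with a as the accumulator (Vd) and b as the shifted source (Vn). `model_ursra_lane_u8` is a hypothetical name used for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one URSRA lane:
   acc + ((src + 2^(n-1)) >> n), accumulate modulo 2^8. */
uint8_t model_ursra_lane_u8(uint8_t acc, uint8_t src, int n)
{
    assert(n >= 1 && n <= 8);    /* documented immediate range */
    /* The add is done in unsigned int, so the carry out of bit 7 is kept. */
    unsigned rounded = (src + (1u << (n - 1))) >> n;
    return (uint8_t)(acc + rounded);
}
```

For example, `model_ursra_lane_u8(0, 255, 8)` gives 1, while the same inputs with 127 as the source give 0.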

uint8x16_t vrsraq_n_u8 (uint8x16_t a, uint8x16_t b, const int n)Unsigned rounding shift right and accumulate

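Description

Unsigned Rounding Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are rounded. For truncated results, see USRA.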

A64 Instruction

URSRA Vd.16B,Vn.16B,#n

Argument Preparation

a → Vd.16B 

b → Vn.16B 

1 <= n <= 8

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vrsra_n_u16 (uint16x4_t a, uint16x4_t b, const int n)Unsigned rounding shift right and accumulate

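Description

Unsigned Rounding Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are rounded. For truncated results, see USRA.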

A64 Instruction

URSRA Vd.4H,Vn.4H,#n

Argument Preparation

a → Vd.4H 

b → Vn.4H 

1 <= n <= 16

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vrsraq_n_u16 (uint16x8_t a, uint16x8_t b, const int n)Unsigned rounding shift right and accumulate

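Description

Unsigned Rounding Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are rounded. For truncated results, see USRA.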

A64 Instruction

URSRA Vd.8H,Vn.8H,#n

Argument Preparation

a → Vd.8H 

b → Vn.8H 

1 <= n <= 16

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vrsra_n_u32 (uint32x2_t a, uint32x2_t b, const int n)Unsigned rounding shift right and accumulate

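Description

Unsigned Rounding Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are rounded. For truncated results, see USRA.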

A64 Instruction

URSRA Vd.2S,Vn.2S,#n

Argument Preparation

a → Vd.2S 

b → Vn.2S 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vrsraq_n_u32 (uint32x4_t a, uint32x4_t b, const int n)Unsigned rounding shift right and accumulate

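Description

Unsigned Rounding Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are rounded. For truncated results, see USRA.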

A64 Instruction

URSRA Vd.4S,Vn.4S,#n

Argument Preparation

a → Vd.4S 

b → Vn.4S 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vrsra_n_u64 (uint64x1_t a, uint64x1_t b, const int n)Unsigned rounding shift right and accumulate

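Description

Unsigned Rounding Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are rounded. For truncated results, see USRA.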

A64 Instruction

URSRA Dd,Dn,#n

Argument Preparation

a → Dd 

b → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vrsraq_n_u64 (uint64x2_t a, uint64x2_t b, const int n)Unsigned rounding shift right and accumulate

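Description

Unsigned Rounding Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are rounded. For truncated results, see USRA.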

A64 Instruction

URSRA Vd.2D,Vn.2D,#n

Argument Preparation

a → Vd.2D 

b → Vn.2D 

1 <= n <= 64

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64_t vrsrad_n_s64 (int64_t a, int64_t b, const int n)Signed rounding shift right and accumulate

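Description

Signed Rounding Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are signed integer values. The results are rounded. For truncated results, see SSRA.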

A64 Instruction

SRSRA Dd,Dn,#n

Argument Preparation

a → Dd 

b → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint64_t vrsrad_n_u64 (uint64_t a, uint64_t b, const int n)Unsigned rounding shift right and accumulate

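Description

Unsigned Rounding Shift Right and Accumulate (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, and accumulates the final results with the vector elements of the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are rounded. For truncated results, see USRA.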

A64 Instruction

URSRA Dd,Dn,#n

Argument Preparation

a → Dd 

b → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2;
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

operand2 = if accumulate then V[d] else Zeros();
for e = 0 to elements-1
    element = (Int(Elem[operand, e, esize], unsigned) + round_const) >> shift;
    Elem[result, e, esize] = Elem[operand2, e, esize] + element<esize-1:0>;

V[d] = result;

Supported architectures

A64

int8x8_t vqshl_n_s8 (int8x8_t a, const int n)Signed saturating shift left

Description

A64 Instruction

SQSHL Vd.8B,Vn.8B,#n

Argument Preparation

a → Vn.8B 

0 <= n <= 7

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
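
In the Operation pseudocode, SatQ clamps the shifted value to the element range and FPSR.QC is set whenever clamping occurs. The following portable C sketch models one signed 8-bit lane with an immediate left shift. All three function names are hypothetical, and the QC flag is represented here by a separate predicate rather than by processor state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: the exact, unsaturated value of a * 2^n. */
int sqshl_exact_s8(int8_t a, int n)
{
    assert(n >= 0 && n <= 7);    /* documented immediate range */
    return a * (1 << n);         /* multiply avoids left-shifting a negative value */
}

/* Hypothetical model of one SQSHL lane: clamp the exact value to int8_t. */
int8_t model_sqshl_lane_s8(int8_t a, int n)
{
    int exact = sqshl_exact_s8(a, n);
    if (exact > INT8_MAX) return INT8_MAX;
    if (exact < INT8_MIN) return INT8_MIN;
    return (int8_t)exact;
}

/* True when the lane saturates; the instruction sets FPSR.QC in that case. */
bool model_sqshl_saturates_s8(int8_t a, int n)
{
    int exact = sqshl_exact_s8(a, n);
    return exact > INT8_MAX || exact < INT8_MIN;
}
```

For example, `model_sqshl_lane_s8(4, 5)` clamps 128 to 127 and reports saturation, while `model_sqshl_lane_s8(-1, 7)` gives exactly -128 without saturating.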

int8x16_t vqshlq_n_s8 (int8x16_t a, const int n)Signed saturating shift left

Description

A64 Instruction

SQSHL Vd.16B,Vn.16B,#n

Argument Preparation

a → Vn.16B 

0 <= n <= 7

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vqshl_n_s16 (int16x4_t a, const int n)Signed saturating shift left

Description

A64 Instruction

SQSHL Vd.4H,Vn.4H,#n

Argument Preparation

a → Vn.4H 

0 <= n <= 15

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vqshlq_n_s16 (int16x8_t a, const int n)Signed saturating shift left

Description

A64 Instruction

SQSHL Vd.8H,Vn.8H,#n

Argument Preparation

a → Vn.8H 

0 <= n <= 15

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vqshl_n_s32 (int32x2_t a, const int n)Signed saturating shift left

Description

A64 Instruction

SQSHL Vd.2S,Vn.2S,#n

Argument Preparation

a → Vn.2S 

0 <= n <= 31

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vqshlq_n_s32 (int32x4_t a, const int n)Signed saturating shift left

Description

A64 Instruction

SQSHL Vd.4S,Vn.4S,#n

Argument Preparation

a → Vn.4S 

0 <= n <= 31

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vqshl_n_s64 (int64x1_t a, const int n)Signed saturating shift left

Description

A64 Instruction

SQSHL Dd,Dn,#n

Argument Preparation

a → Dn 

0 <= n <= 63

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
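
For 64-bit lanes there is no wider standard C integer type to shift into, so a scalar model has to detect overflow before shifting. The following is a minimal sketch under that assumption. `sqshl_n_s64_model` and `qc` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one 64-bit SQSHL #n lane (not an ACLE function).
 * *qc is set to 1 when the lane saturates, mirroring FPSR.QC. */
static int64_t sqshl_n_s64_model(int64_t a, int n, int *qc)
{
    assert(n >= 0 && n <= 63);  /* immediate range for 64-bit lanes */

    /* Largest and smallest inputs whose shifted value still fits. */
    int64_t hi = INT64_MAX >> n;  /* 2^(63-n) - 1 */
    int64_t lo = -hi - 1;         /* -2^(63-n)    */

    if (a > hi) { *qc = 1; return INT64_MAX; }
    if (a < lo) { *qc = 1; return INT64_MIN; }

    /* The final value is in range, so every intermediate doubling is too. */
    while (n-- > 0)
        a *= 2;
    return a;
}
```

With n = 63 only the inputs 0 and -1 avoid saturation in this model: -1 maps to INT64_MIN exactly, while 1 saturates to INT64_MAX.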

int64x2_t vqshlq_n_s64 (int64x2_t a, const int n)Signed saturating shift left

Description

A64 Instruction

SQSHL Vd.2D,Vn.2D,#n

Argument Preparation

a → Vn.2D 

0 <= n <= 63

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vqshl_n_u8 (uint8x8_t a, const int n)Unsigned saturating shift left

Description

A64 Instruction

UQSHL Vd.8B,Vn.8B,#n

Argument Preparation

a → Vn.8B 

0 <= n <= 7

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
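
The unsigned variant only saturates upwards. The following is a minimal C sketch of one 8-bit lane. `uqshl_n_u8_model` and `qc` are hypothetical names, not part of the intrinsics API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one UQSHL #n lane (not an ACLE function).
 * *qc is set to 1 when the lane saturates, mirroring FPSR.QC. */
static uint8_t uqshl_n_u8_model(uint8_t a, int n, int *qc)
{
    assert(n >= 0 && n <= 7);  /* immediate range for 8-bit lanes */

    /* At most 255 << 7, which fits comfortably in unsigned int. */
    unsigned wide = (unsigned)a << n;

    if (wide > UINT8_MAX) { *qc = 1; return UINT8_MAX; }
    return (uint8_t)wide;
}
```

In this model 0x10 shifted by 3 gives 0x80 without saturation, whereas 0x81 shifted by 1 saturates to 0xFF.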

uint8x16_t vqshlq_n_u8 (uint8x16_t a, const int n)Unsigned saturating shift left

Description

A64 Instruction

UQSHL Vd.16B,Vn.16B,#n

Argument Preparation

a → Vn.16B 

0 <= n <= 7

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vqshl_n_u16 (uint16x4_t a, const int n)Unsigned saturating shift left

Description

A64 Instruction

UQSHL Vd.4H,Vn.4H,#n

Argument Preparation

a → Vn.4H 

0 <= n <= 15

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vqshlq_n_u16 (uint16x8_t a, const int n)Unsigned saturating shift left

Description

A64 Instruction

UQSHL Vd.8H,Vn.8H,#n

Argument Preparation

a → Vn.8H 

0 <= n <= 15

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vqshl_n_u32 (uint32x2_t a, const int n)Unsigned saturating shift left

Description

A64 Instruction

UQSHL Vd.2S,Vn.2S,#n

Argument Preparation

a → Vn.2S 

0 <= n <= 31

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vqshlq_n_u32 (uint32x4_t a, const int n)Unsigned saturating shift left

Description

A64 Instruction

UQSHL Vd.4S,Vn.4S,#n

Argument Preparation

a → Vn.4S 

0 <= n <= 31

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vqshl_n_u64 (uint64x1_t a, const int n)Unsigned saturating shift left

Description

A64 Instruction

UQSHL Dd,Dn,#n

Argument Preparation

a → Dn 

0 <= n <= 63

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vqshlq_n_u64 (uint64x2_t a, const int n)Unsigned saturating shift left

Description

A64 Instruction

UQSHL Vd.2D,Vn.2D,#n

Argument Preparation

a → Vn.2D 

0 <= n <= 63

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int8_t vqshlb_n_s8 (int8_t a, const int n)Signed saturating shift left

Description

A64 Instruction

SQSHL Bd,Bn,#n

Argument Preparation

a → Bn 

0 <= n <= 7

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

int16_t vqshlh_n_s16 (int16_t a, const int n)Signed saturating shift left

Description

A64 Instruction

SQSHL Hd,Hn,#n

Argument Preparation

a → Hn 

0 <= n <= 15

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

int32_t vqshls_n_s32 (int32_t a, const int n)Signed saturating shift left

Description

A64 Instruction

SQSHL Sd,Sn,#n

Argument Preparation

a → Sn 

0 <= n <= 31

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

int64_t vqshld_n_s64 (int64_t a, const int n)Signed saturating shift left

Description

A64 Instruction

SQSHL Dd,Dn,#n

Argument Preparation

a → Dn 

0 <= n <= 63

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint8_t vqshlb_n_u8 (uint8_t a, const int n)Unsigned saturating shift left

Description

A64 Instruction

UQSHL Bd,Bn,#n

Argument Preparation

a → Bn 

0 <= n <= 7

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint16_t vqshlh_n_u16 (uint16_t a, const int n)Unsigned saturating shift left

Description

A64 Instruction

UQSHL Hd,Hn,#n

Argument Preparation

a → Hn 

0 <= n <= 15

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint32_t vqshls_n_u32 (uint32_t a, const int n)Unsigned saturating shift left

Description

A64 Instruction

UQSHL Sd,Sn,#n

Argument Preparation

a → Sn 

0 <= n <= 31

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint64_t vqshld_n_u64 (uint64_t a, const int n)Unsigned saturating shift left

Description

A64 Instruction

UQSHL Dd,Dn,#n

Argument Preparation

a → Dn 

0 <= n <= 63

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer round_const = 0;
integer shift;
integer element;
boolean sat;

for e = 0 to elements-1
    shift = SInt(Elem[operand2, e, esize]<7:0>);
    if rounding then
        round_const = 1 << (-shift - 1);    // 0 for left shift, 2^(n-1) for right shift 
    element = (Int(Elem[operand1, e, esize], unsigned) + round_const) << shift;
    if saturating then
        (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
        if sat then FPSR.QC = '1';
    else
        Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint8x8_t vqshlu_n_s8 (int8x8_t a, const int n)Signed saturating shift left unsigned

Description

Signed saturating Shift Left Unsigned (immediate). This instruction reads each signed integer value in the vector of the source SIMD&FP register, shifts each value by an immediate value, saturates the shifted result to an unsigned integer value, places the result in a vector, and writes the vector to the destination SIMD&FP register. The results are truncated. For rounded results, see UQRSHL.

A64 Instruction

SQSHLU Vd.8B,Vn.8B,#n

Argument Preparation

a → Vn.8B 

0 <= n <= 7

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], src_unsigned) << shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, dst_unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
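
The signed-input, unsigned-output behaviour described above can be sketched as a scalar C model of one 8-bit lane. This is an illustration only. `sqshlu_n_s8_model` and `qc` are hypothetical names, with `qc` standing in for FPSR.QC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one SQSHLU #n lane (not an ACLE function).
 * The input is read as signed and the result is saturated to 0..255. */
static uint8_t sqshlu_n_s8_model(int8_t a, int n, int *qc)
{
    assert(n >= 0 && n <= 7);  /* immediate range for 8-bit lanes */

    /* Multiply instead of shifting so negative inputs stay well defined. */
    int wide = (int)a * (1 << n);

    if (wide < 0)         { *qc = 1; return 0; }
    if (wide > UINT8_MAX) { *qc = 1; return UINT8_MAX; }
    return (uint8_t)wide;
}
```

Any negative input saturates to 0 in this model, even with n = 0, because the destination is unsigned.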

uint8x16_t vqshluq_n_s8 (int8x16_t a, const int n)Signed saturating shift left unsigned

Description

A64 Instruction

SQSHLU Vd.16B,Vn.16B,#n

Argument Preparation

a → Vn.16B 

0 <= n <= 7

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], src_unsigned) << shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, dst_unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vqshlu_n_s16 (int16x4_t a, const int n)Signed saturating shift left unsigned

Description

A64 Instruction

SQSHLU Vd.4H,Vn.4H,#n

Argument Preparation

a → Vn.4H 

0 <= n <= 15

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], src_unsigned) << shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, dst_unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vqshluq_n_s16 (int16x8_t a, const int n)Signed saturating shift left unsigned

Description

A64 Instruction

SQSHLU Vd.8H,Vn.8H,#n

Argument Preparation

a → Vn.8H 

0 <= n <= 15

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], src_unsigned) << shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, dst_unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vqshlu_n_s32 (int32x2_t a, const int n)Signed saturating shift left unsigned

Description

A64 Instruction

SQSHLU Vd.2S,Vn.2S,#n

Argument Preparation

a → Vn.2S 

0 <= n <= 31

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], src_unsigned) << shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, dst_unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vqshluq_n_s32 (int32x4_t a, const int n)Signed saturating shift left unsigned

Description

A64 Instruction

SQSHLU Vd.4S,Vn.4S,#n

Argument Preparation

a → Vn.4S 

0 <= n <= 31

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], src_unsigned) << shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, dst_unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vqshlu_n_s64 (int64x1_t a, const int n)Signed saturating shift left unsigned

Description

A64 Instruction

SQSHLU Dd,Dn,#n

Argument Preparation

a → Dn 

0 <= n <= 63

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], src_unsigned) << shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, dst_unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vqshluq_n_s64 (int64x2_t a, const int n)Signed saturating shift left unsigned

Description

A64 Instruction

SQSHLU Vd.2D,Vn.2D,#n

Argument Preparation

a → Vn.2D 

0 <= n <= 63

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], src_unsigned) << shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, dst_unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

uint8_t vqshlub_n_s8 (int8_t a, const int n)Signed saturating shift left unsigned

Description

A64 Instruction

SQSHLU Bd,Bn,#n

Argument Preparation

a → Bn 

0 <= n <= 7

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], src_unsigned) << shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, dst_unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

uint16_t vqshluh_n_s16 (int16_t a, const int n)Signed saturating shift left unsigned

Description

A64 Instruction

SQSHLU Hd,Hn,#n

Argument Preparation

a → Hn 

0 <= n <= 15

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], src_unsigned) << shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, dst_unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

uint32_t vqshlus_n_s32 (int32_t a, const int n)Signed saturating shift left unsigned

Description

A64 Instruction

SQSHLU Sd,Sn,#n

Argument Preparation

a → Sn 

0 <= n <= 31

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], src_unsigned) << shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, dst_unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

uint64_t vqshlud_n_s64 (int64_t a, const int n)Signed saturating shift left unsigned

Description

A64 Instruction

SQSHLU Dd,Dn,#n

Argument Preparation

a → Dn 

0 <= n <= 63

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], src_unsigned) << shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, dst_unsigned);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int8x8_t vshrn_n_s16 (int16x8_t a, const int n)Shift right narrow

Description

Shift Right Narrow (immediate). This instruction reads each unsigned integer value from the source SIMD&FP register, right shifts each value by an immediate value, puts the final result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The destination vector elements are half as long as the source vector elements. The results are truncated. For rounded results, see RSHRN.

A64 Instruction

SHRN Vd.8B,Vn.8H,#n

Argument Preparation

a → Vn.8H 

1 <= n <= 8

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64
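
The truncating narrow above can be sketched as a scalar C loop over the eight 16-bit lanes. This is a minimal model, not the intrinsic itself, and `shrn_n_16_model` is a hypothetical name. Because n is at most 8, the kept bits [n+7:n] all lie inside the 16-bit source lane, so the same model covers the signed and unsigned intrinsics at the bit-pattern level.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of SHRN Vd.8B,Vn.8H,#n (not an ACLE function).
 * Each 16-bit lane is shifted right by n and the low 8 bits are kept. */
static void shrn_n_16_model(uint8_t out[8], const uint16_t a[8], int n)
{
    assert(n >= 1 && n <= 8);  /* immediate range for 16-bit source lanes */

    for (int e = 0; e < 8; e++)
        out[e] = (uint8_t)(a[e] >> n);  /* truncate, no rounding or saturation */
}
```

With n = 8 the model simply extracts the high byte of each lane, for example 0x1234 becomes 0x12.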

int16x4_t vshrn_n_s32 (int32x4_t a, const int n)Shift right narrow

Description

A64 Instruction

SHRN Vd.4H,Vn.4S,#n

Argument Preparation

a → Vn.4S 

1 <= n <= 16

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

int32x2_t vshrn_n_s64 (int64x2_t a, const int n)Shift right narrow

Description

A64 Instruction

SHRN Vd.2S,Vn.2D,#n

Argument Preparation

a → Vn.2D 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint8x8_t vshrn_n_u16 (uint16x8_t a, const int n)Shift right narrow

Description

A64 Instruction

SHRN Vd.8B,Vn.8H,#n

Argument Preparation

a → Vn.8H 

1 <= n <= 8

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint16x4_t vshrn_n_u32 (uint32x4_t a, const int n)Shift right narrow

Description

A64 Instruction

SHRN Vd.4H,Vn.4S,#n

Argument Preparation

a → Vn.4S 

1 <= n <= 16

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint32x2_t vshrn_n_u64 (uint64x2_t a, const int n)Shift right narrow

Description

A64 Instruction

SHRN Vd.2S,Vn.2D,#n

Argument Preparation

a → Vn.2D 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

int8x16_t vshrn_high_n_s16 (int8x8_t r, int16x8_t a, const int n)Shift right narrow

Description

A64 Instruction

SHRN2 Vd.16B,Vn.8H,#n

Argument Preparation

r → Vd.8B 

a → Vn.8H 

1 <= n <= 8

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

A64
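
The `_high` form keeps `r` in the lower half of the result and writes the narrowed lanes of `a` to the upper half. The following is a minimal scalar sketch in C of that layout. `shrn2_n_16_model` is a hypothetical name, and plain arrays stand in for the vector registers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of SHRN2 Vd.16B,Vn.8H,#n (not an ACLE function).
 * out[0..7]  = r, unchanged (the lower half of Vd)
 * out[8..15] = low 8 bits of each a[e] >> n (the upper half of Vd) */
static void shrn2_n_16_model(uint8_t out[16], const uint8_t r[8],
                             const uint16_t a[8], int n)
{
    assert(n >= 1 && n <= 8);  /* immediate range for 16-bit source lanes */

    memcpy(out, r, 8);
    for (int e = 0; e < 8; e++)
        out[8 + e] = (uint8_t)(a[e] >> n);
}
```

Two narrowing shifts can therefore fill one 128-bit result: the first (non-`_high`) call produces `r`, and the second call supplies the upper eight lanes.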

int16x8_t vshrn_high_n_s32 (int16x4_t r, int32x4_t a, const int n)Shift right narrow

Description

A64 Instruction

SHRN2 Vd.8H,Vn.4S,#n

Argument Preparation

r → Vd.4H 

a → Vn.4S 

1 <= n <= 16

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

A64

int32x4_t vshrn_high_n_s64 (int32x2_t r, int64x2_t a, const int n)Shift right narrow

Description

A64 Instruction

SHRN2 Vd.4S,Vn.2D,#n

Argument Preparation

r → Vd.2S 

a → Vn.2D 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

A64

uint8x16_t vshrn_high_n_u16 (uint8x8_t r, uint16x8_t a, const int n)Shift right narrow

Description

A64 Instruction

SHRN2 Vd.16B,Vn.8H,#n

Argument Preparation

r → Vd.8B 

a → Vn.8H 

1 <= n <= 8

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

A64

uint16x8_t vshrn_high_n_u32 (uint16x4_t r, uint32x4_t a, const int n)Shift right narrow

Description

A64 Instruction

SHRN2 Vd.8H,Vn.4S,#n

Argument Preparation

r → Vd.4H 

a → Vn.4S 

1 <= n <= 16

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

A64

uint32x4_t vshrn_high_n_u64 (uint32x2_t r, uint64x2_t a, const int n)Shift right narrow

Description

A64 Instruction

SHRN2 Vd.4S,Vn.2D,#n

Argument Preparation

r → Vd.2S 

a → Vn.2D 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

A64

uint8x8_t vqshrun_n_s16 (int16x8_t a, const int n)Signed saturating shift right unsigned narrow

Description

Signed saturating Shift Right Unsigned Narrow (immediate). This instruction reads each signed integer value in the vector of the source SIMD&FP register, right shifts each value by an immediate value, saturates the result to an unsigned integer value that is half the original width, places the final result into a vector, and writes the vector to the destination SIMD&FP register. The results are truncated. For rounded results, see SQRSHRUN.

A64 Instruction

SQSHRUN Vd.8B,Vn.8H,#n

Argument Preparation

a → Vn.8H 

1 <= n <= 8

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (SInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    (Elem[result, e, esize], sat) = UnsignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64
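
The Operation pseudocode above is shared by the truncating and rounding forms and differs only in the round flag. The following is a minimal sketch of one 16-bit to 8-bit lane in portable C. The names floor_shift_right and shrun_lane_model are hypothetical, and the sat output parameter stands in for the FPSR.QC update.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Floor (arithmetic) right shift written so that it never shifts a negative
   value, which would be implementation-defined in C. */
int32_t floor_shift_right(int32_t x, int n)
{
    return x < 0 ? ~(~x >> n) : x >> n;
}

/* Hypothetical model of one 16-bit to 8-bit lane of SQSHRUN (round == false)
   or SQRSHRUN (round == true). *sat reports whether the lane saturated, which
   is the condition under which the instruction sets FPSR.QC. */
uint8_t shrun_lane_model(int16_t a, int n, bool round, bool *sat)
{
    assert(n >= 1 && n <= 8);
    int32_t round_const = round ? (1 << (n - 1)) : 0;
    int32_t element = floor_shift_right((int32_t)a + round_const, n);
    *sat = (element < 0) || (element > UINT8_MAX);
    if (element < 0) return 0;
    if (element > UINT8_MAX) return UINT8_MAX;
    return (uint8_t)element;
}
```

For example, with a = -1 and n = 1 the truncating form yields -1 before saturation and so saturates to 0, whereas the rounding form yields exactly 0 and does not saturate.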

uint16x4_t vqshrun_n_s32 (int32x4_t a, const int n)Signed saturating shift right unsigned narrow

Description

A64 Instruction

SQSHRUN Vd.4H,Vn.4S,#n

Argument Preparation

a → Vn.4S 

1 <= n <= 16

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (SInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    (Elem[result, e, esize], sat) = UnsignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint32x2_t vqshrun_n_s64 (int64x2_t a, const int n)Signed saturating shift right unsigned narrow

Description

A64 Instruction

SQSHRUN Vd.2S,Vn.2D,#n

Argument Preparation

a → Vn.2D 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (SInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    (Elem[result, e, esize], sat) = UnsignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint8_t vqshrunh_n_s16 (int16_t a, const int n)Signed saturating shift right unsigned narrow

Description

A64 Instruction

SQSHRUN Bd,Hn,#n

Argument Preparation

a → Hn 

1 <= n <= 8

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (SInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    (Elem[result, e, esize], sat) = UnsignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint16_t vqshruns_n_s32 (int32_t a, const int n)Signed saturating shift right unsigned narrow

Description

A64 Instruction

SQSHRUN Hd,Sn,#n

Argument Preparation

a → Sn 

1 <= n <= 16

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (SInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    (Elem[result, e, esize], sat) = UnsignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint32_t vqshrund_n_s64 (int64_t a, const int n)Signed saturating shift right unsigned narrow

Description

A64 Instruction

SQSHRUN Sd,Dn,#n

Argument Preparation

a → Dn 

1 <= n <= 32

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (SInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    (Elem[result, e, esize], sat) = UnsignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint8x16_t vqshrun_high_n_s16 (uint8x8_t r, int16x8_t a, const int n)Signed saturating shift right unsigned narrow

Description

A64 Instruction

SQSHRUN2 Vd.16B,Vn.8H,#n

Argument Preparation

r → Vd.8B 

a → Vn.8H 

1 <= n <= 8

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (SInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    (Elem[result, e, esize], sat) = UnsignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64
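
The _high form is typically paired with the plain form to narrow two int16x8_t vectors into one uint8x16_t: the plain intrinsic produces the lower half, which is then passed as r. A minimal scalar sketch of the second step follows, assuming an array-based lane layout. The name sqshrun_high_model is hypothetical, and the model does not track FPSR.QC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of vqshrun_high_n_s16: lanes 0..7 are copied from r and
   lanes 8..15 hold the truncated, unsigned-saturated narrowing of a. */
void sqshrun_high_model(const uint8_t r[8], const int16_t a[8], int n,
                        uint8_t out[16])
{
    assert(n >= 1 && n <= 8);
    for (int e = 0; e < 8; e++) {
        int32_t v = a[e];
        v = v < 0 ? ~(~v >> n) : v >> n;    /* floor shift, no rounding */
        if (v < 0) v = 0;                   /* clamp to the uint8_t range */
        if (v > UINT8_MAX) v = UINT8_MAX;
        out[e] = r[e];
        out[8 + e] = (uint8_t)v;
    }
}
```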

uint16x8_t vqshrun_high_n_s32 (uint16x4_t r, int32x4_t a, const int n)Signed saturating shift right unsigned narrow

Description

A64 Instruction

SQSHRUN2 Vd.8H,Vn.4S,#n

Argument Preparation

r → Vd.4H 

a → Vn.4S 

1 <= n <= 16

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (SInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    (Elem[result, e, esize], sat) = UnsignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint32x4_t vqshrun_high_n_s64 (uint32x2_t r, int64x2_t a, const int n)Signed saturating shift right unsigned narrow

Description

A64 Instruction

SQSHRUN2 Vd.4S,Vn.2D,#n

Argument Preparation

r → Vd.2S 

a → Vn.2D 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (SInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    (Elem[result, e, esize], sat) = UnsignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint8x8_t vqrshrun_n_s16 (int16x8_t a, const int n)Signed saturating rounded shift right unsigned narrow

Description

Signed saturating Rounded Shift Right Unsigned Narrow (immediate). This instruction reads each signed integer value in the vector of the source SIMD&FP register, right shifts each value by an immediate value, saturates the result to an unsigned integer value that is half the original width, places the final result into a vector, and writes the vector to the destination SIMD&FP register. The results are rounded. For truncated results, see SQSHRUN.

A64 Instruction

SQRSHRUN Vd.8B,Vn.8H,#n

Argument Preparation

a → Vn.8H 

1 <= n <= 8

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (SInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    (Elem[result, e, esize], sat) = UnsignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64
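
As a usage sketch, the function below converts eight signed fixed-point samples to 8-bit unsigned values with round-to-nearest and clamping. The function name narrow_q4_to_u8 and the choice of 4 fractional bits are assumptions made for illustration. On targets that define __ARM_NEON it calls the intrinsic; elsewhere it falls back to a scalar loop that follows the Operation pseudocode, so the example can be compiled on any host.

```c
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Converts eight signed fixed-point values with 4 fractional bits to 8-bit
   unsigned values, rounding to nearest and clamping to 0..255. */
void narrow_q4_to_u8(const int16_t in[8], uint8_t out[8])
{
#if defined(__ARM_NEON)
    /* n must be a compile-time constant in the range 1..8. */
    vst1_u8(out, vqrshrun_n_s16(vld1q_s16(in), 4));
#else
    /* Portable fallback that follows the pseudocode lane by lane. */
    for (int e = 0; e < 8; e++) {
        int32_t v = (int32_t)in[e] + 8;     /* round_const = 1 << (4 - 1) */
        v = v < 0 ? ~(~v >> 4) : v >> 4;    /* floor shift right by 4 */
        out[e] = (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
    }
#endif
}
```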

uint16x4_t vqrshrun_n_s32 (int32x4_t a, const int n)Signed saturating rounded shift right unsigned narrow

Description

A64 Instruction

SQRSHRUN Vd.4H,Vn.4S,#n

Argument Preparation

a → Vn.4S 

1 <= n <= 16

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (SInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    (Elem[result, e, esize], sat) = UnsignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint32x2_t vqrshrun_n_s64 (int64x2_t a, const int n)Signed saturating rounded shift right unsigned narrow

Description

A64 Instruction

SQRSHRUN Vd.2S,Vn.2D,#n

Argument Preparation

a → Vn.2D 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (SInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    (Elem[result, e, esize], sat) = UnsignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint8_t vqrshrunh_n_s16 (int16_t a, const int n)Signed saturating rounded shift right unsigned narrow

Description

A64 Instruction

SQRSHRUN Bd,Hn,#n

Argument Preparation

a → Hn 

1 <= n <= 8

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (SInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    (Elem[result, e, esize], sat) = UnsignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint16_t vqrshruns_n_s32 (int32_t a, const int n)Signed saturating rounded shift right unsigned narrow

Description

A64 Instruction

SQRSHRUN Hd,Sn,#n

Argument Preparation

a → Sn 

1 <= n <= 16

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (SInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    (Elem[result, e, esize], sat) = UnsignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint32_t vqrshrund_n_s64 (int64_t a, const int n)Signed saturating rounded shift right unsigned narrow

Description

A64 Instruction

SQRSHRUN Sd,Dn,#n

Argument Preparation

a → Dn 

1 <= n <= 32

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (SInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    (Elem[result, e, esize], sat) = UnsignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint8x16_t vqrshrun_high_n_s16 (uint8x8_t r, int16x8_t a, const int n)Signed saturating rounded shift right unsigned narrow

Description

A64 Instruction

SQRSHRUN2 Vd.16B,Vn.8H,#n

Argument Preparation

r → Vd.8B 

a → Vn.8H 

1 <= n <= 8

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (SInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    (Elem[result, e, esize], sat) = UnsignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint16x8_t vqrshrun_high_n_s32 (uint16x4_t r, int32x4_t a, const int n)Signed saturating rounded shift right unsigned narrow

Description

A64 Instruction

SQRSHRUN2 Vd.8H,Vn.4S,#n

Argument Preparation

r → Vd.4H 

a → Vn.4S 

1 <= n <= 16

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (SInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    (Elem[result, e, esize], sat) = UnsignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint32x4_t vqrshrun_high_n_s64 (uint32x2_t r, int64x2_t a, const int n)Signed saturating rounded shift right unsigned narrow

Description

A64 Instruction

SQRSHRUN2 Vd.4S,Vn.2D,#n

Argument Preparation

r → Vd.2S 

a → Vn.2D 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (SInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    (Elem[result, e, esize], sat) = UnsignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

int8x8_t vqshrn_n_s16 (int16x8_t a, const int n)Signed saturating shift right narrow

Description

Signed saturating Shift Right Narrow (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts and truncates each result by an immediate value, saturates each shifted result to a value that is half the original width, puts the final result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. All the values in this instruction are signed integer values. The destination vector elements are half as long as the source vector elements. For rounded results, see SQRSHRN.

A64 Instruction

SQSHRN Vd.8B,Vn.8H,#n

Argument Preparation

a → Vn.8H 

1 <= n <= 8

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64
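
For the signed form, the pseudocode's SatQ call clamps to the signed range of the narrower element. The following is a minimal per-lane sketch in portable C, assuming a 16-bit source and an 8-bit destination. The name sqshrn_lane_s16 is hypothetical, and the sat output parameter stands in for FPSR.QC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one lane of SQSHRN Vd.8B,Vn.8H,#n. */
int8_t sqshrn_lane_s16(int16_t a, int n, bool *sat)
{
    assert(n >= 1 && n <= 8);
    int32_t v = a;
    v = v < 0 ? ~(~v >> n) : v >> n;        /* floor shift, no rounding */
    *sat = (v < INT8_MIN) || (v > INT8_MAX);
    if (v < INT8_MIN) return INT8_MIN;
    if (v > INT8_MAX) return INT8_MAX;
    return (int8_t)v;
}
```

With the maximum shift n = 8, every 16-bit input lands in -128..127, so this model never saturates at that shift amount; saturation only arises for smaller n.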

int16x4_t vqshrn_n_s32 (int32x4_t a, const int n)Signed saturating shift right narrow

Description

A64 Instruction

SQSHRN Vd.4H,Vn.4S,#n

Argument Preparation

a → Vn.4S 

1 <= n <= 16

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

int32x2_t vqshrn_n_s64 (int64x2_t a, const int n)Signed saturating shift right narrow

Description

A64 Instruction

SQSHRN Vd.2S,Vn.2D,#n

Argument Preparation

a → Vn.2D 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint8x8_t vqshrn_n_u16 (uint16x8_t a, const int n)Unsigned saturating shift right narrow

Description

Unsigned saturating Shift Right Narrow (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each result by an immediate value, saturates each shifted result to a value that is half the original width, puts the final result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The results are truncated. For rounded results, see UQRSHRN.

A64 Instruction

UQSHRN Vd.8B,Vn.8H,#n

Argument Preparation

a → Vn.8H 

1 <= n <= 8

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64
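
FPSR.QC is sticky across lanes: it is set if any lane saturates. A minimal whole-vector sketch of the unsigned form, assuming an array-based lane layout, is shown below. The name uqshrn_u16_model is hypothetical, and its return value plays the role of the QC flag for this single operation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of UQSHRN Vd.8B,Vn.8H,#n.
   Returns true if at least one lane saturated. */
bool uqshrn_u16_model(const uint16_t a[8], int n, uint8_t out[8])
{
    assert(n >= 1 && n <= 8);
    bool qc = false;
    for (int e = 0; e < 8; e++) {
        uint32_t v = (uint32_t)a[e] >> n;   /* logical shift, truncating */
        if (v > UINT8_MAX) {
            v = UINT8_MAX;
            qc = true;
        }
        out[e] = (uint8_t)v;
    }
    return qc;
}
```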

uint16x4_t vqshrn_n_u32 (uint32x4_t a, const int n)Unsigned saturating shift right narrow

Description

A64 Instruction

UQSHRN Vd.4H,Vn.4S,#n

Argument Preparation

a → Vn.4S 

1 <= n <= 16

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint32x2_t vqshrn_n_u64 (uint64x2_t a, const int n)Unsigned saturating shift right narrow

Description

A64 Instruction

UQSHRN Vd.2S,Vn.2D,#n

Argument Preparation

a → Vn.2D 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

int8_t vqshrnh_n_s16 (int16_t a, const int n)Signed saturating shift right narrow

Description

A64 Instruction

SQSHRN Bd,Hn,#n

Argument Preparation

a → Hn 

1 <= n <= 8

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

int16_t vqshrns_n_s32 (int32_t a, const int n)Signed saturating shift right narrow

Description

A64 Instruction

SQSHRN Hd,Sn,#n

Argument Preparation

a → Sn 

1 <= n <= 16

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64
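
One plausible use of the scalar 32-bit to 16-bit form is narrowing the product of two Q15 fixed-point values back to Q15. The sketch below is an illustration under that assumption, not a documented usage pattern. The names sqshrn_s32_model and q15_mul_model are hypothetical, and the model ignores FPSR.QC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of vqshrns_n_s32 (SQSHRN Hd,Sn,#n). */
int16_t sqshrn_s32_model(int32_t a, int n)
{
    assert(n >= 1 && n <= 16);
    int32_t v = a < 0 ? ~(~a >> n) : a >> n;    /* floor shift, truncating */
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Multiplies two Q15 values; the Q30 product is narrowed back to Q15. */
int16_t q15_mul_model(int16_t x, int16_t y)
{
    int32_t product = (int32_t)x * (int32_t)y;
    return sqshrn_s32_model(product, 15);
}
```

The saturation matters for the single overflowing case, -1.0 times -1.0, whose exact result of +1.0 is not representable in Q15 and is clamped to 32767.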

int32_t vqshrnd_n_s64 (int64_t a, const int n)Signed saturating shift right narrow

Description

A64 Instruction

SQSHRN Sd,Dn,#n

Argument Preparation

a → Dn 

1 <= n <= 32

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint8_t vqshrnh_n_u16 (uint16_t a, const int n)Unsigned saturating shift right narrow

Description

A64 Instruction

UQSHRN Bd,Hn,#n

Argument Preparation

a → Hn 

1 <= n <= 8

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint16_t vqshrns_n_u32 (uint32_t a, const int n)Unsigned saturating shift right narrow

Description

A64 Instruction

UQSHRN Hd,Sn,#n

Argument Preparation

a → Sn 

1 <= n <= 16

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint32_t vqshrnd_n_u64 (uint64_t a, const int n)Unsigned saturating shift right narrow

Description

A64 Instruction

UQSHRN Sd,Dn,#n

Argument Preparation

a → Dn 

1 <= n <= 32

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

int8x16_t vqshrn_high_n_s16 (int8x8_t r, int16x8_t a, const int n)Signed saturating shift right narrow

Description

A64 Instruction

SQSHRN2 Vd.16B,Vn.8H,#n

Argument Preparation

r → Vd.8B 

a → Vn.8H 

1 <= n <= 8

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

int16x8_t vqshrn_high_n_s32 (int16x4_t r, int32x4_t a, const int n)Signed saturating shift right narrow

Description

A64 Instruction

SQSHRN2 Vd.8H,Vn.4S,#n

Argument Preparation

r → Vd.4H 

a → Vn.4S 

1 <= n <= 16

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

int32x4_t vqshrn_high_n_s64 (int32x2_t r, int64x2_t a, const int n)Signed saturating shift right narrow

Description

A64 Instruction

SQSHRN2 Vd.4S,Vn.2D,#n

Argument Preparation

r → Vd.2S 

a → Vn.2D 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint8x16_t vqshrn_high_n_u16 (uint8x8_t r, uint16x8_t a, const int n)Unsigned saturating shift right narrow

Description

A64 Instruction

UQSHRN2 Vd.16B,Vn.8H,#n

Argument Preparation

r → Vd.8B 

a → Vn.8H 

1 <= n <= 8

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint16x8_t vqshrn_high_n_u32 (uint16x4_t r, uint32x4_t a, const int n)Unsigned saturating shift right narrow

Description

A64 Instruction

UQSHRN2 Vd.8H,Vn.4S,#n

Argument Preparation

r → Vd.4H 

a → Vn.4S 

1 <= n <= 16

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint32x4_t vqshrn_high_n_u64 (uint32x2_t r, uint64x2_t a, const int n)Unsigned saturating shift right narrow

Description

A64 Instruction

UQSHRN2 Vd.4S,Vn.2D,#n

Argument Preparation

r → Vd.2S 

a → Vn.2D 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

int8x8_t vrshrn_n_s16 (int16x8_t a, const int n)Rounding shift right narrow

Description

Rounding Shift Right Narrow (immediate). This instruction reads each unsigned integer value from the vector in the source SIMD&FP register, right shifts each result by an immediate value, writes the final result to a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The destination vector elements are half as long as the source vector elements. The results are rounded. For truncated results, see SHRN.

A64 Instruction

RSHRN Vd.8B,Vn.8H,#n

Argument Preparation

a → Vn.8H 

1 <= n <= 8

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64
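
Unlike the saturating forms, the pseudocode for RSHRN keeps only the low esize bits of the rounded value, so out-of-range results wrap instead of clamping. This is also why the signed and unsigned intrinsics map to the same instruction. A minimal per-lane sketch on raw bit patterns follows; the name rshrn_lane_bits16 is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one lane of RSHRN Vd.8B,Vn.8H,#n, operating on the
   raw 16-bit pattern exactly as the pseudocode does via UInt(). */
uint8_t rshrn_lane_bits16(uint16_t bits, int n)
{
    assert(n >= 1 && n <= 8);
    uint32_t round_const = 1u << (n - 1);
    uint32_t element = ((uint32_t)bits + round_const) >> n;
    return (uint8_t)(element & 0xFFu);      /* element<esize-1:0> */
}
```

For the input pattern 0x7FFF with n = 8, rounding carries into bit 7 and the lane becomes 0x80, which reads as -128 when the result type is signed. The saturating rounded form would clamp the same input to 127.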

int16x4_t vrshrn_n_s32 (int32x4_t a, const int n)Rounding shift right narrow

Description

A64 Instruction

RSHRN Vd.4H,Vn.4S,#n

Argument Preparation

a → Vn.4S 

1 <= n <= 16

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

int32x2_t vrshrn_n_s64 (int64x2_t a, const int n)Rounding shift right narrow

Description

A64 Instruction

RSHRN Vd.2S,Vn.2D,#n

Argument Preparation

a → Vn.2D 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint8x8_t vrshrn_n_u16 (uint16x8_t a, const int n)Rounding shift right narrow

Description

A64 Instruction

RSHRN Vd.8B,Vn.8H,#n

Argument Preparation

a → Vn.8H 

1 <= n <= 8

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint16x4_t vrshrn_n_u32 (uint32x4_t a, const int n)Rounding shift right narrow

Description

A64 Instruction

RSHRN Vd.4H,Vn.4S,#n

Argument Preparation

a → Vn.4S 

1 <= n <= 16

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint32x2_t vrshrn_n_u64 (uint64x2_t a, const int n)Rounding shift right narrow

Description

A64 Instruction

RSHRN Vd.2S,Vn.2D,#n

Argument Preparation

a → Vn.2D 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

int8x16_t vrshrn_high_n_s16 (int8x8_t r, int16x8_t a, const int n)Rounding shift right narrow

Description

A64 Instruction

RSHRN2 Vd.16B,Vn.8H,#n

Argument Preparation

r → Vd.8B 

a → Vn.8H 

1 <= n <= 8

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

A64

int16x8_t vrshrn_high_n_s32 (int16x4_t r, int32x4_t a, const int n)Rounding shift right narrow

Description

A64 Instruction

RSHRN2 Vd.8H,Vn.4S,#n

Argument Preparation

r → Vd.4H 

a → Vn.4S 

1 <= n <= 16

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

A64

int32x4_t vrshrn_high_n_s64 (int32x2_t r, int64x2_t a, const int n)Rounding shift right narrow

Description

A64 Instruction

RSHRN2 Vd.4S,Vn.2D,#n

Argument Preparation

r → Vd.2S 

a → Vn.2D 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

A64

uint8x16_t vrshrn_high_n_u16 (uint8x8_t r, uint16x8_t a, const int n)Rounding shift right narrow

Description

A64 Instruction

RSHRN2 Vd.16B,Vn.8H,#n

Argument Preparation

r → Vd.8B 

a → Vn.8H 

1 <= n <= 8

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

A64

uint16x8_t vrshrn_high_n_u32 (uint16x4_t r, uint32x4_t a, const int n)Rounding shift right narrow

Description

A64 Instruction

RSHRN2 Vd.8H,Vn.4S,#n

Argument Preparation

r → Vd.4H 

a → Vn.4S 

1 <= n <= 16

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

A64

uint32x4_t vrshrn_high_n_u64 (uint32x2_t r, uint64x2_t a, const int n)Rounding shift right narrow

Description

A64 Instruction

RSHRN2 Vd.4S,Vn.2D,#n

Argument Preparation

r → Vd.2S 

a → Vn.2D 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;

for e = 0 to elements-1
    element = (UInt(Elem[operand, e, 2*esize]) + round_const) >> shift;
    Elem[result, e, esize] = element<esize-1:0>;

Vpart[d, part] = result;

Supported architectures

A64

int8x8_t vqrshrn_n_s16 (int16x8_t a, const int n)Signed saturating rounded shift right narrow

Description

Signed saturating Rounded Shift Right Narrow (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each element by an immediate value, saturates each shifted result to a value that is half the original width, puts the final result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. All the values in this instruction are signed integer values. The destination vector elements are half as long as the source vector elements. The results are rounded. If saturation occurs, the cumulative saturation bit FPSR.QC is set. For truncated results, see SQSHRN.

A64 Instruction

SQRSHRN Vd.8B,Vn.8H,#n

Argument Preparation

a → Vn.8H 

1 <= n <= 8

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64
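
As an illustration of the rounding and saturation steps described above, here is a minimal single-lane sketch in portable C. The names `asr32` and `sqrshrn_s16_lane_model` are hypothetical and are not part of any Arm header; the `qc` argument stands in for the sticky FPSR.QC bit. The helper exists only because right-shifting a negative value is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic shift right, written to avoid implementation-defined
   behaviour when v is negative. */
static int32_t asr32(int32_t v, int n)
{
    return v < 0 ? ~(~v >> n) : v >> n;
}

/* Hypothetical scalar model of one lane of vqrshrn_n_s16.
   Sets *qc to 1 on saturation and otherwise leaves it alone (sticky flag). */
int8_t sqrshrn_s16_lane_model(int16_t x, int n, int *qc)
{
    assert(n >= 1 && n <= 8);
    int32_t element = asr32((int32_t)x + (1 << (n - 1)), n);
    if (element > INT8_MAX) { *qc = 1; return INT8_MAX; }
    if (element < INT8_MIN) { *qc = 1; return INT8_MIN; }
    return (int8_t)element;
}
```

For example, 32767 shifted by 8 rounds up to 128, which does not fit in 8 signed bits, so the model returns 127 and raises the flag; -32768 shifted by 8 gives exactly -128 and does not saturate.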

int16x4_t vqrshrn_n_s32 (int32x4_t a, const int n)Signed saturating rounded shift right narrow

Description

A64 Instruction

SQRSHRN Vd.4H,Vn.4S,#n

Argument Preparation

a → Vn.4S 

1 <= n <= 16

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

int32x2_t vqrshrn_n_s64 (int64x2_t a, const int n)Signed saturating rounded shift right narrow

Description

A64 Instruction

SQRSHRN Vd.2S,Vn.2D,#n

Argument Preparation

a → Vn.2D 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint8x8_t vqrshrn_n_u16 (uint16x8_t a, const int n)Unsigned saturating rounded shift right narrow

Description

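Unsigned saturating Rounded Shift Right Narrow (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each element by an immediate value, saturates each shifted result to a value that is half the original width, puts the final result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. All the values in this instruction are unsigned integer values. The destination vector elements are half as long as the source vector elements. The results are rounded. If saturation occurs, the cumulative saturation bit FPSR.QC is set. For truncated results, see UQSHRN.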

A64 Instruction

UQRSHRN Vd.8B,Vn.8H,#n

Argument Preparation

a → Vn.8H 

1 <= n <= 8

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64
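
The unsigned form can be sketched the same way over a whole 8-lane vector. This is an illustrative model in plain C, assuming `shift` equals `n` and `round` is true; `uqrshrn_u16_model` is a made-up name, and its return value plays the role that FPSR.QC plays in the pseudocode.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vqrshrn_n_u16 over all eight lanes.
   Returns 1 if any lane saturated, else 0. */
int uqrshrn_u16_model(uint8_t out[8], const uint16_t a[8], int n)
{
    assert(n >= 1 && n <= 8);
    int sat = 0;
    for (int e = 0; e < 8; e++) {
        /* Widen to 32 bits so that adding the rounding constant cannot wrap. */
        uint32_t element = ((uint32_t)a[e] + (1u << (n - 1))) >> n;
        if (element > UINT8_MAX) {
            element = UINT8_MAX;            /* clamp instead of truncating */
            sat = 1;
        }
        out[e] = (uint8_t)element;
    }
    return sat;
}
```

With n = 1, a lane holding 510 yields 255 without saturating, whereas 511 rounds to 256 and is clamped to 255.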

uint16x4_t vqrshrn_n_u32 (uint32x4_t a, const int n)Unsigned saturating rounded shift right narrow

Description

A64 Instruction

UQRSHRN Vd.4H,Vn.4S,#n

Argument Preparation

a → Vn.4S 

1 <= n <= 16

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint32x2_t vqrshrn_n_u64 (uint64x2_t a, const int n)Unsigned saturating rounded shift right narrow

Description

A64 Instruction

UQRSHRN Vd.2S,Vn.2D,#n

Argument Preparation

a → Vn.2D 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

int8_t vqrshrnh_n_s16 (int16_t a, const int n)Signed saturating rounded shift right narrow

Description

A64 Instruction

SQRSHRN Bd,Hn,#n

Argument Preparation

a → Hn 

1 <= n <= 8

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

int16_t vqrshrns_n_s32 (int32_t a, const int n)Signed saturating rounded shift right narrow

Description

A64 Instruction

SQRSHRN Hd,Sn,#n

Argument Preparation

a → Sn 

1 <= n <= 16

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

int32_t vqrshrnd_n_s64 (int64_t a, const int n)Signed saturating rounded shift right narrow

Description

A64 Instruction

SQRSHRN Sd,Dn,#n

Argument Preparation

a → Dn 

1 <= n <= 32

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint8_t vqrshrnh_n_u16 (uint16_t a, const int n)Unsigned saturating rounded shift right narrow

Description

A64 Instruction

UQRSHRN Bd,Hn,#n

Argument Preparation

a → Hn 

1 <= n <= 8

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint16_t vqrshrns_n_u32 (uint32_t a, const int n)Unsigned saturating rounded shift right narrow

Description

A64 Instruction

UQRSHRN Hd,Sn,#n

Argument Preparation

a → Sn 

1 <= n <= 16

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint32_t vqrshrnd_n_u64 (uint64_t a, const int n)Unsigned saturating rounded shift right narrow

Description

A64 Instruction

UQRSHRN Sd,Dn,#n

Argument Preparation

a → Dn 

1 <= n <= 32

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

int8x16_t vqrshrn_high_n_s16 (int8x8_t r, int16x8_t a, const int n)Signed saturating rounded shift right narrow

Description

A64 Instruction

SQRSHRN2 Vd.16B,Vn.8H,#n

Argument Preparation

r → Vd.8B 

a → Vn.8H 

1 <= n <= 8

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

int16x8_t vqrshrn_high_n_s32 (int16x4_t r, int32x4_t a, const int n)Signed saturating rounded shift right narrow

Description

A64 Instruction

SQRSHRN2 Vd.8H,Vn.4S,#n

Argument Preparation

r → Vd.4H 

a → Vn.4S 

1 <= n <= 16

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

int32x4_t vqrshrn_high_n_s64 (int32x2_t r, int64x2_t a, const int n)Signed saturating rounded shift right narrow

Description

A64 Instruction

SQRSHRN2 Vd.4S,Vn.2D,#n

Argument Preparation

r → Vd.2S 

a → Vn.2D 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint8x16_t vqrshrn_high_n_u16 (uint8x8_t r, uint16x8_t a, const int n)Unsigned saturating rounded shift right narrow

Description

A64 Instruction

UQRSHRN2 Vd.16B,Vn.8H,#n

Argument Preparation

r → Vd.8B 

a → Vn.8H 

1 <= n <= 8

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint16x8_t vqrshrn_high_n_u32 (uint16x4_t r, uint32x4_t a, const int n)Unsigned saturating rounded shift right narrow

Description

A64 Instruction

UQRSHRN2 Vd.8H,Vn.4S,#n

Argument Preparation

r → Vd.4H 

a → Vn.4S 

1 <= n <= 16

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint32x4_t vqrshrn_high_n_u64 (uint32x2_t r, uint64x2_t a, const int n)Unsigned saturating rounded shift right narrow

Description

A64 Instruction

UQRSHRN2 Vd.4S,Vn.2D,#n

Argument Preparation

r → Vd.2S 

a → Vn.2D 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize*2) operand = V[n];
bits(datasize) result;
integer round_const = if round then (1 << (shift - 1)) else 0;
integer element;
boolean sat;

for e = 0 to elements-1
    element = (Int(Elem[operand, e, 2*esize], unsigned) + round_const) >> shift;
    (Elem[result, e, esize], sat) = SatQ(element, esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

int16x8_t vshll_n_s8 (int8x8_t a, const int n)Signed shift left long

Description

Signed Shift Left Long (immediate). This instruction reads each vector element from the source SIMD&FP register, left shifts each vector element by the specified shift amount, places the result into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements. All the values in this instruction are signed integer values.

A64 Instruction

SSHLL Vd.8H,Vn.8B,#n

Argument Preparation

a → Vn.8B 

0 <= n <= 7

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
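
A rough scalar equivalent of the widening shift described above, in portable C. The function name `sshll_s8_model` and the array parameters are illustrative assumptions, not an Arm API. Each signed 8-bit lane is sign-extended to 16 bits and then scaled by 2^n.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vshll_n_s8 (SSHLL form, 0 <= n <= 7). */
void sshll_s8_model(int16_t out[8], const int8_t a[8], int n)
{
    assert(n >= 0 && n <= 7);
    for (int e = 0; e < 8; e++) {
        /* Multiply instead of '<<' because left-shifting a negative value is
           undefined behaviour in C; the product always fits in 16 bits. */
        out[e] = (int16_t)(a[e] * (1 << n));
    }
}
```

Because the destination lanes are twice as wide, no input can overflow: the extreme case -128 with n = 7 gives -16384.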

int32x4_t vshll_n_s16 (int16x4_t a, const int n)Signed shift left long

Description

A64 Instruction

SSHLL Vd.4S,Vn.4H,#n

Argument Preparation

a → Vn.4H 

0 <= n <= 15

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vshll_n_s32 (int32x2_t a, const int n)Signed shift left long

Description

A64 Instruction

SSHLL Vd.2D,Vn.2S,#n

Argument Preparation

a → Vn.2S 

0 <= n <= 31

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vshll_n_u8 (uint8x8_t a, const int n)Unsigned shift left long

Description

Unsigned Shift Left Long (immediate). This instruction reads each vector element in the lower or upper half of the source SIMD&FP register, shifts the unsigned integer value left by the specified number of bits, places the result into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements.

A64 Instruction

USHLL Vd.8H,Vn.8B,#n

Argument Preparation

a → Vn.8B 

0 <= n <= 7

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
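
For comparison with the signed form, a one-lane sketch of the unsigned widening shift in plain C. `ushll_u8_lane_model` is a hypothetical helper name; the only difference from a signed model is that the source byte is zero-extended rather than sign-extended before shifting.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one lane of vshll_n_u8 (USHLL form). */
uint16_t ushll_u8_lane_model(uint8_t x, int n)
{
    assert(n >= 0 && n <= 7);
    /* Zero-extend first, then shift; 255 << 7 still fits in 16 bits. */
    return (uint16_t)((unsigned)x << n);
}
```

The byte 0x80 shifted by 1 gives 256 here, whereas the same bit pattern read as a signed value (-128) would widen to -256.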

uint32x4_t vshll_n_u16 (uint16x4_t a, const int n)Unsigned shift left long

Description

A64 Instruction

USHLL Vd.4S,Vn.4H,#n

Argument Preparation

a → Vn.4H 

0 <= n <= 15

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vshll_n_u32 (uint32x2_t a, const int n)Unsigned shift left long

Description

A64 Instruction

USHLL Vd.2D,Vn.2S,#n

Argument Preparation

a → Vn.2S 

0 <= n <= 31

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vshll_high_n_s8 (int8x16_t a, const int n)Signed shift left long

Description

A64 Instruction

SSHLL2 Vd.8H,Vn.16B,#n

Argument Preparation

a → Vn.16B 

0 <= n <= 7

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

A64
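
The `_high` variant differs only in which source lanes it reads: `Vpart[n, part]` in the pseudocode selects the upper 64 bits of the 128-bit input. A minimal sketch in portable C, assuming `part` is 1 for the SSHLL2 form; `sshll2_s8_model` is a made-up name for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vshll_high_n_s8 (SSHLL2 form). */
void sshll2_s8_model(int16_t out[8], const int8_t a[16], int n)
{
    assert(n >= 0 && n <= 7);
    for (int e = 0; e < 8; e++) {
        /* Read lanes 8..15 only; lanes 0..7 of a are ignored.
           Multiply rather than shift to stay defined for negative lanes. */
        out[e] = (int16_t)(a[8 + e] * (1 << n));
    }
}
```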

int32x4_t vshll_high_n_s16 (int16x8_t a, const int n)Signed shift left long

Description

A64 Instruction

SSHLL2 Vd.4S,Vn.8H,#n

Argument Preparation

a → Vn.8H 

0 <= n <= 15

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int64x2_t vshll_high_n_s32 (int32x4_t a, const int n)Signed shift left long

Description

A64 Instruction

SSHLL2 Vd.2D,Vn.4S,#n

Argument Preparation

a → Vn.4S 

0 <= n <= 31

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint16x8_t vshll_high_n_u8 (uint8x16_t a, const int n)Unsigned shift left long

Description

A64 Instruction

USHLL2 Vd.8H,Vn.16B,#n

Argument Preparation

a → Vn.16B 

0 <= n <= 7

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint32x4_t vshll_high_n_u16 (uint16x8_t a, const int n)Unsigned shift left long

Description

A64 Instruction

USHLL2 Vd.4S,Vn.8H,#n

Argument Preparation

a → Vn.8H 

0 <= n <= 15

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint64x2_t vshll_high_n_u32 (uint32x4_t a, const int n)Unsigned shift left long

Description

A64 Instruction

USHLL2 Vd.2D,Vn.4S,#n

Argument Preparation

a → Vn.4S 

0 <= n <= 31

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int16x8_t vshll_n_s8 (int8x8_t a, const int n)Shift left long

Description

Shift Left Long (by element size). This instruction reads each vector element in the lower or upper half of the source SIMD&FP register, left shifts each element by the element size, writes the final result to a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements.

A64 Instruction

SHLL Vd.8H,Vn.8B,#n

Argument Preparation

a → Vn.8B 

8 <= n <= 8

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(2*datasize) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
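
When the shift amount equals the source element size, every source bit lands in the upper half of the widened lane and the lower half is zero. The following plain C sketch models that case for 8-bit lanes; `shll_s8_model` is a hypothetical name, and `n` is accepted only so that the single permitted value can be checked.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vshll_n_s8 when n == 8 (SHLL form). */
void shll_s8_model(int16_t out[8], const int8_t a[8], int n)
{
    assert(n == 8);                         /* the only value this form accepts */
    for (int e = 0; e < 8; e++) {
        /* a[e] * 256 moves the byte into bits 15:8; the extremes
           -128 * 256 = -32768 and 127 * 256 = 32512 both fit in int16_t. */
        out[e] = (int16_t)(a[e] * (1 << n));
    }
}
```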

int32x4_t vshll_n_s16 (int16x4_t a, const int n)Shift left long

Description

A64 Instruction

SHLL Vd.4S,Vn.4H,#n

Argument Preparation

a → Vn.4H 

16 <= n <= 16

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(2*datasize) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vshll_n_s32 (int32x2_t a, const int n)Shift left long

Description

A64 Instruction

SHLL Vd.2D,Vn.2S,#n

Argument Preparation

a → Vn.2S 

32 <= n <= 32

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(2*datasize) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vshll_n_u8 (uint8x8_t a, const int n)Shift left long

Description

A64 Instruction

SHLL Vd.8H,Vn.8B,#n

Argument Preparation

a → Vn.8B 

8 <= n <= 8

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(2*datasize) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vshll_n_u16 (uint16x4_t a, const int n)Shift left long

Description

A64 Instruction

SHLL Vd.4S,Vn.4H,#n

Argument Preparation

a → Vn.4H 

16 <= n <= 16

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(2*datasize) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vshll_n_u32 (uint32x2_t a, const int n)Shift left long

Description

A64 Instruction

SHLL Vd.2D,Vn.2S,#n

Argument Preparation

a → Vn.2S 

32 <= n <= 32

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(2*datasize) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vshll_high_n_s8 (int8x16_t a, const int n)Shift left long

Description

A64 Instruction

SHLL2 Vd.8H,Vn.16B,#n

Argument Preparation

a → Vn.16B 

8 <= n <= 8

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(2*datasize) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int32x4_t vshll_high_n_s16 (int16x8_t a, const int n)Shift left long

Description

A64 Instruction

SHLL2 Vd.4S,Vn.8H,#n

Argument Preparation

a → Vn.8H 

16 <= n <= 16

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(2*datasize) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int64x2_t vshll_high_n_s32 (int32x4_t a, const int n)Shift left long

Description

A64 Instruction

SHLL2 Vd.2D,Vn.4S,#n

Argument Preparation

a → Vn.4S 

32 <= n <= 32

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(2*datasize) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint16x8_t vshll_high_n_u8 (uint8x16_t a, const int n)Shift left long

Description

A64 Instruction

SHLL2 Vd.8H,Vn.16B,#n

Argument Preparation

a → Vn.16B 

8 <= n <= 8

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(2*datasize) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint32x4_t vshll_high_n_u16 (uint16x8_t a, const int n)Shift left long

Description

A64 Instruction

SHLL2 Vd.4S,Vn.8H,#n

Argument Preparation

a → Vn.8H 

16 <= n <= 16

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(2*datasize) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint64x2_t vshll_high_n_u32 (uint32x4_t a, const int n)Shift left long

Description

A64 Instruction

SHLL2 Vd.2D,Vn.4S,#n

Argument Preparation

a → Vn.4S 

32 <= n <= 32

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(2*datasize) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int8x8_t vsri_n_s8 (int8x8_t a, int8x8_t b, const int n)Shift right and insert

Description

Shift Right and Insert (immediate). This instruction reads each vector element in the source SIMD&FP register, right shifts each vector element by an immediate value, and inserts the result into the corresponding vector element in the destination SIMD&FP register such that the new zero bits created by the shift are not inserted but retain their existing value. Bits shifted out of the right of each vector element of the source register are lost.

A64 Instruction

SRI Vd.8B,Vn.8B,#n

Argument Preparation

a → Vd.8B 

b → Vn.8B 

1 <= n <= 8

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64
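
The mask-and-merge step in the pseudocode above can be illustrated with a single 8-bit lane in portable C. The helper name `sri_u8_lane_model` is hypothetical. The sketch works on raw bit patterns (`uint8_t`), on the assumption that the insert is purely bitwise, so the same model would apply to the signed and unsigned intrinsic variants.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one lane of vsri_n_s8 / vsri_n_u8.
   a is the destination's previous value, b is the source to shift in. */
uint8_t sri_u8_lane_model(uint8_t a, uint8_t b, int n)
{
    assert(n >= 1 && n <= 8);
    uint8_t mask = (uint8_t)(0xFFu >> n);   /* low bits that receive b >> n */
    uint8_t shifted = (uint8_t)(b >> n);    /* logical shift; low n bits of b are lost */
    /* Keep the top n bits of a, replace the rest with the shifted source. */
    return (uint8_t)((a & (uint8_t)~mask) | shifted);
}
```

With n = 4, the result takes its high nibble from `a` and its low nibble from the high nibble of `b`. With n = 8 the mask is empty and `a` is returned unchanged.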

int8x16_t vsriq_n_s8 (int8x16_t a, int8x16_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Vd.16B,Vn.16B,#n

Argument Preparation

a → Vd.16B 

b → Vn.16B 

1 <= n <= 8

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vsri_n_s16 (int16x4_t a, int16x4_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Vd.4H,Vn.4H,#n

Argument Preparation

a → Vd.4H 

b → Vn.4H 

1 <= n <= 16

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vsriq_n_s16 (int16x8_t a, int16x8_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Vd.8H,Vn.8H,#n

Argument Preparation

a → Vd.8H 

b → Vn.8H 

1 <= n <= 16

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vsri_n_s32 (int32x2_t a, int32x2_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Vd.2S,Vn.2S,#n

Argument Preparation

a → Vd.2S 

b → Vn.2S 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vsriq_n_s32 (int32x4_t a, int32x4_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Vd.4S,Vn.4S,#n

Argument Preparation

a → Vd.4S 

b → Vn.4S 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vsri_n_s64 (int64x1_t a, int64x1_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Dd,Dn,#n

Argument Preparation

a → Dd 

b → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vsriq_n_s64 (int64x2_t a, int64x2_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Vd.2D,Vn.2D,#n

Argument Preparation

a → Vd.2D 

b → Vn.2D 

1 <= n <= 64

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vsri_n_u8 (uint8x8_t a, uint8x8_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Vd.8B,Vn.8B,#n

Argument Preparation

a → Vd.8B 

b → Vn.8B 

1 <= n <= 8

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64
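
The `for e = 0 to elements-1` loop in the Operation section applies the same insert to every lane. As an illustration only, the sketch below models the eight-lane byte form on plain arrays; `sri_8b` is a hypothetical name, and the in-place update of `d` is an assumption made to mirror the instruction reading and writing the same destination register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models SRI Vd.8B,Vn.8B,#n on plain arrays.
   d holds the destination lanes (argument a) and is updated in place;
   b holds the source lanes. */
void sri_8b(uint8_t d[8], const uint8_t b[8], int n)
{
    assert(n >= 1 && n <= 8);
    /* Work in unsigned int so that n == 8 is a well-defined shift. */
    unsigned mask = 0xFFu >> n;
    for (int e = 0; e < 8; e++) {
        unsigned shifted = (unsigned)b[e] >> n;
        d[e] = (uint8_t)((d[e] & ~mask) | shifted);
    }
}
```

With `n` equal to 4, each result byte takes its high nibble from the old destination lane and its low nibble from the high nibble of the source lane.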

uint8x16_t vsriq_n_u8 (uint8x16_t a, uint8x16_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Vd.16B,Vn.16B,#n

Argument Preparation

a → Vd.16B 

b → Vn.16B 

1 <= n <= 8

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vsri_n_u16 (uint16x4_t a, uint16x4_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Vd.4H,Vn.4H,#n

Argument Preparation

a → Vd.4H 

b → Vn.4H 

1 <= n <= 16

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vsriq_n_u16 (uint16x8_t a, uint16x8_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Vd.8H,Vn.8H,#n

Argument Preparation

a → Vd.8H 

b → Vn.8H 

1 <= n <= 16

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vsri_n_u32 (uint32x2_t a, uint32x2_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Vd.2S,Vn.2S,#n

Argument Preparation

a → Vd.2S 

b → Vn.2S 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vsriq_n_u32 (uint32x4_t a, uint32x4_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Vd.4S,Vn.4S,#n

Argument Preparation

a → Vd.4S 

b → Vn.4S 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vsri_n_u64 (uint64x1_t a, uint64x1_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Dd,Dn,#n

Argument Preparation

a → Dd 

b → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vsriq_n_u64 (uint64x2_t a, uint64x2_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Vd.2D,Vn.2D,#n

Argument Preparation

a → Vd.2D 

b → Vn.2D 

1 <= n <= 64

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

poly64x1_t vsri_n_p64 (poly64x1_t a, poly64x1_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Dd,Dn,#n

Argument Preparation

a → Dd 

b → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

A32/A64

poly64x2_t vsriq_n_p64 (poly64x2_t a, poly64x2_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Vd.2D,Vn.2D,#n

Argument Preparation

a → Vd.2D 

b → Vn.2D 

1 <= n <= 64

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

A32/A64

poly8x8_t vsri_n_p8 (poly8x8_t a, poly8x8_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Vd.8B,Vn.8B,#n

Argument Preparation

a → Vd.8B 

b → Vn.8B 

1 <= n <= 8

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

poly8x16_t vsriq_n_p8 (poly8x16_t a, poly8x16_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Vd.16B,Vn.16B,#n

Argument Preparation

a → Vd.16B 

b → Vn.16B 

1 <= n <= 8

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

poly16x4_t vsri_n_p16 (poly16x4_t a, poly16x4_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Vd.4H,Vn.4H,#n

Argument Preparation

a → Vd.4H 

b → Vn.4H 

1 <= n <= 16

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

poly16x8_t vsriq_n_p16 (poly16x8_t a, poly16x8_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Vd.8H,Vn.8H,#n

Argument Preparation

a → Vd.8H 

b → Vn.8H 

1 <= n <= 16

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

int64_t vsrid_n_s64 (int64_t a, int64_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Dd,Dn,#n

Argument Preparation

a → Dd 

b → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

A64

uint64_t vsrid_n_u64 (uint64_t a, uint64_t b, const int n)Shift right and insert

Description

A64 Instruction

SRI Dd,Dn,#n

Argument Preparation

a → Dd 

b → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSR(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSR(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

A64

int8x8_t vsli_n_s8 (int8x8_t a, int8x8_t b, const int n)Shift left and insert

Description

Shift Left and Insert (immediate). This instruction reads each vector element in the source SIMD&FP register, left shifts each vector element by an immediate value, and inserts the result into the corresponding vector element in the destination SIMD&FP register. The low-order bits that the shift would otherwise fill with zeros are not written, so those bits of each destination element keep their existing value. Bits shifted out of the left of each vector element in the source register are lost.

A64 Instruction

SLI Vd.8B,Vn.8B,#n

Argument Preparation

a → Vd.8B 

b → Vn.8B 

0 <= n <= 7

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64
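
To make the insert behaviour described above concrete, here is a minimal scalar sketch of one 8-bit lane in portable C. The helper name `sli_lane_u8` is hypothetical; it follows the Operation pseudocode (`mask = LSL(Ones(esize), shift)`) and is not the intrinsic itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: scalar model of one 8-bit lane of SLI.
   a is the destination lane (Vd), b is the source lane (Vn). */
uint8_t sli_lane_u8(uint8_t a, uint8_t b, int n)
{
    assert(n >= 0 && n <= 7);
    uint8_t mask = (uint8_t)(0xFFu << n);          /* LSL(Ones(8), n)        */
    uint8_t shifted = (uint8_t)((unsigned)b << n); /* high bits of b are lost */
    return (uint8_t)((a & (uint8_t)~mask) | shifted);
}
```

The low `n` bits of `a` pass through unchanged, and with `n` equal to 0 the whole lane is replaced by `b`.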

int8x16_t vsliq_n_s8 (int8x16_t a, int8x16_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Vd.16B,Vn.16B,#n

Argument Preparation

a → Vd.16B 

b → Vn.16B 

0 <= n <= 7

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vsli_n_s16 (int16x4_t a, int16x4_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Vd.4H,Vn.4H,#n

Argument Preparation

a → Vd.4H 

b → Vn.4H 

0 <= n <= 15

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vsliq_n_s16 (int16x8_t a, int16x8_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Vd.8H,Vn.8H,#n

Argument Preparation

a → Vd.8H 

b → Vn.8H 

0 <= n <= 15

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vsli_n_s32 (int32x2_t a, int32x2_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Vd.2S,Vn.2S,#n

Argument Preparation

a → Vd.2S 

b → Vn.2S 

0 <= n <= 31

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vsliq_n_s32 (int32x4_t a, int32x4_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Vd.4S,Vn.4S,#n

Argument Preparation

a → Vd.4S 

b → Vn.4S 

0 <= n <= 31

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vsli_n_s64 (int64x1_t a, int64x1_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Dd,Dn,#n

Argument Preparation

a → Dd 

b → Dn 

0 <= n <= 63

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vsliq_n_s64 (int64x2_t a, int64x2_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Vd.2D,Vn.2D,#n

Argument Preparation

a → Vd.2D 

b → Vn.2D 

0 <= n <= 63

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vsli_n_u8 (uint8x8_t a, uint8x8_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Vd.8B,Vn.8B,#n

Argument Preparation

a → Vd.8B 

b → Vn.8B 

0 <= n <= 7

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vsliq_n_u8 (uint8x16_t a, uint8x16_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Vd.16B,Vn.16B,#n

Argument Preparation

a → Vd.16B 

b → Vn.16B 

0 <= n <= 7

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vsli_n_u16 (uint16x4_t a, uint16x4_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Vd.4H,Vn.4H,#n

Argument Preparation

a → Vd.4H 

b → Vn.4H 

0 <= n <= 15

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vsliq_n_u16 (uint16x8_t a, uint16x8_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Vd.8H,Vn.8H,#n

Argument Preparation

a → Vd.8H 

b → Vn.8H 

0 <= n <= 15

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64
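
One way to see why the insert form is useful is bit-field packing. The sketch below is an illustration under stated assumptions, not something this reference prescribes: it models a 16-bit SLI lane in portable C and chains two inserts to pack 5-bit red, 6-bit green, and 5-bit blue values into one RGB565 pixel. Both `sli_lane_u16` and `pack_rgb565` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: scalar model of one 16-bit lane of SLI. */
uint16_t sli_lane_u16(uint16_t a, uint16_t b, int n)
{
    assert(n >= 0 && n <= 15);
    uint16_t mask = (uint16_t)(0xFFFFu << n);       /* LSL(Ones(16), n) */
    uint16_t shifted = (uint16_t)((unsigned)b << n);
    return (uint16_t)((a & (uint16_t)~mask) | shifted);
}

/* Hypothetical example: pack r (5 bits), g (6 bits), b (5 bits). */
uint16_t pack_rgb565(uint16_t r5, uint16_t g6, uint16_t b5)
{
    uint16_t px = b5;                  /* blue occupies bits 0..4      */
    px = sli_lane_u16(px, g6, 5);      /* green inserted at bits 5..10 */
    px = sli_lane_u16(px, r5, 11);     /* red inserted at bits 11..15  */
    return px;
}
```

Because each insert overwrites every bit at or above position `n`, stray high bits in the earlier fields are replaced by the later inserts, so this model needs no separate masking step.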

uint32x2_t vsli_n_u32 (uint32x2_t a, uint32x2_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Vd.2S,Vn.2S,#n

Argument Preparation

a → Vd.2S 

b → Vn.2S 

0 <= n <= 31

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vsliq_n_u32 (uint32x4_t a, uint32x4_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Vd.4S,Vn.4S,#n

Argument Preparation

a → Vd.4S 

b → Vn.4S 

0 <= n <= 31

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vsli_n_u64 (uint64x1_t a, uint64x1_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Dd,Dn,#n

Argument Preparation

a → Dd 

b → Dn 

0 <= n <= 63

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vsliq_n_u64 (uint64x2_t a, uint64x2_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Vd.2D,Vn.2D,#n

Argument Preparation

a → Vd.2D 

b → Vn.2D 

0 <= n <= 63

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

poly64x1_t vsli_n_p64 (poly64x1_t a, poly64x1_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Dd,Dn,#n

Argument Preparation

a → Dd 

b → Dn 

0 <= n <= 63

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

A32/A64

poly64x2_t vsliq_n_p64 (poly64x2_t a, poly64x2_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Vd.2D,Vn.2D,#n

Argument Preparation

a → Vd.2D 

b → Vn.2D 

0 <= n <= 63

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

A32/A64

poly8x8_t vsli_n_p8 (poly8x8_t a, poly8x8_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Vd.8B,Vn.8B,#n

Argument Preparation

a → Vd.8B 

b → Vn.8B 

0 <= n <= 7

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

poly8x16_t vsliq_n_p8 (poly8x16_t a, poly8x16_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Vd.16B,Vn.16B,#n

Argument Preparation

a → Vd.16B 

b → Vn.16B 

0 <= n <= 7

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

poly16x4_t vsli_n_p16 (poly16x4_t a, poly16x4_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Vd.4H,Vn.4H,#n

Argument Preparation

a → Vd.4H 

b → Vn.4H 

0 <= n <= 15

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

poly16x8_t vsliq_n_p16 (poly16x8_t a, poly16x8_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Vd.8H,Vn.8H,#n

Argument Preparation

a → Vd.8H 

b → Vn.8H 

0 <= n <= 15

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

v7/A32/A64

int64_t vslid_n_s64 (int64_t a, int64_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Dd,Dn,#n

Argument Preparation

a → Dd 

b → Dn 

0 <= n <= 63

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

A64

uint64_t vslid_n_u64 (uint64_t a, uint64_t b, const int n)Shift left and insert

Description

A64 Instruction

SLI Dd,Dn,#n

Argument Preparation

a → Dd 

b → Dn 

0 <= n <= 63

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) operand2 = V[d];
bits(datasize) result;
bits(esize) mask = LSL(Ones(esize), shift);
bits(esize) shifted;

for e = 0 to elements-1
    shifted = LSL(Elem[operand, e, esize], shift);
    Elem[result, e, esize] = (Elem[operand2, e, esize] AND NOT(mask)) OR shifted;
V[d] = result;

Supported architectures

A64
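
As a further illustration of the 64-bit scalar form, a rotate left can be expressed as a plain right shift followed by one shift-left-and-insert. The sketch below models that in portable C; `sli_d` and `rotl64_via_sli` are hypothetical names, and the pairing with a right shift is an assumption chosen for the example rather than something stated in this reference.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: scalar model of SLI Dd,Dn,#n. */
uint64_t sli_d(uint64_t a, uint64_t b, int n)
{
    assert(n >= 0 && n <= 63);
    uint64_t mask = UINT64_MAX << n;   /* LSL(Ones(64), n) */
    return (a & ~mask) | (b << n);
}

/* Hypothetical example: rotate x left by n bits, 1 <= n <= 63. */
uint64_t rotl64_via_sli(uint64_t x, int n)
{
    assert(n >= 1 && n <= 63);
    uint64_t wrapped = x >> (64 - n);  /* top n bits moved to the bottom */
    return sli_d(wrapped, x, n);       /* insert x << n above them       */
}
```

The right shift supplies the `n` low bits that the insert preserves, and the insert supplies the remaining `64 - n` bits.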

int32x2_t vcvt_s32_f32 (float32x2_t a)Floating-point convert to signed integer, rounding toward zero

Description

Floating-point Convert to Signed integer, rounding toward Zero (vector). This instruction converts a scalar or each element in a vector from a floating-point value to a signed integer value using the Round towards Zero rounding mode, and writes the result to the SIMD&FP destination register.

A64 Instruction

FCVTZS Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

v7/A32/A64
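
A plain C cast from `float` to `int32_t` also truncates toward zero, but it is undefined for NaN and out-of-range inputs, whereas the instruction returns zero for NaN and saturates values outside the destination range. The sketch below is a scalar model of one lane under those assumptions; `fcvtzs_lane` is a hypothetical name, and floating-point exception flags are not modelled.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical helper: scalar model of one lane of FCVTZS (32-bit). */
int32_t fcvtzs_lane(float x)
{
    if (isnan(x)) {
        return 0;                       /* NaN converts to zero  */
    }
    if (x >= 2147483648.0f) {
        return INT32_MAX;               /* saturate on overflow  */
    }
    if (x <= -2147483648.0f) {
        return INT32_MIN;               /* saturate on underflow */
    }
    return (int32_t)x;                  /* in range: truncate toward zero */
}
```

Calling `fcvtzs_lane(-2.7f)` gives -2, since truncation drops the fraction regardless of sign.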

int32x4_t vcvtq_s32_f32 (float32x4_t a)Floating-point convert to signed integer, rounding toward zero

Description

A64 Instruction

FCVTZS Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vcvt_u32_f32 (float32x2_t a)Floating-point convert to unsigned integer, rounding toward zero

Description

Floating-point Convert to Unsigned integer, rounding toward Zero (vector). This instruction converts a scalar or each element in a vector from a floating-point value to an unsigned integer value using the Round towards Zero rounding mode, and writes the result to the SIMD&FP destination register.

A64 Instruction

FCVTZU Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

v7/A32/A64
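
For the unsigned form, the destination range starts at zero, so every negative input saturates to zero. The following scalar sketch of one lane assumes that behaviour together with NaN converting to zero; `fcvtzu_lane` is a hypothetical name, and floating-point exception flags are not modelled.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical helper: scalar model of one lane of FCVTZU (32-bit). */
uint32_t fcvtzu_lane(float x)
{
    if (isnan(x)) {
        return 0;                       /* NaN converts to zero          */
    }
    if (x <= 0.0f) {
        return 0;                       /* negatives saturate to zero    */
    }
    if (x >= 4294967296.0f) {
        return UINT32_MAX;              /* saturate at the top of range  */
    }
    return (uint32_t)x;                 /* in range: truncate toward zero */
}
```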

uint32x4_t vcvtq_u32_f32 (float32x4_t a)Floating-point convert to unsigned integer, rounding toward zero

Description

A64 Instruction

FCVTZU Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vcvtn_s32_f32 (float32x2_t a)Floating-point convert to signed integer, rounding to nearest with ties to even

Description

Floating-point Convert to Signed integer, rounding to nearest with ties to even (vector). This instruction converts a scalar or each element in a vector from a floating-point value to a signed integer value using the Round to Nearest rounding mode, and writes the result to the SIMD&FP destination register.

A64 Instruction

FCVTNS Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A32/A64
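
Ties-to-even differs from the common "add 0.5 and truncate" idiom only on exact halves, where the result is the even neighbour. The sketch below models one lane in portable C without relying on the current floating-point rounding mode; `fcvtns_lane` is a hypothetical name, and NaN handling, saturation, and the absence of exception flags are assumptions of the model.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical helper: scalar model of one lane of FCVTNS (32-bit). */
int32_t fcvtns_lane(float x)
{
    if (isnan(x)) {
        return 0;
    }
    if (x >= 2147483648.0f) {
        return INT32_MAX;
    }
    if (x <= -2147483648.0f) {
        return INT32_MIN;
    }

    int32_t lo = (int32_t)x;            /* truncates toward zero */
    if ((float)lo > x) {
        lo -= 1;                        /* step down to floor(x) for negatives */
    }
    float frac = x - (float)lo;         /* distance above floor(x) */
    if (frac > 0.5f) {
        return lo + 1;
    }
    if (frac == 0.5f) {
        return (lo % 2 != 0) ? lo + 1 : lo;   /* tie: pick the even neighbour */
    }
    return lo;
}
```

Under this model 1.5 and 2.5 both convert to 2, and -1.5 and -2.5 both convert to -2.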

int32x4_t vcvtnq_s32_f32 (float32x4_t a)Floating-point convert to signed integer, rounding to nearest with ties to even

Description

A64 Instruction

FCVTNS Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A32/A64

uint32x2_t vcvtn_u32_f32 (float32x2_t a)Floating-point convert to unsigned integer, rounding to nearest with ties to even

Description

Floating-point Convert to Unsigned integer, rounding to nearest with ties to even (vector). This instruction converts a scalar or each element in a vector from a floating-point value to an unsigned integer value using the Round to Nearest rounding mode, and writes the result to the SIMD&FP destination register.

A64 Instruction

FCVTNU Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A32/A64

uint32x4_t vcvtnq_u32_f32 (float32x4_t a)Floating-point convert to unsigned integer, rounding to nearest with ties to even

Description

A64 Instruction

FCVTNU Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A32/A64

int32x2_t vcvtm_s32_f32 (float32x2_t a)Floating-point convert to signed integer, rounding toward minus infinity

Description

Floating-point Convert to Signed integer, rounding toward Minus infinity (vector). This instruction converts a scalar or each element in a vector from a floating-point value to a signed integer value using the Round towards Minus Infinity rounding mode, and writes the result to the SIMD&FP destination register.

A64 Instruction

FCVTMS Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A32/A64
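
Example

A minimal single-lane sketch of Round towards Minus Infinity for a signed 32-bit result, in portable C. `model_fcvtms_lane` and `fcvtms_examples` are hypothetical names, not part of `<arm_neon.h>`; exception flags are not modelled. On an AArch64 target the real operation would be `vcvtm_s32_f32`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one FCVTMS lane (not an Arm API):
 * round toward minus infinity (floor), then saturate to int32_t. */
int32_t model_fcvtms_lane(float x)
{
    if (x != x) return 0;                         /* NaN converts to 0 */
    double d = (double)x;                         /* exact widening */
    if (d >= 2147483647.0) return INT32_MAX;      /* saturate high */
    if (d <= -2147483648.0) return INT32_MIN;     /* saturate low */
    int64_t t = (int64_t)d;                       /* C cast truncates toward zero */
    if ((double)t > d)
        t--;                                      /* step down for negative fractions */
    return (int32_t)t;
}

/* Usage: negative fractions move away from zero. */
void fcvtms_examples(void)
{
    assert(model_fcvtms_lane(1.5f) == 1);
    assert(model_fcvtms_lane(-1.5f) == -2);
    assert(model_fcvtms_lane(-0.25f) == -1);
}
```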

int32x4_t vcvtmq_s32_f32 (float32x4_t a)Floating-point convert to signed integer, rounding toward minus infinity

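Description

Floating-point Convert to Signed integer, rounding toward Minus infinity (vector). This instruction converts a scalar or each element in a vector from a floating-point value to a signed integer value using the Round towards Minus Infinity rounding mode, and writes the result to the SIMD&FP destination register.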

A64 Instruction

FCVTMS Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A32/A64

uint32x2_t vcvtm_u32_f32 (float32x2_t a)Floating-point convert to unsigned integer, rounding toward minus infinity

Description

Floating-point Convert to Unsigned integer, rounding toward Minus infinity (vector). This instruction converts a scalar or each element in a vector from a floating-point value to an unsigned integer value using the Round towards Minus Infinity rounding mode, and writes the result to the SIMD&FP destination register.

A64 Instruction

FCVTMU Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A32/A64

uint32x4_t vcvtmq_u32_f32 (float32x4_t a)Floating-point convert to unsigned integer, rounding toward minus infinity

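Description

Floating-point Convert to Unsigned integer, rounding toward Minus infinity (vector). This instruction converts a scalar or each element in a vector from a floating-point value to an unsigned integer value using the Round towards Minus Infinity rounding mode, and writes the result to the SIMD&FP destination register.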

A64 Instruction

FCVTMU Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A32/A64

int32x2_t vcvtp_s32_f32 (float32x2_t a)Floating-point convert to signed integer, rounding toward plus infinity

Description

Floating-point Convert to Signed integer, rounding toward Plus infinity (vector). This instruction converts a scalar or each element in a vector from a floating-point value to a signed integer value using the Round towards Plus Infinity rounding mode, and writes the result to the SIMD&FP destination register.

A64 Instruction

FCVTPS Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A32/A64
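
Example

A minimal single-lane sketch of Round towards Plus Infinity for a signed 32-bit result, in portable C. `model_fcvtps_lane` and `fcvtps_examples` are hypothetical names, not part of `<arm_neon.h>`; exception flags are not modelled. On an AArch64 target the real operation would be `vcvtp_s32_f32`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one FCVTPS lane (not an Arm API):
 * round toward plus infinity (ceiling), then saturate to int32_t. */
int32_t model_fcvtps_lane(float x)
{
    if (x != x) return 0;                         /* NaN converts to 0 */
    double d = (double)x;                         /* exact widening */
    if (d >= 2147483647.0) return INT32_MAX;      /* saturate high */
    if (d <= -2147483648.0) return INT32_MIN;     /* saturate low */
    int64_t t = (int64_t)d;                       /* C cast truncates toward zero */
    if ((double)t < d)
        t++;                                      /* step up for positive fractions */
    return (int32_t)t;
}

/* Usage: positive fractions move away from zero. */
void fcvtps_examples(void)
{
    assert(model_fcvtps_lane(1.25f) == 2);
    assert(model_fcvtps_lane(-1.75f) == -1);
    assert(model_fcvtps_lane(-0.5f) == 0);
}
```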

int32x4_t vcvtpq_s32_f32 (float32x4_t a)Floating-point convert to signed integer, rounding toward plus infinity

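Description

Floating-point Convert to Signed integer, rounding toward Plus infinity (vector). This instruction converts a scalar or each element in a vector from a floating-point value to a signed integer value using the Round towards Plus Infinity rounding mode, and writes the result to the SIMD&FP destination register.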

A64 Instruction

FCVTPS Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A32/A64

uint32x2_t vcvtp_u32_f32 (float32x2_t a)Floating-point convert to unsigned integer, rounding toward plus infinity

Description

Floating-point Convert to Unsigned integer, rounding toward Plus infinity (vector). This instruction converts a scalar or each element in a vector from a floating-point value to an unsigned integer value using the Round towards Plus Infinity rounding mode, and writes the result to the SIMD&FP destination register.

A64 Instruction

FCVTPU Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A32/A64

uint32x4_t vcvtpq_u32_f32 (float32x4_t a)Floating-point convert to unsigned integer, rounding toward plus infinity

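Description

Floating-point Convert to Unsigned integer, rounding toward Plus infinity (vector). This instruction converts a scalar or each element in a vector from a floating-point value to an unsigned integer value using the Round towards Plus Infinity rounding mode, and writes the result to the SIMD&FP destination register.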

A64 Instruction

FCVTPU Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A32/A64

int32x2_t vcvta_s32_f32 (float32x2_t a)Floating-point convert to signed integer, rounding to nearest with ties to away

Description

Floating-point Convert to Signed integer, rounding to nearest with ties to Away (vector). This instruction converts each element in a vector from a floating-point value to a signed integer value using the Round to Nearest with Ties to Away rounding mode and writes the result to the SIMD&FP destination register.

A64 Instruction

FCVTAS Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A32/A64
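
Example

A minimal single-lane sketch of Round to Nearest with Ties to Away for a signed 32-bit result, in portable C. `model_fcvtas_lane` and `fcvtas_examples` are hypothetical names, not part of `<arm_neon.h>`; exception flags are not modelled. On an AArch64 target the real operation would be `vcvta_s32_f32`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one FCVTAS lane (not an Arm API):
 * round to nearest, ties away from zero, then saturate to int32_t. */
int32_t model_fcvtas_lane(float x)
{
    if (x != x) return 0;                         /* NaN converts to 0 */
    double d = (double)x;                         /* exact widening */
    if (d >= 2147483647.0) return INT32_MAX;      /* saturate high */
    if (d <= -2147483648.0) return INT32_MIN;     /* saturate low */
    double m = d < 0.0 ? -d : d;                  /* work on the magnitude */
    int64_t t = (int64_t)m;                       /* integer part of the magnitude */
    if (m - (double)t >= 0.5)
        t++;                                      /* a half or more rounds away from zero */
    return (int32_t)(d < 0.0 ? -t : t);
}

/* Usage: a tie rounds away from zero in both directions. */
void fcvtas_examples(void)
{
    assert(model_fcvtas_lane(2.5f) == 3);
    assert(model_fcvtas_lane(-2.5f) == -3);
    assert(model_fcvtas_lane(0.5f) == 1);
    assert(model_fcvtas_lane(2.4f) == 2);
}
```

Under ties to even, by contrast, 2.5 would convert to 2.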

int32x4_t vcvtaq_s32_f32 (float32x4_t a)Floating-point convert to signed integer, rounding to nearest with ties to away

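Description

Floating-point Convert to Signed integer, rounding to nearest with ties to Away (vector). This instruction converts each element in a vector from a floating-point value to a signed integer value using the Round to Nearest with Ties to Away rounding mode and writes the result to the SIMD&FP destination register.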

A64 Instruction

FCVTAS Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A32/A64

uint32x2_t vcvta_u32_f32 (float32x2_t a)Floating-point convert to unsigned integer, rounding to nearest with ties to away

Description

Floating-point Convert to Unsigned integer, rounding to nearest with ties to Away (vector). This instruction converts each element in a vector from a floating-point value to an unsigned integer value using the Round to Nearest with Ties to Away rounding mode and writes the result to the SIMD&FP destination register.

A64 Instruction

FCVTAU Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A32/A64

uint32x4_t vcvtaq_u32_f32 (float32x4_t a)Floating-point convert to unsigned integer, rounding to nearest with ties to away

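Description

Floating-point Convert to Unsigned integer, rounding to nearest with ties to Away (vector). This instruction converts each element in a vector from a floating-point value to an unsigned integer value using the Round to Nearest with Ties to Away rounding mode and writes the result to the SIMD&FP destination register.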

A64 Instruction

FCVTAU Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A32/A64

int32_t vcvts_s32_f32 (float32_t a)Floating-point convert to signed integer, rounding toward zero

Description

A64 Instruction

FCVTZS Sd,Sn

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint32_t vcvts_u32_f32 (float32_t a)Floating-point convert to unsigned integer, rounding toward zero

Description

A64 Instruction

FCVTZU Sd,Sn

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int32_t vcvtns_s32_f32 (float32_t a)Floating-point convert to signed integer, rounding to nearest with ties to even

Description

A64 Instruction

FCVTNS Sd,Sn

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint32_t vcvtns_u32_f32 (float32_t a)Floating-point convert to unsigned integer, rounding to nearest with ties to even

Description

A64 Instruction

FCVTNU Sd,Sn

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int32_t vcvtms_s32_f32 (float32_t a)Floating-point convert to signed integer, rounding toward minus infinity

Description

A64 Instruction

FCVTMS Sd,Sn

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint32_t vcvtms_u32_f32 (float32_t a)Floating-point convert to unsigned integer, rounding toward minus infinity

Description

A64 Instruction

FCVTMU Sd,Sn

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int32_t vcvtps_s32_f32 (float32_t a)Floating-point convert to signed integer, rounding toward plus infinity

Description

A64 Instruction

FCVTPS Sd,Sn

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint32_t vcvtps_u32_f32 (float32_t a)Floating-point convert to unsigned integer, rounding toward plus infinity

Description

A64 Instruction

FCVTPU Sd,Sn

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int32_t vcvtas_s32_f32 (float32_t a)Floating-point convert to signed integer, rounding to nearest with ties to away

Description

A64 Instruction

FCVTAS Sd,Sn

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint32_t vcvtas_u32_f32 (float32_t a)Floating-point convert to unsigned integer, rounding to nearest with ties to away

Description

A64 Instruction

FCVTAU Sd,Sn

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int64x1_t vcvt_s64_f64 (float64x1_t a)Floating-point convert to signed integer, rounding toward zero

Description

A64 Instruction

FCVTZS Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int64x2_t vcvtq_s64_f64 (float64x2_t a)Floating-point convert to signed integer, rounding toward zero

Description

A64 Instruction

FCVTZS Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint64x1_t vcvt_u64_f64 (float64x1_t a)Floating-point convert to unsigned integer, rounding toward zero

Description

A64 Instruction

FCVTZU Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint64x2_t vcvtq_u64_f64 (float64x2_t a)Floating-point convert to unsigned integer, rounding toward zero

Description

A64 Instruction

FCVTZU Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int64x1_t vcvtn_s64_f64 (float64x1_t a)Floating-point convert to signed integer, rounding to nearest with ties to even

Description

A64 Instruction

FCVTNS Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int64x2_t vcvtnq_s64_f64 (float64x2_t a)Floating-point convert to signed integer, rounding to nearest with ties to even

Description

A64 Instruction

FCVTNS Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint64x1_t vcvtn_u64_f64 (float64x1_t a)Floating-point convert to unsigned integer, rounding to nearest with ties to even

Description

A64 Instruction

FCVTNU Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint64x2_t vcvtnq_u64_f64 (float64x2_t a)Floating-point convert to unsigned integer, rounding to nearest with ties to even

Description

A64 Instruction

FCVTNU Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int64x1_t vcvtm_s64_f64 (float64x1_t a)Floating-point convert to signed integer, rounding toward minus infinity

Description

A64 Instruction

FCVTMS Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int64x2_t vcvtmq_s64_f64 (float64x2_t a)Floating-point convert to signed integer, rounding toward minus infinity

Description

A64 Instruction

FCVTMS Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint64x1_t vcvtm_u64_f64 (float64x1_t a)Floating-point convert to unsigned integer, rounding toward minus infinity

Description

A64 Instruction

FCVTMU Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint64x2_t vcvtmq_u64_f64 (float64x2_t a)Floating-point convert to unsigned integer, rounding toward minus infinity

Description

A64 Instruction

FCVTMU Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int64x1_t vcvtp_s64_f64 (float64x1_t a)Floating-point convert to signed integer, rounding toward plus infinity

Description

A64 Instruction

FCVTPS Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int64x2_t vcvtpq_s64_f64 (float64x2_t a)Floating-point convert to signed integer, rounding toward plus infinity

Description

A64 Instruction

FCVTPS Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint64x1_t vcvtp_u64_f64 (float64x1_t a)Floating-point convert to unsigned integer, rounding toward plus infinity

Description

A64 Instruction

FCVTPU Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint64x2_t vcvtpq_u64_f64 (float64x2_t a)Floating-point convert to unsigned integer, rounding toward plus infinity

Description

A64 Instruction

FCVTPU Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int64x1_t vcvta_s64_f64 (float64x1_t a)Floating-point convert to signed integer, rounding to nearest with ties to away

Description

A64 Instruction

FCVTAS Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int64x2_t vcvtaq_s64_f64 (float64x2_t a)Floating-point convert to signed integer, rounding to nearest with ties to away

Description

A64 Instruction

FCVTAS Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint64x1_t vcvta_u64_f64 (float64x1_t a)Floating-point convert to unsigned integer, rounding to nearest with ties to away

Description

A64 Instruction

FCVTAU Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint64x2_t vcvtaq_u64_f64 (float64x2_t a)Floating-point convert to unsigned integer, rounding to nearest with ties to away

Description

A64 Instruction

FCVTAU Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int64_t vcvtd_s64_f64 (float64_t a)Floating-point convert to signed integer, rounding toward zero

Description

A64 Instruction

FCVTZS Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint64_t vcvtd_u64_f64 (float64_t a)Floating-point convert to unsigned integer, rounding toward zero

Description

A64 Instruction

FCVTZU Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int64_t vcvtnd_s64_f64 (float64_t a)Floating-point convert to signed integer, rounding to nearest with ties to even

Description

A64 Instruction

FCVTNS Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint64_t vcvtnd_u64_f64 (float64_t a)Floating-point convert to unsigned integer, rounding to nearest with ties to even

Description

A64 Instruction

FCVTNU Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int64_t vcvtmd_s64_f64 (float64_t a)Floating-point convert to signed integer, rounding toward minus infinity

Description

A64 Instruction

FCVTMS Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint64_t vcvtmd_u64_f64 (float64_t a)Floating-point convert to unsigned integer, rounding toward minus infinity

Description

A64 Instruction

FCVTMU Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int64_t vcvtpd_s64_f64 (float64_t a)Floating-point convert to signed integer, rounding toward plus infinity

Description

A64 Instruction

FCVTPS Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint64_t vcvtpd_u64_f64 (float64_t a)Floating-point convert to unsigned integer, rounding toward plus infinity

Description

A64 Instruction

FCVTPU Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int64_t vcvtad_s64_f64 (float64_t a)Floating-point convert to signed integer, rounding to nearest with ties to away

Description

A64 Instruction

FCVTAS Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint64_t vcvtad_u64_f64 (float64_t a)Floating-point convert to unsigned integer, rounding to nearest with ties to away

Description

A64 Instruction

FCVTAU Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int32x2_t vcvt_n_s32_f32 (float32x2_t a, const int n)Floating-point convert to signed integer, rounding toward zero

Description

A64 Instruction

FCVTZS Vd.2S,Vn.2S,#n

Argument Preparation

a → Vn.2S 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

v7/A32/A64
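
Example

A minimal single-lane sketch of the fixed-point form, in portable C, assuming the `#n` immediate is the number of fractional bits in the result (1 <= n <= 32): the input is scaled by 2^n, rounded toward zero, and saturated. `model_fcvtzs_fixed_lane` is a hypothetical name, not part of `<arm_neon.h>`; exception flags are not modelled. On an AArch64 target the real operation would be `vcvt_n_s32_f32`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one FCVTZS #n lane (not an Arm API):
 * scale by 2^n, round toward zero, then saturate to int32_t. */
int32_t model_fcvtzs_fixed_lane(float x, int n)
{
    assert(n >= 1 && n <= 32);                        /* immediate range */
    if (x != x) return 0;                             /* NaN converts to 0 */
    double scaled = (double)x * (double)(1ULL << n);  /* exact: power-of-two scale */
    if (scaled >= 2147483647.0) return INT32_MAX;     /* saturate high */
    if (scaled <= -2147483648.0) return INT32_MIN;    /* saturate low */
    return (int32_t)scaled;                           /* C cast truncates toward zero */
}
```

With n = 8, the input 1.5 yields 384 (1.5 * 256), and -1.5 yields -384.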

int32x4_t vcvtq_n_s32_f32 (float32x4_t a, const int n)Floating-point convert to signed integer, rounding toward zero

Description

A64 Instruction

FCVTZS Vd.4S,Vn.4S,#n

Argument Preparation

a → Vn.4S 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vcvt_n_u32_f32 (float32x2_t a, const int n)Floating-point convert to unsigned integer, rounding toward zero

Description

A64 Instruction

FCVTZU Vd.2S,Vn.2S,#n

Argument Preparation

a → Vn.2S 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vcvtq_n_u32_f32 (float32x4_t a, const int n)Floating-point convert to unsigned integer, rounding toward zero

Description

A64 Instruction

FCVTZU Vd.4S,Vn.4S,#n

Argument Preparation

a → Vn.4S 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

v7/A32/A64

int32_t vcvts_n_s32_f32 (float32_t a, const int n)Floating-point convert to signed integer, rounding toward zero

Description

A64 Instruction

FCVTZS Sd,Sn,#n

Argument Preparation

a → Sn 

1 <= n <= 32

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint32_t vcvts_n_u32_f32 (float32_t a, const int n)Floating-point convert to unsigned integer, rounding toward zero

Description

A64 Instruction

FCVTZU Sd,Sn,#n

Argument Preparation

a → Sn 

1 <= n <= 32

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int64x1_t vcvt_n_s64_f64 (float64x1_t a, const int n)Floating-point convert to signed integer, rounding toward zero

Description

A64 Instruction

FCVTZS Dd,Dn,#n

Argument Preparation

a → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int64x2_t vcvtq_n_s64_f64 (float64x2_t a, const int n)Floating-point convert to signed integer, rounding toward zero

Description

A64 Instruction

FCVTZS Vd.2D,Vn.2D,#n

Argument Preparation

a → Vn.2D 

1 <= n <= 64

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint64x1_t vcvt_n_u64_f64 (float64x1_t a, const int n)Floating-point convert to unsigned integer, rounding toward zero

Description

A64 Instruction

FCVTZU Dd,Dn,#n

Argument Preparation

a → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint64x2_t vcvtq_n_u64_f64 (float64x2_t a, const int n)Floating-point convert to unsigned integer, rounding toward zero

Description

A64 Instruction

FCVTZU Vd.2D,Vn.2D,#n

Argument Preparation

a → Vn.2D 

1 <= n <= 64

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

int64_t vcvtd_n_s64_f64 (float64_t a, const int n)Floating-point convert to signed integer, rounding toward zero

Description

A64 Instruction

FCVTZS Dd,Dn,#n

Argument Preparation

a → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

uint64_t vcvtd_n_u64_f64 (float64_t a, const int n)Floating-point convert to unsigned integer, rounding toward zero

Description

A64 Instruction

FCVTZU Dd,Dn,#n

Argument Preparation

a → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPToFixed(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

float32x2_t vcvt_f32_s32 (int32x2_t a)Signed integer convert to floating-point

Description

Signed integer Convert to Floating-point (vector). This instruction converts each element in a vector from signed integer to floating-point using the rounding mode that is specified by the FPCR, and writes the result to the SIMD&FP destination register.

A64 Instruction

SCVTF Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

v7/A32/A64
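
A 32-bit integer can carry more significant bits than a float can hold, so the conversion rounds values above 2^24 in magnitude. The sketch below wraps `vcvt_f32_s32` in a small helper and is meant only as an illustration: `s32x2_to_f32x2` is a hypothetical name, and the scalar branch is a fallback for targets without NEON that assumes the default round-to-nearest mode.

```c
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Convert two int32 lanes to float. Uses vcvt_f32_s32 when NEON is available;
 * otherwise falls back to plain C conversions (assumed round-to-nearest-even). */
static void s32x2_to_f32x2(const int32_t in[2], float out[2])
{
#if defined(__ARM_NEON)
    int32x2_t v = vld1_s32(in);         /* load two 32-bit lanes */
    vst1_f32(out, vcvt_f32_s32(v));     /* SCVTF Vd.2S,Vn.2S */
#else
    for (int i = 0; i < 2; ++i)
        out[i] = (float)in[i];
#endif
}
```

For the input `{16777217, -3}` the first lane becomes 16777216.0f, because 2^24 + 1 is not representable in single precision, while -3 converts exactly.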

float32x4_t vcvtq_f32_s32 (int32x4_t a)Signed integer convert to floating-point

Description

A64 Instruction

SCVTF Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vcvt_f32_u32 (uint32x2_t a)Unsigned integer convert to floating-point

Description

Unsigned integer Convert to Floating-point (vector). This instruction converts each element in a vector from an unsigned integer value to a floating-point value using the rounding mode that is specified by the FPCR, and writes the result to the SIMD&FP destination register.

A64 Instruction

UCVTF Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

v7/A32/A64
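
The signed and unsigned conversions differ only in how the source bits are interpreted. The following scalar sketch contrasts the two for the same 32-bit pattern. The helper names are hypothetical stand-ins for one lane of UCVTF and SCVTF, and the plain C conversions assume the default round-to-nearest mode and a two's-complement `int32_t`.

```c
#include <stdint.h>
#include <string.h>

/* One lane of UCVTF: treat the bits as an unsigned integer. */
static float lane_ucvtf(uint32_t bits)
{
    return (float)bits;
}

/* One lane of SCVTF: reinterpret the same bits as a signed integer. */
static float lane_scvtf(uint32_t bits)
{
    int32_t s;
    memcpy(&s, &bits, sizeof s);    /* bit-for-bit reinterpretation */
    return (float)s;
}
```

The pattern `0xFFFFFFFF` converts to 4294967296.0f through the unsigned path (rounded up to 2^32) and to -1.0f through the signed path.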

float32x4_t vcvtq_f32_u32 (uint32x4_t a)Unsigned integer convert to floating-point

Description

A64 Instruction

UCVTF Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

v7/A32/A64

float32_t vcvts_f32_s32 (int32_t a)Signed integer convert to floating-point

Description

A64 Instruction

SCVTF Sd,Sn

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

float32_t vcvts_f32_u32 (uint32_t a)Unsigned integer convert to floating-point

Description

A64 Instruction

UCVTF Sd,Sn

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

float64x1_t vcvt_f64_s64 (int64x1_t a)Signed integer convert to floating-point

Description

A64 Instruction

SCVTF Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

float64x2_t vcvtq_f64_s64 (int64x2_t a)Signed integer convert to floating-point

Description

A64 Instruction

SCVTF Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

float64x1_t vcvt_f64_u64 (uint64x1_t a)Unsigned integer convert to floating-point

Description

A64 Instruction

UCVTF Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

float64x2_t vcvtq_f64_u64 (uint64x2_t a)Unsigned integer convert to floating-point

Description

A64 Instruction

UCVTF Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

float64_t vcvtd_f64_s64 (int64_t a)Signed integer convert to floating-point

Description

A64 Instruction

SCVTF Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

float64_t vcvtd_f64_u64 (uint64_t a)Unsigned integer convert to floating-point

Description

A64 Instruction

UCVTF Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, 0, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

float32x2_t vcvt_n_f32_s32 (int32x2_t a, const int n)Signed integer convert to floating-point

Description

A64 Instruction

SCVTF Vd.2S,Vn.2S,#n

Argument Preparation

a → Vn.2S 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

v7/A32/A64
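
In the fixed-point form, the integer source is read as having `n` fractional bits, so each lane is effectively divided by 2^n during the conversion. A scalar sketch of one lane is shown below. `model_scvtf_fixed_lane` is a hypothetical name, and the final narrowing to float is assumed to use round-to-nearest-even.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one lane of vcvt_n_f32_s32(a, n):
 * signed fixed-point with n fractional bits -> float. */
static float model_scvtf_fixed_lane(int32_t a, int n)
{
    assert(n >= 1 && n <= 32);      /* range given under Argument Preparation */
    /* int32 -> double and the division by 2^n are both exact, so only the
     * final narrowing to float can round. */
    return (float)((double)a / (double)(UINT64_C(1) << n));
}
```

For example, the raw value 384 with `n = 8` converts to 1.5f, and -7 with `n = 1` converts to -3.5f.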

float32x4_t vcvtq_n_f32_s32 (int32x4_t a, const int n)Signed integer convert to floating-point

Description

A64 Instruction

SCVTF Vd.4S,Vn.4S,#n

Argument Preparation

a → Vn.4S 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vcvt_n_f32_u32 (uint32x2_t a, const int n)Unsigned integer convert to floating-point

Description

A64 Instruction

UCVTF Vd.2S,Vn.2S,#n

Argument Preparation

a → Vn.2S 

1 <= n <= 32

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

v7/A32/A64

float32x4_t vcvtq_n_f32_u32 (uint32x4_t a, const int n)Unsigned integer convert to floating-point

Description

A64 Instruction

UCVTF Vd.4S,Vn.4S,#n

Argument Preparation

a → Vn.4S 

1 <= n <= 32

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

v7/A32/A64

float32_t vcvts_n_f32_s32 (int32_t a, const int n)Signed integer convert to floating-point

Description

A64 Instruction

SCVTF Sd,Sn,#n

Argument Preparation

a → Sn 

1 <= n <= 32

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

float32_t vcvts_n_f32_u32 (uint32_t a, const int n)Unsigned integer convert to floating-point

Description

A64 Instruction

UCVTF Sd,Sn,#n

Argument Preparation

a → Sn 

1 <= n <= 32

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

float64x1_t vcvt_n_f64_s64 (int64x1_t a, const int n)Signed integer convert to floating-point

Description

A64 Instruction

SCVTF Dd,Dn,#n

Argument Preparation

a → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

float64x2_t vcvtq_n_f64_s64 (int64x2_t a, const int n)Signed integer convert to floating-point

Description

A64 Instruction

SCVTF Vd.2D,Vn.2D,#n

Argument Preparation

a → Vn.2D 

1 <= n <= 64

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

float64x1_t vcvt_n_f64_u64 (uint64x1_t a, const int n)Unsigned integer convert to floating-point

Description

A64 Instruction

UCVTF Dd,Dn,#n

Argument Preparation

a → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

float64x2_t vcvtq_n_f64_u64 (uint64x2_t a, const int n)Unsigned integer convert to floating-point

Description

A64 Instruction

UCVTF Vd.2D,Vn.2D,#n

Argument Preparation

a → Vn.2D 

1 <= n <= 64

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

float64_t vcvtd_n_f64_s64 (int64_t a, const int n)Signed integer convert to floating-point

Description

A64 Instruction

SCVTF Dd,Dn,#n

Argument Preparation

a → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

float64_t vcvtd_n_f64_u64 (uint64_t a, const int n)Unsigned integer convert to floating-point

Description

A64 Instruction

UCVTF Dd,Dn,#n

Argument Preparation

a → Dn 

1 <= n <= 64

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
FPRounding rounding = FPRoundingMode(FPCR);
bits(esize) element;
for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FixedToFP(element, fracbits, unsigned, FPCR, rounding);

V[d] = result;

Supported architectures

A64

float16x4_t vcvt_f16_f32 (float32x4_t a)Floating-point convert to lower precision narrow

Description

Floating-point Convert to lower precision Narrow (vector). This instruction reads each vector element in the SIMD&FP source register, converts each result to half the precision of the source element, writes the final result to a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The destination vector elements are half as long as the source vector elements. The rounding mode is determined by the FPCR.

A64 Instruction

FCVTN Vd.4H,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = FPConvert(Elem[operand, e, 2*esize], FPCR);

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

float16x8_t vcvt_high_f16_f32 (float16x4_t r, float32x4_t a)Floating-point convert to lower precision narrow

Description

A64 Instruction

FCVTN2 Vd.8H,Vn.4S

Argument Preparation

r → Vd.4H 

a → Vn.4S

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = FPConvert(Elem[operand, e, 2*esize], FPCR);

Vpart[d, part] = result;

Supported architectures

A64

float32x2_t vcvt_f32_f64 (float64x2_t a)Floating-point convert to lower precision narrow

Description

A64 Instruction

FCVTN Vd.2S,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = FPConvert(Elem[operand, e, 2*esize], FPCR);

Vpart[d, part] = result;

Supported architectures

A64

float32x4_t vcvt_high_f32_f64 (float32x2_t r, float64x2_t a)Floating-point convert to lower precision narrow

Description

A64 Instruction

FCVTN2 Vd.4S,Vn.2D

Argument Preparation

r → Vd.2S 

a → Vn.2D

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = FPConvert(Elem[operand, e, 2*esize], FPCR);

Vpart[d, part] = result;

Supported architectures

A64
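
FCVTN fills the lower half of the destination and FCVTN2 fills the upper half while keeping the lower half supplied in `r`, so the two are normally used as a pair to narrow four doubles into one four-lane float vector. The scalar sketch below models that lane placement. `model_vcvt_high_f32_f64` is a hypothetical helper, and the plain C narrowing assumes inputs within float range and the default rounding mode.

```c
/* Hypothetical scalar model of vcvt_high_f32_f64(r, a):
 * lanes 0-1 are copied from r, lanes 2-3 are the narrowed elements of a. */
static void model_vcvt_high_f32_f64(const float r[2], const double a[2],
                                    float out[4])
{
    out[0] = r[0];              /* lower half is preserved as given */
    out[1] = r[1];
    out[2] = (float)a[0];       /* narrowing conversion, may round */
    out[3] = (float)a[1];
}
```

On AArch64 the corresponding intrinsic sequence would be `vcvt_high_f32_f64(vcvt_f32_f64(lo), hi)` for two `float64x2_t` inputs `lo` and `hi`.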

float32x4_t vcvt_f32_f16 (float16x4_t a)Floating-point convert to higher precision long

Description

Floating-point Convert to higher precision Long (vector). This instruction reads each element in a vector in the SIMD&FP source register, converts each value to double the precision of the source element using the rounding mode that is determined by the FPCR, and writes each result to the equivalent element of the vector in the SIMD&FP destination register.

A64 Instruction

FCVTL Vd.4S,Vn.4H

Argument Preparation

a → Vn.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(2*datasize) result;

for e = 0 to elements-1
    Elem[result, e, 2*esize] = FPConvert(Elem[operand, e, esize], FPCR);

V[d] = result;

Supported architectures

v7/A32/A64

float32x4_t vcvt_high_f32_f16 (float16x8_t a)Floating-point convert to higher precision long

Description

A64 Instruction

FCVTL2 Vd.4S,Vn.8H

Argument Preparation

a → Vn.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(2*datasize) result;

for e = 0 to elements-1
    Elem[result, e, 2*esize] = FPConvert(Elem[operand, e, esize], FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vcvt_f64_f32 (float32x2_t a)Floating-point convert to higher precision long

Description

A64 Instruction

FCVTL Vd.2D,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(2*datasize) result;

for e = 0 to elements-1
    Elem[result, e, 2*esize] = FPConvert(Elem[operand, e, esize], FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vcvt_high_f64_f32 (float32x4_t a)Floating-point convert to higher precision long

Description

A64 Instruction

FCVTL2 Vd.2D,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(2*datasize) result;

for e = 0 to elements-1
    Elem[result, e, 2*esize] = FPConvert(Elem[operand, e, esize], FPCR);

V[d] = result;

Supported architectures

A64
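
FCVTL reads the lower half of the source vector and FCVTL2 reads the upper half, so widening a full four-lane float vector takes one of each. A scalar sketch of the lane selection is given below. `model_widen_f32x4` is a hypothetical helper, and it relies on the fact that every float value is exactly representable as a double.

```c
/* Hypothetical scalar model of widening a float32x4_t into two float64x2_t. */
static void model_widen_f32x4(const float a[4], double lo[2], double hi[2])
{
    /* FCVTL on the lower half, e.g. vcvt_f64_f32(vget_low_f32(a)) */
    lo[0] = (double)a[0];
    lo[1] = (double)a[1];
    /* FCVTL2 on the upper half, e.g. vcvt_high_f64_f32(a) */
    hi[0] = (double)a[2];
    hi[1] = (double)a[3];
}
```

Because the widening is exact, converting the results back to float returns the original lanes.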

float32x2_t vcvtx_f32_f64 (float64x2_t a)Floating-point convert to lower precision narrow, rounding to odd

Description

Floating-point Convert to lower precision Narrow, rounding to odd (vector). This instruction reads each vector element in the source SIMD&FP register, narrows each value to half the precision of the source element using the Round to Odd rounding mode, writes the result to a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

FCVTXN Vd.2S,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = FPConvert(Elem[operand, e, 2*esize], FPCR, FPRounding_ODD);

Vpart[d, part] = result;

Supported architectures

A64
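
Round to Odd returns an exact result unchanged and otherwise picks whichever of the two neighbouring floats has an odd significand. The portable C sketch below models one lane under that definition. `model_fcvtxn` is a hypothetical helper, IEEE 754 formats and the default round-to-nearest mode are assumed for the plain C conversion, and clamping finite overflow to the largest finite float is an assumption about the instruction's overflow behaviour.

```c
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of one FCVTXN lane: double -> float, round to odd. */
static float model_fcvtxn(double d)
{
    if (d != d)
        return (float)d;                            /* NaN stays NaN */
    if (d > FLT_MAX)                                /* beyond the finite float range */
        return (d > DBL_MAX) ? (float)d : FLT_MAX;  /* +inf stays +inf, else clamp (assumption) */
    if (d < -FLT_MAX)
        return (d < -DBL_MAX) ? (float)d : -FLT_MAX;

    float f = (float)d;                             /* nearest float to d */
    if ((double)f == d)
        return f;                                   /* exact: nothing to round */

    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    if ((bits & 1u) == 0u) {
        /* f has an even significand, so the neighbour on the other side of d
         * is the odd one. Adjacent floats differ by 1 in their bit pattern. */
        double mag_f = (f < 0.0f) ? -(double)f : (double)f;
        double mag_d = (d < 0.0) ? -d : d;
        if (mag_f < mag_d)
            bits += 1u;                             /* step away from zero */
        else
            bits -= 1u;                             /* step toward zero */
    }
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

For the input `1.0 + 0x1p-30`, round-to-nearest would give 1.0f, whereas this model gives `1.0f + FLT_EPSILON`, the neighbour with an odd significand. Round to odd is typically used as the first step of a two-step narrowing so that a second rounding does not introduce a double-rounding error.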

float32_t vcvtxd_f32_f64 (float64_t a)Floating-point convert to lower precision narrow, rounding to odd

Description

A64 Instruction

FCVTXN Sd,Dn

Argument Preparation

a → Dn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = FPConvert(Elem[operand, e, 2*esize], FPCR, FPRounding_ODD);

Vpart[d, part] = result;

Supported architectures

A64

float32x4_t vcvtx_high_f32_f64 (float32x2_t r, float64x2_t a)Floating-point convert to lower precision narrow, rounding to odd

Description

A64 Instruction

FCVTXN2 Vd.4S,Vn.2D

Argument Preparation

r → Vd.2S 

a → Vn.2D

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = FPConvert(Elem[operand, e, 2*esize], FPCR, FPRounding_ODD);

Vpart[d, part] = result;

Supported architectures

A64

float32x2_t vrnd_f32 (float32x2_t a)Floating-point round to integral, toward zero

Description

Floating-point Round to Integral, toward Zero (vector). This instruction rounds a vector of floating-point values in the SIMD&FP source register to integral floating-point values of the same size using the Round towards Zero rounding mode, and writes the result to the SIMD&FP destination register.

A64 Instruction

FRINTZ Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A32/A64
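
The result of FRINTZ stays in floating-point format; only the fractional part is discarded. A scalar sketch of one lane in portable C follows. `model_frintz` is a hypothetical helper, and NaN quieting and exception flags are not modelled.

```c
#include <stdint.h>

/* Hypothetical scalar model of one FRINTZ lane (vrnd_f32): round toward zero. */
static float model_frintz(float x)
{
    if (x != x)
        return x;                               /* NaN passes through */
    if (x >= 8388608.0f || x <= -8388608.0f)
        return x;                               /* |x| >= 2^23 or infinite: already integral */
    float r = (float)(int32_t)x;                /* C conversion truncates toward zero */
    return (r == 0.0f && x < 0.0f) ? -0.0f : r; /* keep the sign of a zero result */
}
```

With this model, 2.7f becomes 2.0f and -2.7f becomes -2.0f, while large values such as 1e10f are returned unchanged.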

float32x4_t vrndq_f32 (float32x4_t a)Floating-point round to integral, toward zero

Description

A64 Instruction

FRINTZ Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A32/A64

float64x1_t vrnd_f64 (float64x1_t a)Floating-point round to integral, toward zero

Description

A64 Instruction

FRINTZ Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A64

float64x2_t vrndq_f64 (float64x2_t a)Floating-point round to integral, toward zero

Description

A64 Instruction

FRINTZ Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A64

float32x2_t vrndn_f32 (float32x2_t a)Floating-point round to integral, to nearest with ties to even

Description

Floating-point Round to Integral, to nearest with ties to even (vector). This instruction rounds a vector of floating-point values in the SIMD&FP source register to integral floating-point values of the same size using the Round to Nearest rounding mode, and writes the result to the SIMD&FP destination register.

A64 Instruction

FRINTN Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A32/A64
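
With ties to even, a value exactly halfway between two integers goes to the even one, so 0.5 rounds to 0 and 1.5 rounds to 2. The scalar sketch below models one FRINTN lane in portable C. `model_frintn` is a hypothetical helper, and NaN quieting and exception flags are not modelled.

```c
#include <stdint.h>

/* Hypothetical scalar model of one FRINTN lane (vrndn_f32):
 * round to nearest, ties to even. */
static float model_frintn(float x)
{
    if (x != x || x == 0.0f || x >= 8388608.0f || x <= -8388608.0f)
        return x;                       /* NaN, zeros, infinities, |x| >= 2^23 */
    float m = (x < 0.0f) ? -x : x;      /* work on the magnitude */
    int32_t t = (int32_t)m;             /* integer part */
    float frac = m - (float)t;          /* fractional part, exact for m < 2^23 */
    if (frac > 0.5f || (frac == 0.5f && (t & 1)))
        t += 1;                         /* on a tie, round up only when t is odd */
    float r = (float)t;
    return (x < 0.0f) ? -r : r;         /* restore the sign, including -0.0 */
}
```

Under this model 2.5f rounds to 2.0f and -1.5f rounds to -2.0f, while 2.6f rounds to 3.0f.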

float32x4_t vrndnq_f32 (float32x4_t a)Floating-point round to integral, to nearest with ties to even

Description

A64 Instruction

FRINTN Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A32/A64

float64x1_t vrndn_f64 (float64x1_t a)Floating-point round to integral, to nearest with ties to even

Description

A64 Instruction

FRINTN Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A32/A64

float64x2_t vrndnq_f64 (float64x2_t a)Floating-point round to integral, to nearest with ties to even

Description

A64 Instruction

FRINTN Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A32/A64

float32_t vrndns_f32 (float32_t a)Floating-point round to integral, to nearest with ties to even

Description

A64 Instruction

FRINTN Sd,Sn

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A32/A64

float32x2_t vrndm_f32 (float32x2_t a)Floating-point round to integral, toward minus infinity

Description

Floating-point Round to Integral, toward Minus infinity (vector). This instruction rounds a vector of floating-point values in the SIMD&FP source register to integral floating-point values of the same size using the Round towards Minus Infinity rounding mode, and writes the result to the SIMD&FP destination register.

A64 Instruction

FRINTM Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A32/A64
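
Rounding toward minus infinity differs from truncation only for negative non-integers, which move one step further from zero. A scalar sketch of one FRINTM lane follows. `model_frintm` is a hypothetical helper, and NaN quieting and exception flags are not modelled.

```c
#include <stdint.h>

/* Hypothetical scalar model of one FRINTM lane (vrndm_f32):
 * round toward minus infinity. */
static float model_frintm(float x)
{
    if (x != x || x == 0.0f || x >= 8388608.0f || x <= -8388608.0f)
        return x;                       /* NaN, zeros, infinities, |x| >= 2^23 */
    float r = (float)(int32_t)x;        /* truncate toward zero first */
    if (r > x)
        r -= 1.0f;                      /* negative non-integers step down by one */
    return r;
}
```

Under this model 2.7f becomes 2.0f, -2.1f becomes -3.0f, and -2.0f is returned unchanged.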

float32x4_t vrndmq_f32 (float32x4_t a)Floating-point round to integral, toward minus infinity

Description

A64 Instruction

FRINTM Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A32/A64

float64x1_t vrndm_f64 (float64x1_t a)Floating-point round to integral, toward minus infinity

Description

A64 Instruction

FRINTM Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A64

float64x2_t vrndmq_f64 (float64x2_t a)Floating-point round to integral, toward minus infinity

Description

A64 Instruction

FRINTM Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A64

float32x2_t vrndp_f32 (float32x2_t a)Floating-point round to integral, toward plus infinity

Description

Floating-point Round to Integral, toward Plus infinity (vector). This instruction rounds a vector of floating-point values in the SIMD&FP source register to integral floating-point values of the same size using the Round towards Plus Infinity rounding mode, and writes the result to the SIMD&FP destination register.

A64 Instruction

FRINTP Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A32/A64
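
Rounding toward plus infinity differs from truncation only for positive non-integers, which move up to the next integer. A scalar sketch of one FRINTP lane follows. `model_frintp` is a hypothetical helper, and NaN quieting and exception flags are not modelled.

```c
#include <stdint.h>

/* Hypothetical scalar model of one FRINTP lane (vrndp_f32):
 * round toward plus infinity. */
static float model_frintp(float x)
{
    if (x != x || x == 0.0f || x >= 8388608.0f || x <= -8388608.0f)
        return x;                               /* NaN, zeros, infinities, |x| >= 2^23 */
    float r = (float)(int32_t)x;                /* truncate toward zero first */
    if (r < x)
        r += 1.0f;                              /* positive non-integers step up by one */
    return (r == 0.0f && x < 0.0f) ? -0.0f : r; /* (-1, 0) rounds to -0.0 */
}
```

Under this model 2.1f becomes 3.0f, -2.7f becomes -2.0f, and 2.0f is returned unchanged.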

float32x4_t vrndpq_f32 (float32x4_t a)Floating-point round to integral, toward plus infinity

Description

A64 Instruction

FRINTP Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A32/A64

float64x1_t vrndp_f64 (float64x1_t a)Floating-point round to integral, toward plus infinity

Description

A64 Instruction

FRINTP Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A64

float64x2_t vrndpq_f64 (float64x2_t a)Floating-point round to integral, toward plus infinity

Description

A64 Instruction

FRINTP Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A64

float32x2_t vrnda_f32 (float32x2_t a)Floating-point round to integral, to nearest with ties to away

Description

Floating-point Round to Integral, to nearest with ties to Away (vector). This instruction rounds a vector of floating-point values in the SIMD&FP source register to integral floating-point values of the same size using the Round to Nearest with Ties to Away rounding mode, and writes the result to the SIMD&FP destination register.

A64 Instruction

FRINTA Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A32/A64

float32x4_t vrndaq_f32 (float32x4_t a)Floating-point round to integral, to nearest with ties to away

Description

A64 Instruction

FRINTA Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A32/A64

float64x1_t vrnda_f64 (float64x1_t a)Floating-point round to integral, to nearest with ties to away

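Description

Floating-point Round to Integral, to nearest with ties to Away (scalar). This instruction rounds a floating-point value in the SIMD&FP source register to an integral floating-point value of the same size using the Round to Nearest with Ties to Away rounding mode, and writes the result to the SIMD&FP destination register.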

A64 Instruction

FRINTA Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A64

float64x2_t vrndaq_f64 (float64x2_t a)Floating-point round to integral, to nearest with ties to away

Description

A64 Instruction

FRINTA Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A64

float32x2_t vrndi_f32 (float32x2_t a)Floating-point round to integral, using current rounding mode

Description

Floating-point Round to Integral, using current rounding mode (vector). This instruction rounds a vector of floating-point values in the SIMD&FP source register to integral floating-point values of the same size using the rounding mode that is determined by the FPCR, and writes the result to the SIMD&FP destination register.

A64 Instruction

FRINTI Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A32/A64

float32x4_t vrndiq_f32 (float32x4_t a)Floating-point round to integral, using current rounding mode

Description

A64 Instruction

FRINTI Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A32/A64

float64x1_t vrndi_f64 (float64x1_t a)Floating-point round to integral, using current rounding mode

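Description

Floating-point Round to Integral, using current rounding mode (scalar). This instruction rounds a floating-point value in the SIMD&FP source register to an integral floating-point value of the same size using the rounding mode that is determined by the FPCR, and writes the result to the SIMD&FP destination register.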

A64 Instruction

FRINTI Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A64

float64x2_t vrndiq_f64 (float64x2_t a)Floating-point round to integral, using current rounding mode

Description

A64 Instruction

FRINTI Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A64

float32x2_t vrndx_f32 (float32x2_t a)Floating-point round to integral exact, using current rounding mode

Description

Floating-point Round to Integral exact, using current rounding mode (vector). This instruction rounds a vector of floating-point values in the SIMD&FP source register to integral floating-point values of the same size using the rounding mode that is determined by the FPCR, and writes the result to the SIMD&FP destination register. Unlike FRINTI, it raises an Inexact exception when a result value is not numerically equal to the corresponding input value.

A64 Instruction

FRINTX Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A32/A64

float32x4_t vrndxq_f32 (float32x4_t a)Floating-point round to integral exact, using current rounding mode

Description

A64 Instruction

FRINTX Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A32/A64

float64x1_t vrndx_f64 (float64x1_t a)Floating-point round to integral exact, using current rounding mode

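Description

Floating-point Round to Integral exact, using current rounding mode (scalar). This instruction rounds a floating-point value in the SIMD&FP source register to an integral floating-point value of the same size using the rounding mode that is determined by the FPCR, and writes the result to the SIMD&FP destination register. When the result value is not numerically equal to the input value, an Inexact exception is raised.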

A64 Instruction

FRINTX Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A64

float64x2_t vrndxq_f64 (float64x2_t a)Floating-point round to integral exact, using current rounding mode

Description

A64 Instruction

FRINTX Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRoundInt(element, FPCR, rounding, exact);

V[d] = result;

Supported architectures

A64

int8x8_t vmovn_s16 (int16x8_t a)Extract narrow

Description

Extract Narrow. This instruction reads each vector element from the source SIMD&FP register, narrows each value to half the original width, places the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The destination vector elements are half as long as the source vector elements.

A64 Instruction

XTN Vd.8B,Vn.8H

Argument Preparation

a → Vn.8H

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    Elem[result, e, esize] = element<esize-1:0>;
Vpart[d, part] = result;

Supported architectures

v7/A32/A64
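
The narrowing performed by vmovn_s16 is a plain truncation: only the low 8 bits of each 16-bit element are kept, with no saturation. The sketch below shows this with a hypothetical helper, narrow_s16x8, and a scalar fallback for builds without NEON.

```c
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: keep the low 8 bits of each of eight 16-bit lanes. */
void narrow_s16x8(const int16_t in[8], int8_t out[8])
{
#if defined(__ARM_NEON)
    vst1_s8(out, vmovn_s16(vld1q_s16(in)));   /* XTN Vd.8B,Vn.8H on A64 */
#else
    for (int e = 0; e < 8; ++e)
        /* Reduce modulo 256, then reinterpret as signed (two's complement
         * wrap, as on all mainstream compilers). */
        out[e] = (int8_t)(uint8_t)in[e];
#endif
}
```

For example, 300 (0x012C) narrows to 44 (0x2C) and 128 (0x0080) narrows to -128, so values outside the int8_t range wrap rather than clamp.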

int16x4_t vmovn_s32 (int32x4_t a)Extract narrow

Description

A64 Instruction

XTN Vd.4H,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    Elem[result, e, esize] = element<esize-1:0>;
Vpart[d, part] = result;

Supported architectures

v7/A32/A64

int32x2_t vmovn_s64 (int64x2_t a)Extract narrow

Description

A64 Instruction

XTN Vd.2S,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    Elem[result, e, esize] = element<esize-1:0>;
Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint8x8_t vmovn_u16 (uint16x8_t a)Extract narrow

Description

A64 Instruction

XTN Vd.8B,Vn.8H

Argument Preparation

a → Vn.8H

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    Elem[result, e, esize] = element<esize-1:0>;
Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint16x4_t vmovn_u32 (uint32x4_t a)Extract narrow

Description

A64 Instruction

XTN Vd.4H,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    Elem[result, e, esize] = element<esize-1:0>;
Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint32x2_t vmovn_u64 (uint64x2_t a)Extract narrow

Description

A64 Instruction

XTN Vd.2S,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    Elem[result, e, esize] = element<esize-1:0>;
Vpart[d, part] = result;

Supported architectures

v7/A32/A64

int8x16_t vmovn_high_s16 (int8x8_t r, int16x8_t a)Extract narrow

Description

A64 Instruction

XTN2 Vd.16B,Vn.8H

Argument Preparation

r → Vd.8B 

a → Vn.8H

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    Elem[result, e, esize] = element<esize-1:0>;
Vpart[d, part] = result;

Supported architectures

A64
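
A common pattern is to pair vmovn_s16 with vmovn_high_s16 so that two 8-element source vectors are narrowed into one 16-element result: the first call fills the lower half and the second keeps that half (argument r) while writing the upper half. The sketch below uses a hypothetical helper, narrow_pair_s16, with a scalar fallback for non-AArch64 builds.

```c
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: truncate 16 halfwords (two groups of 8) to 16 bytes. */
void narrow_pair_s16(const int16_t lo[8], const int16_t hi[8], int8_t out[16])
{
#if defined(__aarch64__)
    int8x8_t  low = vmovn_s16(vld1q_s16(lo));             /* XTN  Vd.8B,Vn.8H  */
    int8x16_t all = vmovn_high_s16(low, vld1q_s16(hi));   /* XTN2 Vd.16B,Vn.8H */
    vst1q_s8(out, all);
#else
    for (int e = 0; e < 8; ++e) {
        out[e]     = (int8_t)(uint8_t)lo[e];   /* lower half comes from lo */
        out[e + 8] = (int8_t)(uint8_t)hi[e];   /* upper half comes from hi */
    }
#endif
}
```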

int16x8_t vmovn_high_s32 (int16x4_t r, int32x4_t a)Extract narrow

Description

A64 Instruction

XTN2 Vd.8H,Vn.4S

Argument Preparation

r → Vd.4H 

a → Vn.4S

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    Elem[result, e, esize] = element<esize-1:0>;
Vpart[d, part] = result;

Supported architectures

A64

int32x4_t vmovn_high_s64 (int32x2_t r, int64x2_t a)Extract narrow

Description

A64 Instruction

XTN2 Vd.4S,Vn.2D

Argument Preparation

r → Vd.2S 

a → Vn.2D

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    Elem[result, e, esize] = element<esize-1:0>;
Vpart[d, part] = result;

Supported architectures

A64

uint8x16_t vmovn_high_u16 (uint8x8_t r, uint16x8_t a)Extract narrow

Description

A64 Instruction

XTN2 Vd.16B,Vn.8H

Argument Preparation

r → Vd.8B 

a → Vn.8H

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    Elem[result, e, esize] = element<esize-1:0>;
Vpart[d, part] = result;

Supported architectures

A64

uint16x8_t vmovn_high_u32 (uint16x4_t r, uint32x4_t a)Extract narrow

Description

A64 Instruction

XTN2 Vd.8H,Vn.4S

Argument Preparation

r → Vd.4H 

a → Vn.4S

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    Elem[result, e, esize] = element<esize-1:0>;
Vpart[d, part] = result;

Supported architectures

A64

uint32x4_t vmovn_high_u64 (uint32x2_t r, uint64x2_t a)Extract narrow

Description

A64 Instruction

XTN2 Vd.4S,Vn.2D

Argument Preparation

r → Vd.2S 

a → Vn.2D

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    Elem[result, e, esize] = element<esize-1:0>;
Vpart[d, part] = result;

Supported architectures

A64

int16x8_t vmovl_s8 (int8x8_t a)Signed shift left long

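Description

Signed Shift Left Long (immediate). This instruction reads each vector element from the source SIMD&FP register, left shifts each element by the specified shift amount, places the result into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements. All the values in this instruction are signed integer values. With the shift amount of #0 used by this intrinsic, the operation is a plain sign extension of each element.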

A64 Instruction

SSHLL Vd.8H,Vn.8B,#0

Argument Preparation

a → Vn.8B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
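
Because the shift amount is #0, vmovl_s8 simply sign-extends each 8-bit element to 16 bits. The sketch below shows this with a hypothetical helper, widen_s8x8, and a scalar fallback for builds without NEON.

```c
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: sign-extend eight int8_t lanes to int16_t. */
void widen_s8x8(const int8_t in[8], int16_t out[8])
{
#if defined(__ARM_NEON)
    vst1q_s16(out, vmovl_s8(vld1_s8(in)));   /* SSHLL Vd.8H,Vn.8B,#0 on A64 */
#else
    for (int e = 0; e < 8; ++e)
        out[e] = (int16_t)in[e];             /* value-preserving sign extension */
#endif
}
```

Every input value is preserved: -128 stays -128 and 127 stays 127, only the element width changes.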

int32x4_t vmovl_s16 (int16x4_t a)Signed shift left long

Description

A64 Instruction

SSHLL Vd.4S,Vn.4H,#0

Argument Preparation

a → Vn.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vmovl_s32 (int32x2_t a)Signed shift left long

Description

A64 Instruction

SSHLL Vd.2D,Vn.2S,#0

Argument Preparation

a → Vn.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vmovl_u8 (uint8x8_t a)Unsigned shift left long

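Description

Unsigned Shift Left Long (immediate). This instruction reads each vector element from the source SIMD&FP register, shifts the unsigned integer value left by the specified number of bits, places the result into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements. With the shift amount of #0 used by this intrinsic, the operation is a plain zero extension of each element.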

A64 Instruction

USHLL Vd.8H,Vn.8B,#0

Argument Preparation

a → Vn.8B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vmovl_u16 (uint16x4_t a)Unsigned shift left long

Description

A64 Instruction

USHLL Vd.4S,Vn.4H,#0

Argument Preparation

a → Vn.4H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vmovl_u32 (uint32x2_t a)Unsigned shift left long

Description

A64 Instruction

USHLL Vd.2D,Vn.2S,#0

Argument Preparation

a → Vn.2S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vmovl_high_s8 (int8x16_t a)Signed shift left long

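Description

Signed Shift Left Long (immediate). This instruction reads each vector element from the upper half of the source SIMD&FP register, left shifts each element by the specified shift amount, places the result into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements. All the values in this instruction are signed integer values. With the shift amount of #0 used by this intrinsic, the operation is a plain sign extension of each element.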

A64 Instruction

SSHLL2 Vd.8H,Vn.16B,#0

Argument Preparation

a → Vn.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int32x4_t vmovl_high_s16 (int16x8_t a)Signed shift left long

Description

A64 Instruction

SSHLL2 Vd.4S,Vn.8H,#0

Argument Preparation

a → Vn.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int64x2_t vmovl_high_s32 (int32x4_t a)Signed shift left long

Description

A64 Instruction

SSHLL2 Vd.2D,Vn.4S,#0

Argument Preparation

a → Vn.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint16x8_t vmovl_high_u8 (uint8x16_t a)Unsigned shift left long

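Description

Unsigned Shift Left Long (immediate). This instruction reads each vector element from the upper half of the source SIMD&FP register, shifts the unsigned integer value left by the specified number of bits, places the result into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements. With the shift amount of #0 used by this intrinsic, the operation is a plain zero extension of each element.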

A64 Instruction

USHLL2 Vd.8H,Vn.16B,#0

Argument Preparation

a → Vn.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

A64
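
To widen a full 16-byte vector, vmovl_u8 can handle the lower eight bytes and vmovl_high_u8 the upper eight, which avoids a separate extraction of the high half. The sketch below uses a hypothetical helper, widen_u8x16, with a scalar fallback for non-AArch64 builds.

```c
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: zero-extend sixteen bytes to sixteen halfwords. */
void widen_u8x16(const uint8_t in[16], uint16_t out[16])
{
#if defined(__aarch64__)
    uint8x16_t v = vld1q_u8(in);
    vst1q_u16(out,     vmovl_u8(vget_low_u8(v)));   /* USHLL  Vd.8H,Vn.8B,#0  */
    vst1q_u16(out + 8, vmovl_high_u8(v));           /* USHLL2 Vd.8H,Vn.16B,#0 */
#else
    for (int e = 0; e < 16; ++e)
        out[e] = in[e];                             /* zero extension */
#endif
}
```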

uint32x4_t vmovl_high_u16 (uint16x8_t a)Unsigned shift left long

Description

A64 Instruction

USHLL2 Vd.4S,Vn.8H,#0

Argument Preparation

a → Vn.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint64x2_t vmovl_high_u32 (uint32x4_t a)Unsigned shift left long

Description

A64 Instruction

USHLL2 Vd.2D,Vn.4S,#0

Argument Preparation

a → Vn.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = Vpart[n, part];
bits(datasize*2) result;
integer element;

for e = 0 to elements-1
    element = Int(Elem[operand, e, esize], unsigned) << shift;
    Elem[result, e, 2*esize] = element<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int8x8_t vqmovn_s16 (int16x8_t a)Signed saturating extract narrow

Description

Signed saturating extract Narrow. This instruction reads each vector element from the source SIMD&FP register, saturates the value to half the original width, places the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The destination vector elements are half as long as the source vector elements. All the values in this instruction are signed integer values.

A64 Instruction

SQXTN Vd.8B,Vn.8H

Argument Preparation

a → Vn.8H

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = SatQ(Int(element, unsigned), esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64
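
In contrast to the truncating vmovn_s16, vqmovn_s16 clamps each element to the int8_t range. The sketch below shows this with a hypothetical helper, sat_narrow_s16x8, and a scalar fallback for builds without NEON. The fallback models only the result values, not the FPSR.QC flag.

```c
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: saturate eight int16_t lanes to int8_t. */
void sat_narrow_s16x8(const int16_t in[8], int8_t out[8])
{
#if defined(__ARM_NEON)
    vst1_s8(out, vqmovn_s16(vld1q_s16(in)));   /* SQXTN Vd.8B,Vn.8H on A64 */
#else
    for (int e = 0; e < 8; ++e) {
        int16_t v = in[e];
        if (v > INT8_MAX) v = INT8_MAX;        /* clamp at +127 */
        if (v < INT8_MIN) v = INT8_MIN;        /* clamp at -128 */
        out[e] = (int8_t)v;
    }
#endif
}
```

For example, 300 becomes 127 and -129 becomes -128, while values already in range pass through unchanged.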

int16x4_t vqmovn_s32 (int32x4_t a)Signed saturating extract narrow

Description

A64 Instruction

SQXTN Vd.4H,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = SatQ(Int(element, unsigned), esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

int32x2_t vqmovn_s64 (int64x2_t a)Signed saturating extract narrow

Description

A64 Instruction

SQXTN Vd.2S,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = SatQ(Int(element, unsigned), esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint8x8_t vqmovn_u16 (uint16x8_t a)Unsigned saturating extract narrow

Description

Unsigned saturating extract Narrow. This instruction reads each vector element from the source SIMD&FP register, saturates each value to half the original width, places the result into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are unsigned integer values.

A64 Instruction

UQXTN Vd.8B,Vn.8H

Argument Preparation

a → Vn.8H

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = SatQ(Int(element, unsigned), esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64
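
For unsigned data, vqmovn_u16 clamps each 16-bit element at 255. The sketch below uses a hypothetical helper, sat_narrow_u16x8, with a scalar fallback for builds without NEON that models only the result values.

```c
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: saturate eight uint16_t lanes to uint8_t. */
void sat_narrow_u16x8(const uint16_t in[8], uint8_t out[8])
{
#if defined(__ARM_NEON)
    vst1_u8(out, vqmovn_u16(vld1q_u16(in)));   /* UQXTN Vd.8B,Vn.8H on A64 */
#else
    for (int e = 0; e < 8; ++e)
        out[e] = (uint8_t)(in[e] > UINT8_MAX ? UINT8_MAX : in[e]);
#endif
}
```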

uint16x4_t vqmovn_u32 (uint32x4_t a)Unsigned saturating extract narrow

Description

A64 Instruction

UQXTN Vd.4H,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = SatQ(Int(element, unsigned), esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint32x2_t vqmovn_u64 (uint64x2_t a)Unsigned saturating extract narrow

Description

A64 Instruction

UQXTN Vd.2S,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = SatQ(Int(element, unsigned), esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

int8_t vqmovnh_s16 (int16_t a)Signed saturating extract narrow

Description

A64 Instruction

SQXTN Bd,Hn

Argument Preparation

a → Hn

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = SatQ(Int(element, unsigned), esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64
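
The scalar form operates on a single value rather than a vector. A minimal sketch, using a hypothetical wrapper sat_narrow_s16 that calls vqmovnh_s16 on AArch64 and an equivalent clamp elsewhere:

```c
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical wrapper: saturate one int16_t value to int8_t. */
int8_t sat_narrow_s16(int16_t a)
{
#if defined(__aarch64__)
    return vqmovnh_s16(a);                 /* SQXTN Bd,Hn */
#else
    if (a > INT8_MAX) return INT8_MAX;     /* clamp at +127 */
    if (a < INT8_MIN) return INT8_MIN;     /* clamp at -128 */
    return (int8_t)a;
#endif
}
```

Here sat_narrow_s16(200) returns 127, sat_narrow_s16(-200) returns -128, and sat_narrow_s16(5) returns 5.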

int16_t vqmovns_s32 (int32_t a)Signed saturating extract narrow

Description

A64 Instruction

SQXTN Hd,Sn

Argument Preparation

a → Sn

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = SatQ(Int(element, unsigned), esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

int32_t vqmovnd_s64 (int64_t a)Signed saturating extract narrow

Description

A64 Instruction

SQXTN Sd,Dn

Argument Preparation

a → Dn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = SatQ(Int(element, unsigned), esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint8_t vqmovnh_u16 (uint16_t a)Unsigned saturating extract narrow

Description

A64 Instruction

UQXTN Bd,Hn

Argument Preparation

a → Hn

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = SatQ(Int(element, unsigned), esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint16_t vqmovns_u32 (uint32_t a)Unsigned saturating extract narrow

Description

A64 Instruction

UQXTN Hd,Sn

Argument Preparation

a → Sn

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = SatQ(Int(element, unsigned), esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint32_t vqmovnd_u64 (uint64_t a)Unsigned saturating extract narrow

Description

A64 Instruction

UQXTN Sd,Dn

Argument Preparation

a → Dn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = SatQ(Int(element, unsigned), esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

int8x16_t vqmovn_high_s16 (int8x8_t r, int16x8_t a)Signed saturating extract narrow

Description

A64 Instruction

SQXTN2 Vd.16B,Vn.8H

Argument Preparation

r → Vd.8B 

a → Vn.8H

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = SatQ(Int(element, unsigned), esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

int16x8_t vqmovn_high_s32 (int16x4_t r, int32x4_t a)Signed saturating extract narrow

Description

A64 Instruction

SQXTN2 Vd.8H,Vn.4S

Argument Preparation

r → Vd.4H 

a → Vn.4S

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = SatQ(Int(element, unsigned), esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

int32x4_t vqmovn_high_s64 (int32x2_t r, int64x2_t a)Signed saturating extract narrow

Description

A64 Instruction

SQXTN2 Vd.4S,Vn.2D

Argument Preparation

r → Vd.2S 

a → Vn.2D

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = SatQ(Int(element, unsigned), esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint8x16_t vqmovn_high_u16 (uint8x8_t r, uint16x8_t a)Unsigned saturating extract narrow

Description

A64 Instruction

UQXTN2 Vd.16B,Vn.8H

Argument Preparation

r → Vd.8B 

a → Vn.8H

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = SatQ(Int(element, unsigned), esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint16x8_t vqmovn_high_u32 (uint16x4_t r, uint32x4_t a)Unsigned saturating extract narrow

Description

A64 Instruction

UQXTN2 Vd.8H,Vn.4S

Argument Preparation

r → Vd.4H 

a → Vn.4S

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = SatQ(Int(element, unsigned), esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint32x4_t vqmovn_high_u64 (uint32x2_t r, uint64x2_t a)Unsigned saturating extract narrow

Description

A64 Instruction

UQXTN2 Vd.4S,Vn.2D

Argument Preparation

r → Vd.2S 

a → Vn.2D

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = SatQ(Int(element, unsigned), esize, unsigned);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint8x8_t vqmovun_s16 (int16x8_t a)Signed saturating extract unsigned narrow

Description

Signed saturating extract Unsigned Narrow. This instruction reads each signed integer value in the vector of the source SIMD&FP register, saturates the value to an unsigned integer value that is half the original width, places the result into a vector, and writes the vector to the lower or upper half of the destination SIMD&FP register. The destination vector elements are half as long as the source vector elements.

A64 Instruction

SQXTUN Vd.8B,Vn.8H

Argument Preparation

a → Vn.8H

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = UnsignedSatQ(SInt(element), esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64
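
vqmovun_s16 is useful when signed intermediate results must be stored as unsigned bytes, for example pixel values: negative inputs clamp to 0 and inputs above 255 clamp to 255. The sketch below uses a hypothetical helper, sat_narrow_s16_to_u8x8, with a scalar fallback for builds without NEON that models only the result values.

```c
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: saturate eight int16_t lanes to uint8_t. */
void sat_narrow_s16_to_u8x8(const int16_t in[8], uint8_t out[8])
{
#if defined(__ARM_NEON)
    vst1_u8(out, vqmovun_s16(vld1q_s16(in)));   /* SQXTUN Vd.8B,Vn.8H on A64 */
#else
    for (int e = 0; e < 8; ++e) {
        int16_t v = in[e];
        if (v < 0) v = 0;                        /* negative values clamp to 0 */
        if (v > UINT8_MAX) v = UINT8_MAX;        /* large values clamp to 255  */
        out[e] = (uint8_t)v;
    }
#endif
}
```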

uint16x4_t vqmovun_s32 (int32x4_t a)Signed saturating extract unsigned narrow

Description

A64 Instruction

SQXTUN Vd.4H,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = UnsignedSatQ(SInt(element), esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint32x2_t vqmovun_s64 (int64x2_t a)Signed saturating extract unsigned narrow

Description

A64 Instruction

SQXTUN Vd.2S,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = UnsignedSatQ(SInt(element), esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

v7/A32/A64

uint8_t vqmovunh_s16 (int16_t a)Signed saturating extract unsigned narrow

Description

A64 Instruction

SQXTUN Bd,Hn

Argument Preparation

a → Hn

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = UnsignedSatQ(SInt(element), esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint16_t vqmovuns_s32 (int32_t a)Signed saturating extract unsigned narrow

Description

A64 Instruction

SQXTUN Hd,Sn

Argument Preparation

a → Sn

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = UnsignedSatQ(SInt(element), esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint32_t vqmovund_s64 (int64_t a)Signed saturating extract unsigned narrow

Description

A64 Instruction

SQXTUN Sd,Dn

Argument Preparation

a → Dn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = UnsignedSatQ(SInt(element), esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint8x16_t vqmovun_high_s16 (uint8x8_t r, int16x8_t a)Signed saturating extract unsigned narrow

Description

A64 Instruction

SQXTUN2 Vd.16B,Vn.8H

Argument Preparation

r → Vd.8B 

a → Vn.8H

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = UnsignedSatQ(SInt(element), esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64
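
For the `_high` form, the argument mapping above (r → Vd.8B, result in Vd.16B) suggests that the lower half of the result comes from `r` and the upper half holds the narrowed lanes of `a`. A scalar sketch under that reading follows; the name `model_vqmovun_high_s16` and the returned saturation flag are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of the "high" narrowing form: lanes 0..7 of the
   result are copied from r, lanes 8..15 are the saturated narrowing of a.
   Returns 1 if any lane of a was clamped, otherwise 0. */
int model_vqmovun_high_s16(const uint8_t r[8], const int16_t a[8],
                           uint8_t out[16])
{
    int qc = 0;
    assert(r != NULL && a != NULL && out != NULL);
    for (size_t e = 0; e < 8; ++e) {
        int16_t x = a[e];
        out[e] = r[e];                 /* lower half: taken from r */
        if (x < 0) {
            out[8 + e] = 0;
            qc = 1;
        } else if (x > 255) {
            out[8 + e] = 255;
            qc = 1;
        } else {
            out[8 + e] = (uint8_t)x;   /* upper half: narrowed a */
        }
    }
    return qc;
}
```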

uint16x8_t vqmovun_high_s32 (uint16x4_t r, int32x4_t a)Signed saturating extract unsigned narrow

Description

A64 Instruction

SQXTUN2 Vd.8H,Vn.4S

Argument Preparation

r → Vd.4H 

a → Vn.4S

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = UnsignedSatQ(SInt(element), esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

uint32x4_t vqmovun_high_s64 (uint32x2_t r, int64x2_t a)Signed saturating extract unsigned narrow

Description

A64 Instruction

SQXTUN2 Vd.4S,Vn.2D

Argument Preparation

r → Vd.2S 

a → Vn.2D

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(2*datasize) operand = V[n];
bits(datasize) result;
bits(2*esize) element;
boolean sat;

for e = 0 to elements-1
    element = Elem[operand, e, 2*esize];
    (Elem[result, e, esize], sat) = UnsignedSatQ(SInt(element), esize);
    if sat then FPSR.QC = '1';

Vpart[d, part] = result;

Supported architectures

A64

int16x4_t vmla_lane_s16 (int16x4_t a, int16x4_t b, int16x4_t v, const int lane)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.4H,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vd.4H 

b → Vn.4H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64
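
A portable scalar sketch of the by-lane multiply-accumulate may help when reading the lane argument. It assumes the element-wise form `a[i] + (b[i] * v[lane])` given for the floating-point variants of this intrinsic family, with the result truncated to the element size as in the pseudocode. The name `model_vmla_lane_s16` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a 4-lane multiply-accumulate by element:
   out[i] = a[i] + b[i] * v[lane], keeping only the low 16 bits. */
void model_vmla_lane_s16(const int16_t a[4], const int16_t b[4],
                         const int16_t v[4], int lane, int16_t out[4])
{
    assert(lane >= 0 && lane <= 3);   /* documented lane range */
    uint32_t scalar = (uint16_t)v[lane];
    for (size_t i = 0; i < 4; ++i) {
        /* Work in uint32_t so wraparound is well defined, then truncate
           to 16 bits as the <esize-1:0> slice does. */
        uint32_t product = (uint32_t)(uint16_t)b[i] * scalar;
        uint16_t sum = (uint16_t)((uint16_t)a[i] + product);
        /* Converting an out-of-range value to int16_t is implementation
           defined in C; common compilers wrap modulo 2^16. */
        out[i] = (int16_t)sum;
    }
}
```

The same lane of `v` is used for every element of `b`, which is what distinguishes the `_lane` forms from the plain vector multiply-accumulate.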

int16x8_t vmlaq_lane_s16 (int16x8_t a, int16x8_t b, int16x4_t v, const int lane)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.8H,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vd.8H 

b → Vn.8H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vmla_lane_s32 (int32x2_t a, int32x2_t b, int32x2_t v, const int lane)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2S 

b → Vn.2S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vmlaq_lane_s32 (int32x4_t a, int32x4_t b, int32x2_t v, const int lane)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vmla_lane_u16 (uint16x4_t a, uint16x4_t b, uint16x4_t v, const int lane)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.4H,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vd.4H 

b → Vn.4H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vmlaq_lane_u16 (uint16x8_t a, uint16x8_t b, uint16x4_t v, const int lane)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.8H,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vd.8H 

b → Vn.8H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vmla_lane_u32 (uint32x2_t a, uint32x2_t b, uint32x2_t v, const int lane)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2S 

b → Vn.2S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vmlaq_lane_u32 (uint32x4_t a, uint32x4_t b, uint32x2_t v, const int lane)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vmla_lane_f32 (float32x2_t a, float32x2_t b, float32x2_t v, const int lane)Undefined

Description

A64 Instruction

RESULT[i] = a[i] + (b[i] * v[lane]) for i = 0 to 1

Argument Preparation

0 <= lane <= 1

Results

N/A → result

Supported architectures

v7/A32/A64

float32x4_t vmlaq_lane_f32 (float32x4_t a, float32x4_t b, float32x2_t v, const int lane)Undefined

Description

A64 Instruction

RESULT[i] = a[i] + (b[i] * v[lane]) for i = 0 to 3

Argument Preparation

0 <= lane <= 1

Results

N/A → result

Supported architectures

v7/A32/A64
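
The element-wise formula given for the floating-point forms translates directly into a scalar loop. The sketch below is a plain C model with a hypothetical name, `model_vmlaq_lane_f32`; it does not try to reproduce the rounding of any particular instruction sequence, and a C compiler may or may not contract the multiply and add depending on its flags.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar model of the 4-lane float form:
   out[i] = a[i] + (b[i] * v[lane]) for i = 0 to 3. */
void model_vmlaq_lane_f32(const float a[4], const float b[4],
                          const float v[2], int lane, float out[4])
{
    assert(lane >= 0 && lane <= 1);   /* v has two lanes */
    for (size_t i = 0; i < 4; ++i) {
        out[i] = a[i] + (b[i] * v[lane]);
    }
}
```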

int16x4_t vmla_laneq_s16 (int16x4_t a, int16x4_t b, int16x8_t v, const int lane)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.4H,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vd.4H 

b → Vn.4H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

A64

int16x8_t vmlaq_laneq_s16 (int16x8_t a, int16x8_t b, int16x8_t v, const int lane)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.8H,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vd.8H 

b → Vn.8H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

A64

int32x2_t vmla_laneq_s32 (int32x2_t a, int32x2_t b, int32x4_t v, const int lane)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2S 

b → Vn.2S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

A64

int32x4_t vmlaq_laneq_s32 (int32x4_t a, int32x4_t b, int32x4_t v, const int lane)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

A64

uint16x4_t vmla_laneq_u16 (uint16x4_t a, uint16x4_t b, uint16x8_t v, const int lane)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.4H,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vd.4H 

b → Vn.4H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

A64

uint16x8_t vmlaq_laneq_u16 (uint16x8_t a, uint16x8_t b, uint16x8_t v, const int lane)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.8H,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vd.8H 

b → Vn.8H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

A64

uint32x2_t vmla_laneq_u32 (uint32x2_t a, uint32x2_t b, uint32x4_t v, const int lane)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2S 

b → Vn.2S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

A64

uint32x4_t vmlaq_laneq_u32 (uint32x4_t a, uint32x4_t b, uint32x4_t v, const int lane)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

A64

float32x2_t vmla_laneq_f32 (float32x2_t a, float32x2_t b, float32x4_t v, const int lane)Undefined

Description

A64 Instruction

RESULT[i] = a[i] + (b[i] * v[lane]) for i = 0 to 1

Argument Preparation

0 <= lane <= 3

Results

N/A → result

Supported architectures

A64

float32x4_t vmlaq_laneq_f32 (float32x4_t a, float32x4_t b, float32x4_t v, const int lane)Undefined

Description

A64 Instruction

RESULT[i] = a[i] + (b[i] * v[lane]) for i = 0 to 3

Argument Preparation

0 <= lane <= 3

Results

N/A → result

Supported architectures

A64

int32x4_t vmlal_lane_s16 (int32x4_t a, int16x4_t b, int16x4_t v, const int lane)Signed multiply-add long

Description

A64 Instruction

SMLAL Vd.4S,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64
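
The "long" forms widen before accumulating: each 16-bit product is formed at 32 bits and added to a 32-bit accumulator lane. A scalar sketch of that behaviour, assuming the selected lane of `v` multiplies every element of `b`, is shown below. The name `model_vmlal_lane_s16` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of the widening multiply-accumulate by element:
   out[i] = a[i] + (int32)b[i] * (int32)v[lane]. */
void model_vmlal_lane_s16(const int32_t a[4], const int16_t b[4],
                          const int16_t v[4], int lane, int32_t out[4])
{
    assert(lane >= 0 && lane <= 3);   /* documented lane range */
    for (size_t i = 0; i < 4; ++i) {
        /* int16 * int16 always fits in int32, so the product is exact. */
        int32_t product = (int32_t)b[i] * (int32_t)v[lane];
        /* The accumulate wraps modulo 2^32; go through uint32_t to avoid
           signed-overflow undefined behaviour in C. */
        out[i] = (int32_t)((uint32_t)a[i] + (uint32_t)product);
    }
}
```

Because the product is widened first, no precision is lost in the multiply; only the final accumulate can wrap.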

int64x2_t vmlal_lane_s32 (int64x2_t a, int32x2_t b, int32x2_t v, const int lane)Signed multiply-add long

Description

A64 Instruction

SMLAL Vd.2D,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.2S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vmlal_lane_u16 (uint32x4_t a, uint16x4_t b, uint16x4_t v, const int lane)Unsigned multiply-add long

Description

A64 Instruction

UMLAL Vd.4S,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vmlal_lane_u32 (uint64x2_t a, uint32x2_t b, uint32x2_t v, const int lane)Unsigned multiply-add long

Description

A64 Instruction

UMLAL Vd.2D,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.2S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vmlal_high_lane_s16 (int32x4_t a, int16x8_t b, int16x4_t v, const int lane)Signed multiply-add long

Description

A64 Instruction

SMLAL2 Vd.4S,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.8H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64
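
The `_high` variant takes a full 8-lane `b` (Vn.8H) but produces only four 32-bit results, which corresponds to the SMLAL2 form reading the upper half of the source. A scalar sketch under that reading follows; the name `model_vmlal_high_lane_s16` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of the "high" widening multiply-accumulate:
   only lanes 4..7 of b take part, each multiplied by v[lane]. */
void model_vmlal_high_lane_s16(const int32_t a[4], const int16_t b[8],
                               const int16_t v[4], int lane, int32_t out[4])
{
    assert(lane >= 0 && lane <= 3);   /* documented lane range */
    for (size_t i = 0; i < 4; ++i) {
        int32_t product = (int32_t)b[4 + i] * (int32_t)v[lane];
        /* Accumulate with wraparound, avoiding signed overflow in C. */
        out[i] = (int32_t)((uint32_t)a[i] + (uint32_t)product);
    }
}
```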

int64x2_t vmlal_high_lane_s32 (int64x2_t a, int32x4_t b, int32x2_t v, const int lane)Signed multiply-add long

Description

A64 Instruction

SMLAL2 Vd.2D,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.4S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

uint32x4_t vmlal_high_lane_u16 (uint32x4_t a, uint16x8_t b, uint16x4_t v, const int lane)Unsigned multiply-add long

Description

A64 Instruction

UMLAL2 Vd.4S,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.8H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

uint64x2_t vmlal_high_lane_u32 (uint64x2_t a, uint32x4_t b, uint32x2_t v, const int lane)Unsigned multiply-add long

Description

A64 Instruction

UMLAL2 Vd.2D,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.4S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

int32x4_t vmlal_laneq_s16 (int32x4_t a, int16x4_t b, int16x8_t v, const int lane)Signed multiply-add long

Description

A64 Instruction

SMLAL Vd.4S,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

int64x2_t vmlal_laneq_s32 (int64x2_t a, int32x2_t b, int32x4_t v, const int lane)Signed multiply-add long

Description

A64 Instruction

SMLAL Vd.2D,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.2S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

uint32x4_t vmlal_laneq_u16 (uint32x4_t a, uint16x4_t b, uint16x8_t v, const int lane)Unsigned multiply-add long

Description

A64 Instruction

UMLAL Vd.4S,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64
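
The `_laneq` forms differ from `_lane` only in that `v` is a 128-bit vector, so the lane index covers all eight 16-bit elements. A scalar sketch of the unsigned widening case, with the hypothetical name `model_vmlal_laneq_u16`, is given below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of the unsigned widening multiply-accumulate
   with an 8-lane selector vector:
   out[i] = a[i] + (uint32)b[i] * (uint32)v[lane]. */
void model_vmlal_laneq_u16(const uint32_t a[4], const uint16_t b[4],
                           const uint16_t v[8], int lane, uint32_t out[4])
{
    assert(lane >= 0 && lane <= 7);   /* v has eight lanes */
    for (size_t i = 0; i < 4; ++i) {
        /* uint16 * uint16 fits in uint32, so the product is exact. */
        uint32_t product = (uint32_t)b[i] * (uint32_t)v[lane];
        out[i] = a[i] + product;      /* wraps modulo 2^32 */
    }
}
```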

uint64x2_t vmlal_laneq_u32 (uint64x2_t a, uint32x2_t b, uint32x4_t v, const int lane)Unsigned multiply-add long

Description

A64 Instruction

UMLAL Vd.2D,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.2S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

int32x4_t vmlal_high_laneq_s16 (int32x4_t a, int16x8_t b, int16x8_t v, const int lane)Signed multiply-add long

Description

A64 Instruction

SMLAL2 Vd.4S,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.8H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

int64x2_t vmlal_high_laneq_s32 (int64x2_t a, int32x4_t b, int32x4_t v, const int lane)Signed multiply-add long

Description

A64 Instruction

SMLAL2 Vd.2D,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.4S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

uint32x4_t vmlal_high_laneq_u16 (uint32x4_t a, uint16x8_t b, uint16x8_t v, const int lane)Unsigned multiply-add long

Description

A64 Instruction

UMLAL2 Vd.4S,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.8H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64
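
As a rough illustration of the Operation pseudocode above, the following portable C sketch models what vmlal_high_laneq_u16 computes, without requiring arm_neon.h. The function name ref_vmlal_high_laneq_u16 and the plain-array representation of vectors are assumptions made for this example; they are not part of the NEON API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vmlal_high_laneq_u16
 * (UMLAL2 Vd.4S,Vn.8H,Vm.H[lane]). Vectors are modelled as plain
 * arrays, with index 0 as the lowest-numbered element. */
static void ref_vmlal_high_laneq_u16(uint32_t result[4], const uint32_t a[4],
                                     const uint16_t b[8], const uint16_t v[8],
                                     int lane)
{
    assert(lane >= 0 && lane <= 7);   /* lane selects one of 8 elements of v */
    for (int e = 0; e < 4; e++) {
        /* The "2" (high) form reads the upper four elements of b. */
        uint32_t product = (uint32_t)b[4 + e] * (uint32_t)v[lane];
        /* Accumulate modulo 2^32; this instruction does not saturate. */
        result[e] = a[e] + product;
    }
}
```

The lower four elements of b are ignored, which is the only difference from a model of the non-high form.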

uint64x2_t vmlal_high_laneq_u32 (uint64x2_t a, uint32x4_t b, uint32x4_t v, const int lane)Unsigned multiply-add long

Description

A64 Instruction

UMLAL2 Vd.2D,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.4S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

int32x4_t vqdmlal_lane_s16 (int32x4_t a, int16x4_t b, int16x4_t v, const int lane)Signed saturating doubling multiply-add long

Description

A64 Instruction

SQDMLAL Vd.4S,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
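
The two saturation steps in the pseudocode above (one on the doubled product, one on the accumulated sum) can be sketched in portable C as follows. The names sat_s32 and ref_vqdmlal_lane_s16, and the use of a boolean return value to stand in for FPSR.QC, are assumptions made for this illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Clamp a wide value to the int32_t range; set *sat if clamping occurred. */
static int32_t sat_s32(int64_t x, bool *sat)
{
    if (x > INT32_MAX) { *sat = true; return INT32_MAX; }
    if (x < INT32_MIN) { *sat = true; return INT32_MIN; }
    return (int32_t)x;
}

/* Hypothetical scalar model of vqdmlal_lane_s16
 * (SQDMLAL Vd.4S,Vn.4H,Vm.H[lane]). Returns true if any element
 * saturated, mirroring FPSR.QC being set. */
static bool ref_vqdmlal_lane_s16(int32_t result[4], const int32_t a[4],
                                 const int16_t b[4], const int16_t v[4],
                                 int lane)
{
    bool qc = false;
    assert(lane >= 0 && lane <= 3);
    for (int e = 0; e < 4; e++) {
        /* Doubled product, saturated to 32 bits
         * (only -32768 * -32768 can clamp here). */
        int32_t product = sat_s32(2 * (int64_t)b[e] * (int64_t)v[lane], &qc);
        /* Accumulate in 64 bits, then saturate the sum to 32 bits. */
        result[e] = sat_s32((int64_t)a[e] + product, &qc);
    }
    return qc;
}
```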

int64x2_t vqdmlal_lane_s32 (int64x2_t a, int32x2_t b, int32x2_t v, const int lane)Signed saturating doubling multiply-add long

Description

A64 Instruction

SQDMLAL Vd.2D,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.2S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32_t vqdmlalh_lane_s16 (int32_t a, int16_t b, int16x4_t v, const int lane)Signed saturating doubling multiply-add long

Description

A64 Instruction

SQDMLAL Sd,Hn,Vm.H[lane]

Argument Preparation

a → Sd 

b → Hn 

v → Vm.4H 

0 <= lane <= 3

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64_t vqdmlals_lane_s32 (int64_t a, int32_t b, int32x2_t v, const int lane)Signed saturating doubling multiply-add long

Description

A64 Instruction

SQDMLAL Dd,Sn,Vm.S[lane]

Argument Preparation

a → Dd 

b → Sn 

v → Vm.2S 

0 <= lane <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64
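
For the 64-bit scalar form there is no wider standard C integer type to accumulate in, so a model has to detect overflow before it happens. The sketch below is one possible way to do that; ref_vqdmlals_lane_s32 and its qc out-parameter (standing in for FPSR.QC) are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical scalar model of vqdmlals_lane_s32 (SQDMLAL Dd,Sn,Vm.S[lane]).
 * *qc is set to true when either step saturates; it is never cleared here. */
static int64_t ref_vqdmlals_lane_s32(int64_t a, int32_t b, const int32_t v[2],
                                     int lane, bool *qc)
{
    assert(lane >= 0 && lane <= 1);
    int64_t p = (int64_t)b * (int64_t)v[lane];  /* magnitude at most 2^62 */
    int64_t product;
    if (p > INT64_MAX / 2) {                    /* only INT32_MIN * INT32_MIN */
        product = INT64_MAX;
        *qc = true;
    } else {
        product = 2 * p;                        /* cannot overflow here */
    }
    /* Saturating 64-bit add, checked up front to avoid signed overflow. */
    if (product > 0 && a > INT64_MAX - product) {
        *qc = true;
        return INT64_MAX;
    }
    if (product < 0 && a < INT64_MIN - product) {
        *qc = true;
        return INT64_MIN;
    }
    return a + product;
}
```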

int32x4_t vqdmlal_high_lane_s16 (int32x4_t a, int16x8_t b, int16x4_t v, const int lane)Signed saturating doubling multiply-add long

Description

A64 Instruction

SQDMLAL2 Vd.4S,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.8H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64x2_t vqdmlal_high_lane_s32 (int64x2_t a, int32x4_t b, int32x2_t v, const int lane)Signed saturating doubling multiply-add long

Description

A64 Instruction

SQDMLAL2 Vd.2D,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.4S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32x4_t vqdmlal_laneq_s16 (int32x4_t a, int16x4_t b, int16x8_t v, const int lane)Signed saturating doubling multiply-add long

Description

A64 Instruction

SQDMLAL Vd.4S,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64x2_t vqdmlal_laneq_s32 (int64x2_t a, int32x2_t b, int32x4_t v, const int lane)Signed saturating doubling multiply-add long

Description

A64 Instruction

SQDMLAL Vd.2D,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.2S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32_t vqdmlalh_laneq_s16 (int32_t a, int16_t b, int16x8_t v, const int lane)Signed saturating doubling multiply-add long

Description

A64 Instruction

SQDMLAL Sd,Hn,Vm.H[lane]

Argument Preparation

a → Sd 

b → Hn 

v → Vm.8H 

0 <= lane <= 7

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64_t vqdmlals_laneq_s32 (int64_t a, int32_t b, int32x4_t v, const int lane)Signed saturating doubling multiply-add long

Description

A64 Instruction

SQDMLAL Dd,Sn,Vm.S[lane]

Argument Preparation

a → Dd 

b → Sn 

v → Vm.4S 

0 <= lane <= 3

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32x4_t vqdmlal_high_laneq_s16 (int32x4_t a, int16x8_t b, int16x8_t v, const int lane)Signed saturating doubling multiply-add long

Description

A64 Instruction

SQDMLAL2 Vd.4S,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.8H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64x2_t vqdmlal_high_laneq_s32 (int64x2_t a, int32x4_t b, int32x4_t v, const int lane)Signed saturating doubling multiply-add long

Description

A64 Instruction

SQDMLAL2 Vd.2D,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.4S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int16x4_t vmls_lane_s16 (int16x4_t a, int16x4_t b, int16x4_t v, const int lane)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.4H,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vd.4H 

b → Vn.4H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64
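
The pseudocode above multiplies the raw bit patterns with UInt() and keeps only the low esize bits, so the same instruction serves signed and unsigned element types. A minimal portable C sketch of that behaviour for vmls_lane_s16 follows; the name ref_vmls_lane_s16 and the array-based vector representation are assumptions for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vmls_lane_s16 (MLS Vd.4H,Vn.4H,Vm.H[lane]). */
static void ref_vmls_lane_s16(int16_t result[4], const int16_t a[4],
                              const int16_t b[4], const int16_t v[4],
                              int lane)
{
    assert(lane >= 0 && lane <= 3);
    for (int e = 0; e < 4; e++) {
        /* Multiply the raw 16-bit patterns, as the pseudocode does. */
        uint32_t product = (uint32_t)(uint16_t)b[e] * (uint32_t)(uint16_t)v[lane];
        /* Subtract and keep only the low 16 bits (wraps, no saturation). */
        uint16_t diff = (uint16_t)((uint32_t)(uint16_t)a[e] - product);
        /* Conversion back to int16_t assumes the usual two's complement
         * behaviour of mainstream compilers. */
        result[e] = (int16_t)diff;
    }
}
```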

int16x8_t vmlsq_lane_s16 (int16x8_t a, int16x8_t b, int16x4_t v, const int lane)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.8H,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vd.8H 

b → Vn.8H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vmls_lane_s32 (int32x2_t a, int32x2_t b, int32x2_t v, const int lane)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2S 

b → Vn.2S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vmlsq_lane_s32 (int32x4_t a, int32x4_t b, int32x2_t v, const int lane)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vmls_lane_u16 (uint16x4_t a, uint16x4_t b, uint16x4_t v, const int lane)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.4H,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vd.4H 

b → Vn.4H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vmlsq_lane_u16 (uint16x8_t a, uint16x8_t b, uint16x4_t v, const int lane)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.8H,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vd.8H 

b → Vn.8H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vmls_lane_u32 (uint32x2_t a, uint32x2_t b, uint32x2_t v, const int lane)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2S 

b → Vn.2S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vmlsq_lane_u32 (uint32x4_t a, uint32x4_t b, uint32x2_t v, const int lane)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vmls_lane_f32 (float32x2_t a, float32x2_t b, float32x2_t v, const int lane)Multiply-subtract from accumulator

Description

A64 Instruction

RESULT[i] = a[i] - (b[i] * v[lane]) for i = 0 to 1

Argument Preparation

0 <= lane <= 1

Results

N/A → result

Supported architectures

v7/A32/A64
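
This entry gives a formula rather than a single instruction. A minimal portable C sketch of that formula is shown below; ref_vmls_lane_f32 is a hypothetical name, vectors are modelled as plain arrays, and the sketch makes no claim about the exact rounding behaviour of any particular compiler or target.

```c
#include <assert.h>

/* Hypothetical scalar model of vmls_lane_f32:
 * result[i] = a[i] - (b[i] * v[lane]) for i = 0 to 1. */
static void ref_vmls_lane_f32(float result[2], const float a[2],
                              const float b[2], const float v[2],
                              int lane)
{
    assert(lane >= 0 && lane <= 1);
    for (int i = 0; i < 2; i++) {
        /* Written as a separate multiply and subtract, as in the formula. */
        float product = b[i] * v[lane];
        result[i] = a[i] - product;
    }
}
```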

float32x4_t vmlsq_lane_f32 (float32x4_t a, float32x4_t b, float32x2_t v, const int lane)Multiply-subtract from accumulator

Description

A64 Instruction

RESULT[i] = a[i] - (b[i] * v[lane]) for i = 0 to 3

Argument Preparation

0 <= lane <= 1

Results

N/A → result

Supported architectures

v7/A32/A64

int16x4_t vmls_laneq_s16 (int16x4_t a, int16x4_t b, int16x8_t v, const int lane)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.4H,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vd.4H 

b → Vn.4H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

A64

int16x8_t vmlsq_laneq_s16 (int16x8_t a, int16x8_t b, int16x8_t v, const int lane)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.8H,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vd.8H 

b → Vn.8H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

A64

int32x2_t vmls_laneq_s32 (int32x2_t a, int32x2_t b, int32x4_t v, const int lane)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2S 

b → Vn.2S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

A64

int32x4_t vmlsq_laneq_s32 (int32x4_t a, int32x4_t b, int32x4_t v, const int lane)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

A64

uint16x4_t vmls_laneq_u16 (uint16x4_t a, uint16x4_t b, uint16x8_t v, const int lane)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.4H,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vd.4H 

b → Vn.4H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

A64

uint16x8_t vmlsq_laneq_u16 (uint16x8_t a, uint16x8_t b, uint16x8_t v, const int lane)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.8H,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vd.8H 

b → Vn.8H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

A64

uint32x2_t vmls_laneq_u32 (uint32x2_t a, uint32x2_t b, uint32x4_t v, const int lane)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2S 

b → Vn.2S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

A64

uint32x4_t vmlsq_laneq_u32 (uint32x4_t a, uint32x4_t b, uint32x4_t v, const int lane)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

A64

float32x2_t vmls_laneq_f32 (float32x2_t a, float32x2_t b, float32x4_t v, const int lane)Multiply-subtract from accumulator

Description

A64 Instruction

RESULT[i] = a[i] - (b[i] * v[lane]) for i = 0 to 1

Argument Preparation

0 <= lane <= 3

Results

N/A → result

Supported architectures

A64

float32x4_t vmlsq_laneq_f32 (float32x4_t a, float32x4_t b, float32x4_t v, const int lane)Multiply-subtract from accumulator

Description

A64 Instruction

RESULT[i] = a[i] - (b[i] * v[lane]) for i = 0 to 3

Argument Preparation

0 <= lane <= 3

Results

N/A → result

Supported architectures

A64

int32x4_t vmlsl_lane_s16 (int32x4_t a, int16x4_t b, int16x4_t v, const int lane)Signed multiply-subtract long

Description

A64 Instruction

SMLSL Vd.4S,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64
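
In the pseudocode above, Int(x, unsigned) interprets each element as signed or unsigned depending on the instruction; for SMLSL the elements are treated as signed. Under that reading, a portable C sketch of vmlsl_lane_s16 could look like the following. The name ref_vmlsl_lane_s16 and the array-based vector representation are assumptions for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vmlsl_lane_s16 (SMLSL Vd.4S,Vn.4H,Vm.H[lane]). */
static void ref_vmlsl_lane_s16(int32_t result[4], const int32_t a[4],
                               const int16_t b[4], const int16_t v[4],
                               int lane)
{
    assert(lane >= 0 && lane <= 3);
    for (int e = 0; e < 4; e++) {
        /* Sign-extend both 16-bit inputs; the product always fits in 32 bits. */
        int32_t product = (int32_t)b[e] * (int32_t)v[lane];
        /* Subtract modulo 2^32. Unsigned arithmetic avoids signed overflow in C;
         * the conversion back assumes the usual two's complement behaviour. */
        result[e] = (int32_t)((uint32_t)a[e] - (uint32_t)product);
    }
}
```

Unlike the SQDMLAL family, this operation wraps on overflow and does not touch FPSR.QC.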

int64x2_t vmlsl_lane_s32 (int64x2_t a, int32x2_t b, int32x2_t v, const int lane)Signed multiply-subtract long

Description

A64 Instruction

SMLSL Vd.2D,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.2S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vmlsl_lane_u16 (uint32x4_t a, uint16x4_t b, uint16x4_t v, const int lane)Unsigned multiply-subtract long

Description

A64 Instruction

UMLSL Vd.4S,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64
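
The unsigned form can be sketched the same way. The hypothetical helper below (ref_vmlsl_lane_u16 is an illustrative name, not an arm_neon.h function) models vmlsl_lane_u16 on plain arrays and shows that the 32-bit accumulator wraps modulo 2^32 when the product exceeds it, which is how the pseudocode above reads for fixed-width bit vectors.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vmlsl_lane_u16 (not an arm_neon.h function). */
void ref_vmlsl_lane_u16(uint32_t out[4], const uint32_t a[4],
                        const uint16_t b[4], const uint16_t v[4], int lane)
{
    assert(lane >= 0 && lane <= 3);   /* lane must select an element of v */
    for (int e = 0; e < 4; e++) {
        /* Cast before multiplying: uint16_t operands would otherwise be
           promoted to int, and 65535 * 65535 overflows a 32-bit int. */
        uint32_t product = (uint32_t)b[e] * (uint32_t)v[lane];
        out[e] = a[e] - product;      /* wraps modulo 2^32 */
    }
}
```

With a = {10, 0, 1000, 5}, b = {2, 1, 65535, 0}, v = {0, 0, 3, 0} and lane = 2, the result is {4, 0xFFFFFFFD, 4294771691, 5}; the second and third elements have wrapped.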

uint64x2_t vmlsl_lane_u32 (uint64x2_t a, uint32x2_t b, uint32x2_t v, const int lane)Unsigned multiply-subtract long

Description

A64 Instruction

UMLSL Vd.2D,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.2S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vmlsl_high_lane_s16 (int32x4_t a, int16x8_t b, int16x4_t v, const int lane)Signed multiply-subtract long

Description

A64 Instruction

SMLSL2 Vd.4S,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.8H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64
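
The _high variants map to the "2" form of the instruction (here SMLSL2), which reads the upper half of the 128-bit register holding b. A minimal sketch, with the hypothetical name ref_vmlsl_high_lane_s16 and plain arrays standing in for the vector types, makes the element selection explicit:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vmlsl_high_lane_s16 (not an arm_neon.h function). */
void ref_vmlsl_high_lane_s16(int32_t out[4], const int32_t a[4],
                             const int16_t b[8], const int16_t v[4], int lane)
{
    assert(lane >= 0 && lane <= 3);   /* lane must select an element of v */
    for (int e = 0; e < 4; e++) {
        /* Only elements 4..7 of b (the upper half) take part. */
        int32_t product = (int32_t)b[4 + e] * (int32_t)v[lane];
        /* Unsigned subtraction gives defined wraparound; the conversion
           back to int32_t assumes two's complement. */
        out[e] = (int32_t)((uint32_t)a[e] - (uint32_t)product);
    }
}
```

With a = {0, 1, 2, 3}, b = {9, 9, 9, 9, 1, 2, 3, 4}, v = {5, 6, 7, 8} and lane = 0, the lower four elements of b are ignored and the result is {-5, -9, -13, -17}.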

int64x2_t vmlsl_high_lane_s32 (int64x2_t a, int32x4_t b, int32x2_t v, const int lane)Signed multiply-subtract long

Description

A64 Instruction

SMLSL2 Vd.2D,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.4S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

uint32x4_t vmlsl_high_lane_u16 (uint32x4_t a, uint16x8_t b, uint16x4_t v, const int lane)Unsigned multiply-subtract long

Description

A64 Instruction

UMLSL2 Vd.4S,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.8H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

uint64x2_t vmlsl_high_lane_u32 (uint64x2_t a, uint32x4_t b, uint32x2_t v, const int lane)Unsigned multiply-subtract long

Description

A64 Instruction

UMLSL2 Vd.2D,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.4S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

int32x4_t vmlsl_laneq_s16 (int32x4_t a, int16x4_t b, int16x8_t v, const int lane)Signed multiply-subtract long

Description

A64 Instruction

SMLSL Vd.4S,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

int64x2_t vmlsl_laneq_s32 (int64x2_t a, int32x2_t b, int32x4_t v, const int lane)Signed multiply-subtract long

Description

A64 Instruction

SMLSL Vd.2D,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.2S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

uint32x4_t vmlsl_laneq_u16 (uint32x4_t a, uint16x4_t b, uint16x8_t v, const int lane)Unsigned multiply-subtract long

Description

A64 Instruction

UMLSL Vd.4S,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

uint64x2_t vmlsl_laneq_u32 (uint64x2_t a, uint32x2_t b, uint32x4_t v, const int lane)Unsigned multiply-subtract long

Description

A64 Instruction

UMLSL Vd.2D,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.2S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

int32x4_t vmlsl_high_laneq_s16 (int32x4_t a, int16x8_t b, int16x8_t v, const int lane)Signed multiply-subtract long

Description

A64 Instruction

SMLSL2 Vd.4S,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.8H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

int64x2_t vmlsl_high_laneq_s32 (int64x2_t a, int32x4_t b, int32x4_t v, const int lane)Signed multiply-subtract long

Description

A64 Instruction

SMLSL2 Vd.2D,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.4S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

uint32x4_t vmlsl_high_laneq_u16 (uint32x4_t a, uint16x8_t b, uint16x8_t v, const int lane)Unsigned multiply-subtract long

Description

A64 Instruction

UMLSL2 Vd.4S,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.8H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

uint64x2_t vmlsl_high_laneq_u32 (uint64x2_t a, uint32x4_t b, uint32x4_t v, const int lane)Unsigned multiply-subtract long

Description

A64 Instruction

UMLSL2 Vd.2D,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.4S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

int32x4_t vqdmlsl_lane_s16 (int32x4_t a, int16x4_t b, int16x4_t v, const int lane)Signed saturating doubling multiply-subtract long

Description

A64 Instruction

SQDMLSL Vd.4S,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
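
The pseudocode above saturates twice: once on the doubled product and once on the accumulation. The following sketch models vqdmlsl_lane_s16 on plain arrays to show both steps. The names ref_sat_s32 and ref_vqdmlsl_lane_s16 are hypothetical, and the qc output parameter is an assumption standing in for the sticky FPSR.QC flag.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp x to the int32_t range; set *saturated to 1 if clamping occurred. */
static int32_t ref_sat_s32(int64_t x, int *saturated)
{
    if (x > INT32_MAX) { *saturated = 1; return INT32_MAX; }
    if (x < INT32_MIN) { *saturated = 1; return INT32_MIN; }
    return (int32_t)x;
}

/* Hypothetical scalar model of vqdmlsl_lane_s16 (not an arm_neon.h function).
   *qc is set to 1 on any saturation and is never cleared, like FPSR.QC. */
void ref_vqdmlsl_lane_s16(int32_t out[4], const int32_t a[4],
                          const int16_t b[4], const int16_t v[4],
                          int lane, int *qc)
{
    assert(lane >= 0 && lane <= 3);   /* lane must select an element of v */
    for (int e = 0; e < 4; e++) {
        int sat1 = 0, sat2 = 0;
        /* Doubled product, computed in 64 bits and then saturated. */
        int32_t product =
            ref_sat_s32(2 * (int64_t)b[e] * (int64_t)v[lane], &sat1);
        /* Saturating subtract from the accumulator element. */
        out[e] = ref_sat_s32((int64_t)a[e] - (int64_t)product, &sat2);
        if (sat1 || sat2) *qc = 1;
    }
}
```

The only product that saturates is (-32768) * (-32768) doubled, which equals 2^31 and is clamped to INT32_MAX. The accumulation step saturates whenever the difference leaves the 32-bit signed range.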

int64x2_t vqdmlsl_lane_s32 (int64x2_t a, int32x2_t b, int32x2_t v, const int lane)Signed saturating doubling multiply-subtract long

Description

A64 Instruction

SQDMLSL Vd.2D,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.2S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32_t vqdmlslh_lane_s16 (int32_t a, int16_t b, int16x4_t v, const int lane)Signed saturating doubling multiply-subtract long

Description

A64 Instruction

SQDMLSL Sd,Hn,Vm.H[lane]

Argument Preparation

a → Sd 

b → Hn 

v → Vm.4H 

0 <= lane <= 3

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64_t vqdmlsls_lane_s32 (int64_t a, int32_t b, int32x2_t v, const int lane)Signed saturating doubling multiply-subtract long

Description

A64 Instruction

SQDMLSL Dd,Sn,Vm.S[lane]

Argument Preparation

a → Dd 

b → Sn 

v → Vm.2S 

0 <= lane <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64
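
For the 64-bit scalar form, a portable model cannot simply compute in a wider integer type, so saturation has to be detected before the arithmetic is performed. The sketch below models vqdmlsls_lane_s32 under that constraint. The name ref_vqdmlsls_lane_s32 and the qc parameter (standing in for FPSR.QC) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vqdmlsls_lane_s32 (not an arm_neon.h function).
   *qc is set to 1 on any saturation and is never cleared, like FPSR.QC. */
int64_t ref_vqdmlsls_lane_s32(int64_t a, int32_t b,
                              const int32_t v[2], int lane, int *qc)
{
    assert(lane >= 0 && lane <= 1);   /* lane must select an element of v */

    /* The undoubled product has magnitude at most 2^62, so it fits. */
    int64_t p = (int64_t)b * (int64_t)v[lane];
    int64_t product;
    if (p == (INT64_C(1) << 62)) {
        /* Only INT32_MIN * INT32_MIN doubles to 2^63, which saturates. */
        product = INT64_MAX;
        *qc = 1;
    } else {
        product = 2 * p;
    }

    /* Saturating a - product, testing for overflow before subtracting. */
    if (product > 0 && a < INT64_MIN + product) {
        *qc = 1;
        return INT64_MIN;
    }
    if (product < 0 && a > INT64_MAX + product) {
        *qc = 1;
        return INT64_MAX;
    }
    return a - product;
}
```

The overflow tests are arranged so that neither INT64_MIN + product nor INT64_MAX + product can itself overflow for the sign of product being checked.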

int32x4_t vqdmlsl_high_lane_s16 (int32x4_t a, int16x8_t b, int16x4_t v, const int lane)Signed saturating doubling multiply-subtract long

Description

A64 Instruction

SQDMLSL2 Vd.4S,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.8H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64x2_t vqdmlsl_high_lane_s32 (int64x2_t a, int32x4_t b, int32x2_t v, const int lane)Signed saturating doubling multiply-subtract long

Description

A64 Instruction

SQDMLSL2 Vd.2D,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.4S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32x4_t vqdmlsl_laneq_s16 (int32x4_t a, int16x4_t b, int16x8_t v, const int lane)Signed saturating doubling multiply-subtract long

Description

A64 Instruction

SQDMLSL Vd.4S,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.4H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64x2_t vqdmlsl_laneq_s32 (int64x2_t a, int32x2_t b, int32x4_t v, const int lane)Signed saturating doubling multiply-subtract long

Description

A64 Instruction

SQDMLSL Vd.2D,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.2S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32_t vqdmlslh_laneq_s16 (int32_t a, int16_t b, int16x8_t v, const int lane)Signed saturating doubling multiply-subtract long

Description

A64 Instruction

SQDMLSL Sd,Hn,Vm.H[lane]

Argument Preparation

a → Sd 

b → Hn 

v → Vm.8H 

0 <= lane <= 7

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64_t vqdmlsls_laneq_s32 (int64_t a, int32_t b, int32x4_t v, const int lane)Signed saturating doubling multiply-subtract long

Description

A64 Instruction

SQDMLSL Dd,Sn,Vm.S[lane]

Argument Preparation

a → Dd 

b → Sn 

v → Vm.4S 

0 <= lane <= 3

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32x4_t vqdmlsl_high_laneq_s16 (int32x4_t a, int16x8_t b, int16x8_t v, const int lane)Signed saturating doubling multiply-subtract long

Description

A64 Instruction

SQDMLSL2 Vd.4S,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vd.4S 

b → Vn.8H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64x2_t vqdmlsl_high_laneq_s32 (int64x2_t a, int32x4_t b, int32x4_t v, const int lane)Signed saturating doubling multiply-subtract long

Description

A64 Instruction

SQDMLSL2 Vd.2D,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vd.2D 

b → Vn.4S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int16x4_t vmul_n_s16 (int16x4_t a, int16_t b)Multiply

Description

A64 Instruction

MUL Vd.4H,Vn.4H,Vm.H[0]

Argument Preparation

a → Vn.4H 

b → Vm.H[0]

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64
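
The pseudocode above multiplies the elements as unsigned values and keeps only the low esize bits, even for a signed intrinsic. This works because the low 16 bits of a product are the same for signed and unsigned interpretations of the operands. A minimal sketch on plain arrays follows; the name ref_vmul_n_s16 is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vmul_n_s16 (not an arm_neon.h function). */
void ref_vmul_n_s16(int16_t out[4], const int16_t a[4], int16_t b)
{
    for (int e = 0; e < 4; e++) {
        /* Reinterpret both operands as unsigned 16-bit values and multiply
           in 32 bits, mirroring UInt(element1) * UInt(element2). */
        uint32_t product = (uint32_t)(uint16_t)a[e] * (uint32_t)(uint16_t)b;
        /* Keep the low 16 bits; converting back to int16_t assumes
           two's complement. */
        out[e] = (int16_t)(uint16_t)product;
    }
}
```

With a = {3, -3, 300, -32768} and b = 200, the result is {600, -600, -5536, 0}: the last two products do not fit in 16 bits and are truncated.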

int16x8_t vmulq_n_s16 (int16x8_t a, int16_t b)Multiply

Description

A64 Instruction

MUL Vd.8H,Vn.8H,Vm.H[0]

Argument Preparation

a → Vn.8H 

b → Vm.H[0]

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vmul_n_s32 (int32x2_t a, int32_t b)Multiply

Description

A64 Instruction

MUL Vd.2S,Vn.2S,Vm.S[0]

Argument Preparation

a → Vn.2S 

b → Vm.S[0]

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vmulq_n_s32 (int32x4_t a, int32_t b)Multiply

Description

A64 Instruction

MUL Vd.4S,Vn.4S,Vm.S[0]

Argument Preparation

a → Vn.4S 

b → Vm.S[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vmul_n_u16 (uint16x4_t a, uint16_t b)Multiply

Description

A64 Instruction

MUL Vd.4H,Vn.4H,Vm.H[0]

Argument Preparation

a → Vn.4H 

b → Vm.H[0]

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vmulq_n_u16 (uint16x8_t a, uint16_t b)Multiply

Description

A64 Instruction

MUL Vd.8H,Vn.8H,Vm.H[0]

Argument Preparation

a → Vn.8H 

b → Vm.H[0]

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vmul_n_u32 (uint32x2_t a, uint32_t b)Multiply

Description

A64 Instruction

MUL Vd.2S,Vn.2S,Vm.S[0]

Argument Preparation

a → Vn.2S 

b → Vm.S[0]

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vmulq_n_u32 (uint32x4_t a, uint32_t b)Multiply

Description

A64 Instruction

MUL Vd.4S,Vn.4S,Vm.S[0]

Argument Preparation

a → Vn.4S 

b → Vm.S[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vmul_n_f32 (float32x2_t a, float32_t b)Floating-point multiply

Description

A64 Instruction

FMUL Vd.2S,Vn.2S,Vm.S[0]

Argument Preparation

a → Vn.2S 

b → Vm.S[0]

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64
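
The floating-point form multiplies every element of a by the same scalar b. The sketch below models this on plain arrays under the assumption of default floating-point behaviour; FPCR-controlled settings such as the rounding mode or flush-to-zero are not modelled. The name ref_vmul_n_f32 is hypothetical.

```c
#include <assert.h>

/* Hypothetical scalar model of vmul_n_f32 (not an arm_neon.h function).
   Assumes the default floating-point environment. */
void ref_vmul_n_f32(float out[2], const float a[2], float b)
{
    for (int e = 0; e < 2; e++) {
        out[e] = a[e] * b;   /* every element is scaled by the same b */
    }
}
```

With a = {1.5f, -2.0f} and b = 4.0f, the result is {6.0f, -8.0f}; these values are exactly representable, so no rounding is involved.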

float32x4_t vmulq_n_f32 (float32x4_t a, float32_t b)Floating-point multiply

Description

A64 Instruction

FMUL Vd.4S,Vn.4S,Vm.S[0]

Argument Preparation

a → Vn.4S 

b → Vm.S[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64

float64x1_t vmul_n_f64 (float64x1_t a, float64_t b)Floating-point multiply

Description

A64 Instruction

FMUL Dd,Dn,Vm.D[0]

Argument Preparation

a → Dn 

b → Vm.D[0]

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vmulq_n_f64 (float64x2_t a, float64_t b)Floating-point multiply

Description

A64 Instruction

FMUL Vd.2D,Vn.2D,Vm.D[0]

Argument Preparation

a → Vn.2D 

b → Vm.D[0]

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

int16x4_t vmul_lane_s16 (int16x4_t a, int16x4_t v, const int lane)Multiply

Description

A64 Instruction

MUL Vd.4H,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vn.4H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64
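
The by-lane form above can be sketched in portable C as follows. This is a scalar model under stated assumptions, not the intrinsic: `model_vmul_lane_s16` is a hypothetical name, vectors are represented as plain arrays, and the `assert` stands in for the compile-time lane constraint `0 <= lane <= 3`. The multiplication is done on the unsigned bit patterns, mirroring the `UInt(element1)*UInt(element2)` line in the pseudocode.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vmul_lane_s16: each lane of a is multiplied
   by the single element v[lane], keeping the low 16 bits of the product. */
static void model_vmul_lane_s16(const int16_t a[4], const int16_t v[4],
                                int lane, int16_t out[4])
{
    assert(lane >= 0 && lane <= 3);  /* lane must index the 4-element v */
    for (int e = 0; e < 4; e++) {
        uint16_t product =
            (uint16_t)((uint32_t)(uint16_t)a[e] * (uint16_t)v[lane]);
        /* Converting back to int16_t is implementation-defined before C23;
           mainstream compilers wrap in two's complement. */
        out[e] = (int16_t)product;
    }
}
```

With the real intrinsic, `lane` must be an integer constant expression, so a call looks like `vmul_lane_s16(a, v, 1)`.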

int16x8_t vmulq_lane_s16 (int16x8_t a, int16x4_t v, const int lane)Multiply

Description

A64 Instruction

MUL Vd.8H,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vn.8H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vmul_lane_s32 (int32x2_t a, int32x2_t v, const int lane)Multiply

Description

A64 Instruction

MUL Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vn.2S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vmulq_lane_s32 (int32x4_t a, int32x2_t v, const int lane)Multiply

Description

A64 Instruction

MUL Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vn.4S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vmul_lane_u16 (uint16x4_t a, uint16x4_t v, const int lane)Multiply

Description

A64 Instruction

MUL Vd.4H,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vn.4H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vmulq_lane_u16 (uint16x8_t a, uint16x4_t v, const int lane)Multiply

Description

A64 Instruction

MUL Vd.8H,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vn.8H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vmul_lane_u32 (uint32x2_t a, uint32x2_t v, const int lane)Multiply

Description

A64 Instruction

MUL Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vn.2S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vmulq_lane_u32 (uint32x4_t a, uint32x2_t v, const int lane)Multiply

Description

A64 Instruction

MUL Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vn.4S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vmul_lane_f32 (float32x2_t a, float32x2_t v, const int lane)Floating-point multiply

Description

A64 Instruction

FMUL Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vn.2S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64

float32x4_t vmulq_lane_f32 (float32x4_t a, float32x2_t v, const int lane)Floating-point multiply

Description

A64 Instruction

FMUL Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vn.4S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64
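
For the floating-point by-lane form above, a minimal scalar sketch might look like this. The name `model_vmulq_lane_f32` is hypothetical, arrays stand in for `float32x4_t` and `float32x2_t`, and the model uses the host's default floating-point behaviour rather than the `FPCR` settings that `FPMul` consults.

```c
#include <assert.h>

/* Hypothetical scalar model of vmulq_lane_f32 (not the intrinsic itself):
   all four lanes of a are multiplied by the selected lane of v. */
static void model_vmulq_lane_f32(const float a[4], const float v[2],
                                 int lane, float out[4])
{
    assert(lane >= 0 && lane <= 1);  /* v has only two lanes */
    const float scalar = v[lane];
    for (int e = 0; e < 4; e++) {
        out[e] = a[e] * scalar;      /* FPCR rounding/flush modes not modeled */
    }
}
```

Note that the 128-bit `q` form still takes a 64-bit `float32x2_t` for `v`; the `laneq` variants are the ones that index a 128-bit vector.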

float64x1_t vmul_lane_f64 (float64x1_t a, float64x1_t v, const int lane)Floating-point multiply

Description

A64 Instruction

FMUL Dd,Dn,Vm.D[lane]

Argument Preparation

a → Dn 

v → Vm.1D 

0 <= lane <= 0

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vmulq_lane_f64 (float64x2_t a, float64x1_t v, const int lane)Floating-point multiply

Description

A64 Instruction

FMUL Vd.2D,Vn.2D,Vm.D[lane]

Argument Preparation

a → Vn.2D 

v → Vm.1D 

0 <= lane <= 0

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32_t vmuls_lane_f32 (float32_t a, float32x2_t v, const int lane)Floating-point multiply

Description

A64 Instruction

FMUL Sd,Sn,Vm.S[lane]

Argument Preparation

a → Sn 

v → Vm.2S 

0 <= lane <= 1

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64_t vmuld_lane_f64 (float64_t a, float64x1_t v, const int lane)Floating-point multiply

Description

A64 Instruction

FMUL Dd,Dn,Vm.D[lane]

Argument Preparation

a → Dn 

v → Vm.1D 

0 <= lane <= 0

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

int16x4_t vmul_laneq_s16 (int16x4_t a, int16x8_t v, const int lane)Multiply

Description

A64 Instruction

MUL Vd.4H,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vn.4H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

A64

int16x8_t vmulq_laneq_s16 (int16x8_t a, int16x8_t v, const int lane)Multiply

Description

A64 Instruction

MUL Vd.8H,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vn.8H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

A64

int32x2_t vmul_laneq_s32 (int32x2_t a, int32x4_t v, const int lane)Multiply

Description

A64 Instruction

MUL Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vn.2S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

A64

int32x4_t vmulq_laneq_s32 (int32x4_t a, int32x4_t v, const int lane)Multiply

Description

A64 Instruction

MUL Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vn.4S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

A64

uint16x4_t vmul_laneq_u16 (uint16x4_t a, uint16x8_t v, const int lane)Multiply

Description

A64 Instruction

MUL Vd.4H,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vn.4H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

A64
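
The `laneq` form above differs from the `lane` form only in that `v` is a 128-bit vector, so the lane index ranges over eight 16-bit elements. A portable sketch, with the hypothetical name `model_vmul_laneq_u16` and plain arrays in place of the vector types:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vmul_laneq_u16 (not the intrinsic itself):
   a has 4 lanes, v has 8 lanes, and lane selects one element of v. */
static void model_vmul_laneq_u16(const uint16_t a[4], const uint16_t v[8],
                                 int lane, uint16_t out[4])
{
    assert(lane >= 0 && lane <= 7);  /* wider index range than the _lane form */
    for (int e = 0; e < 4; e++) {
        /* Widen to 32 bits, multiply, then keep only the low 16 bits. */
        out[e] = (uint16_t)((uint32_t)a[e] * v[lane]);
    }
}
```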

uint16x8_t vmulq_laneq_u16 (uint16x8_t a, uint16x8_t v, const int lane)Multiply

Description

A64 Instruction

MUL Vd.8H,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vn.8H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

A64

uint32x2_t vmul_laneq_u32 (uint32x2_t a, uint32x4_t v, const int lane)Multiply

Description

A64 Instruction

MUL Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vn.2S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

A64

uint32x4_t vmulq_laneq_u32 (uint32x4_t a, uint32x4_t v, const int lane)Multiply

Description

A64 Instruction

MUL Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vn.4S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if poly then
        product = PolynomialMult(element1, element2)<esize-1:0>;
    else
        product = (UInt(element1)*UInt(element2))<esize-1:0>;
    Elem[result, e, esize] = product;

V[d] = result;

Supported architectures

A64

float32x2_t vmul_laneq_f32 (float32x2_t a, float32x4_t v, const int lane)Floating-point multiply

Description

A64 Instruction

FMUL Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vn.2S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32x4_t vmulq_laneq_f32 (float32x4_t a, float32x4_t v, const int lane)Floating-point multiply

Description

A64 Instruction

FMUL Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vn.4S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x1_t vmul_laneq_f64 (float64x1_t a, float64x2_t v, const int lane)Floating-point multiply

Description

A64 Instruction

FMUL Dd,Dn,Vm.D[lane]

Argument Preparation

a → Dn 

v → Vm.2D 

0 <= lane <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vmulq_laneq_f64 (float64x2_t a, float64x2_t v, const int lane)Floating-point multiply

Description

A64 Instruction

FMUL Vd.2D,Vn.2D,Vm.D[lane]

Argument Preparation

a → Vn.2D 

v → Vm.2D 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32_t vmuls_laneq_f32 (float32_t a, float32x4_t v, const int lane)Floating-point multiply

Description

A64 Instruction

FMUL Sd,Sn,Vm.S[lane]

Argument Preparation

a → Sn 

v → Vm.4S 

0 <= lane <= 3

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64_t vmuld_laneq_f64 (float64_t a, float64x2_t v, const int lane)Floating-point multiply

Description

A64 Instruction

FMUL Dd,Dn,Vm.D[lane]

Argument Preparation

a → Dn 

v → Vm.2D 

0 <= lane <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPMul(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64
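
The scalar-by-element form above reduces to a single multiplication. A tiny portable sketch, assuming the hypothetical name `model_vmuld_laneq_f64` and an array in place of `float64x2_t`, and ignoring the `FPCR` controls that `FPMul` takes into account:

```c
#include <assert.h>

/* Hypothetical scalar model of vmuld_laneq_f64 (not the intrinsic itself):
   one double is multiplied by the selected lane of a 2-lane vector. */
static double model_vmuld_laneq_f64(double a, const double v[2], int lane)
{
    assert(lane >= 0 && lane <= 1);  /* v has two 64-bit lanes */
    return a * v[lane];
}
```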

int32x4_t vmull_n_s16 (int16x4_t a, int16_t b)Signed multiply long

Description

A64 Instruction

SMULL Vd.4S,Vn.4H,Vm.H[0]

Argument Preparation

a → Vn.4H 

b → Vm.H[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
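
The widening ("long") multiply above produces results twice as wide as its inputs, so no product can overflow. A portable sketch, assuming the hypothetical name `model_vmull_n_s16` and arrays in place of `int16x4_t` and `int32x4_t`:

```c
#include <stdint.h>

/* Hypothetical scalar model of vmull_n_s16 (not the intrinsic itself):
   each signed 16-bit lane of a is multiplied by the scalar b, and the
   full 32-bit product is stored. */
static void model_vmull_n_s16(const int16_t a[4], int16_t b, int32_t out[4])
{
    for (int e = 0; e < 4; e++) {
        /* Largest magnitude is (-32768) * (-32768) = 2^30, which fits
           in int32_t, so the widened product is exact. */
        out[e] = (int32_t)a[e] * (int32_t)b;
    }
}
```

This is the main reason to prefer `vmull_*` over `vmul_*` when the inputs may use their full range: the non-widening forms keep only the low `esize` bits of each product.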

int64x2_t vmull_n_s32 (int32x2_t a, int32_t b)Signed multiply long

Description

A64 Instruction

SMULL Vd.2D,Vn.2S,Vm.S[0]

Argument Preparation

a → Vn.2S 

b → Vm.S[0]

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vmull_n_u16 (uint16x4_t a, uint16_t b)Unsigned multiply long

Description

A64 Instruction

UMULL Vd.4S,Vn.4H,Vm.H[0]

Argument Preparation

a → Vn.4H 

b → Vm.H[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vmull_n_u32 (uint32x2_t a, uint32_t b)Unsigned multiply long

Description

A64 Instruction

UMULL Vd.2D,Vn.2S,Vm.S[0]

Argument Preparation

a → Vn.2S 

b → Vm.S[0]

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vmull_high_n_s16 (int16x8_t a, int16_t b)Signed multiply long

Description

A64 Instruction

SMULL2 Vd.4S,Vn.8H,Vm.H[0]

Argument Preparation

a → Vn.8H 

b → Vm.H[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int64x2_t vmull_high_n_s32 (int32x4_t a, int32_t b)Signed multiply long

Description

A64 Instruction

SMULL2 Vd.2D,Vn.4S,Vm.S[0]

Argument Preparation

a → Vn.4S 

b → Vm.S[0]

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint32x4_t vmull_high_n_u16 (uint16x8_t a, uint16_t b)Unsigned multiply long

Description

A64 Instruction

UMULL2 Vd.4S,Vn.8H,Vm.H[0]

Argument Preparation

a → Vn.8H 

b → Vm.H[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

A64
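
The `_high` variants above read the upper half of the 128-bit source (the `2` suffix on `UMULL2`). A portable sketch of that selection, assuming the hypothetical name `model_vmull_high_n_u16` and arrays in place of the vector types:

```c
#include <stdint.h>

/* Hypothetical scalar model of vmull_high_n_u16 (not the intrinsic itself):
   lanes 4..7 of the 8-lane input are widened and multiplied by b. */
static void model_vmull_high_n_u16(const uint16_t a[8], uint16_t b,
                                   uint32_t out[4])
{
    for (int e = 0; e < 4; e++) {
        /* The upper half starts at lane 4; lanes 0..3 are ignored. */
        out[e] = (uint32_t)a[4 + e] * b;
    }
}
```

Pairing `vmull_n_u16(vget_low_u16(a), b)` with `vmull_high_n_u16(a, b)` is one common way to widen all eight lanes on A64.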

uint64x2_t vmull_high_n_u32 (uint32x4_t a, uint32_t b)Unsigned multiply long

Description

A64 Instruction

UMULL2 Vd.2D,Vn.4S,Vm.S[0]

Argument Preparation

a → Vn.4S 

b → Vm.S[0]

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int32x4_t vmull_lane_s16 (int16x4_t a, int16x4_t v, const int lane)Signed multiply long

Description

A64 Instruction

SMULL Vd.4S,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vn.4H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vmull_lane_s32 (int32x2_t a, int32x2_t v, const int lane)Signed multiply long

Description

A64 Instruction

SMULL Vd.2D,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vn.2S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
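
Combining the widening multiply with lane selection, the entry above can be sketched in portable C as follows. The name `model_vmull_lane_s32` is hypothetical, arrays stand in for the vector types, and the `assert` represents the constant-lane constraint.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vmull_lane_s32 (not the intrinsic itself):
   each signed 32-bit lane of a is multiplied by v[lane], giving a
   64-bit product per lane. */
static void model_vmull_lane_s32(const int32_t a[2], const int32_t v[2],
                                 int lane, int64_t out[2])
{
    assert(lane >= 0 && lane <= 1);  /* v has two 32-bit lanes */
    for (int e = 0; e < 2; e++) {
        /* Widen before multiplying so the product cannot overflow. */
        out[e] = (int64_t)a[e] * (int64_t)v[lane];
    }
}
```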

uint32x4_t vmull_lane_u16 (uint16x4_t a, uint16x4_t v, const int lane)Unsigned multiply long

Description

A64 Instruction

UMULL Vd.4S,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vn.4H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vmull_lane_u32 (uint32x2_t a, uint32x2_t v, const int lane)Unsigned multiply long

Description

A64 Instruction

UMULL Vd.2D,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vn.2S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vmull_high_lane_s16 (int16x8_t a, int16x4_t v, const int lane)Signed multiply long

Description

A64 Instruction

SMULL2 Vd.4S,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vn.8H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int64x2_t vmull_high_lane_s32 (int32x4_t a, int32x2_t v, const int lane)Signed multiply long

Description

A64 Instruction

SMULL2 Vd.2D,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vn.4S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

A64
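
The `_high_lane` form above combines both ideas: it takes the upper half of `a` and multiplies it by one selected element of `v`. A portable sketch under the same assumptions as before (hypothetical name `model_vmull_high_lane_s32`, arrays in place of vector types):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vmull_high_lane_s32 (not the intrinsic):
   lanes 2..3 of the 4-lane input a are widened to 64 bits and multiplied
   by the selected lane of the 2-lane vector v. */
static void model_vmull_high_lane_s32(const int32_t a[4], const int32_t v[2],
                                      int lane, int64_t out[2])
{
    assert(lane >= 0 && lane <= 1);  /* v has two 32-bit lanes */
    for (int e = 0; e < 2; e++) {
        /* The upper half of a starts at lane 2. */
        out[e] = (int64_t)a[2 + e] * (int64_t)v[lane];
    }
}
```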

uint32x4_t vmull_high_lane_u16 (uint16x8_t a, uint16x4_t v, const int lane)Unsigned multiply long

Description

A64 Instruction

UMULL2 Vd.4S,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vn.8H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint64x2_t vmull_high_lane_u32 (uint32x4_t a, uint32x2_t v, const int lane)Unsigned multiply long

Description

A64 Instruction

UMULL2 Vd.2D,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vn.4S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int32x4_t vmull_laneq_s16 (int16x4_t a, int16x8_t v, const int lane)Signed multiply long

Description

A64 Instruction

SMULL Vd.4S,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vn.4H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

A64
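
For comparison with the unsigned forms, here is a portable scalar sketch of vmull_laneq_s16, following the instruction form SMULL Vd.4S,Vn.4H,Vm.H[lane]. The operands are interpreted as signed, and the lane index may select any of the eight elements of v. The name model_vmull_laneq_s16 and the array-based signature are assumptions for this example only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vmull_laneq_s16 (not the real intrinsic).
   a models Vn.4H, v models Vm.8H, out models Vd.4S. */
static void model_vmull_laneq_s16(const int16_t a[4], const int16_t v[8],
                                  int lane, int32_t out[4])
{
    assert(lane >= 0 && lane <= 7);   /* documented constraint: 0 <= lane <= 7 */
    for (int e = 0; e < 4; e++) {
        /* Sign-extend both operands to 32 bits before multiplying. */
        out[e] = (int32_t)a[e] * (int32_t)v[lane];
    }
}
```

Even the extreme case (-32768) * (-32768) = 2^30 fits in a 32-bit signed element, which is why the plain multiply-long forms need no saturation.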

int64x2_t vmull_laneq_s32 (int32x2_t a, int32x4_t v, const int lane)Signed multiply long

Description

A64 Instruction

SMULL Vd.2D,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vn.2S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint32x4_t vmull_laneq_u16 (uint16x4_t a, uint16x8_t v, const int lane)Unsigned multiply long

Description

A64 Instruction

UMULL Vd.4S,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vn.4H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint64x2_t vmull_laneq_u32 (uint32x2_t a, uint32x4_t v, const int lane)Unsigned multiply long

Description

A64 Instruction

UMULL Vd.2D,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vn.2S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int32x4_t vmull_high_laneq_s16 (int16x8_t a, int16x8_t v, const int lane)Signed multiply long

Description

A64 Instruction

SMULL2 Vd.4S,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vn.8H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int64x2_t vmull_high_laneq_s32 (int32x4_t a, int32x4_t v, const int lane)Signed multiply long

Description

A64 Instruction

SMULL2 Vd.2D,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vn.4S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint32x4_t vmull_high_laneq_u16 (uint16x8_t a, uint16x8_t v, const int lane)Unsigned multiply long

Description

A64 Instruction

UMULL2 Vd.4S,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vn.8H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

uint64x2_t vmull_high_laneq_u32 (uint32x4_t a, uint32x4_t v, const int lane)Unsigned multiply long

Description

A64 Instruction

UMULL2 Vd.2D,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vn.4S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    Elem[result, e, 2*esize] = (element1*element2)<2*esize-1:0>;

V[d] = result;

Supported architectures

A64

int32x4_t vqdmull_n_s16 (int16x4_t a, int16_t b)Signed saturating doubling multiply long

Description

A64 Instruction

SQDMULL Vd.4S,Vn.4H,Vm.H[0]

Argument Preparation

a → Vn.4H 

b → Vm.H[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize);
    Elem[result, e, 2*esize] = product;
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vqdmull_n_s32 (int32x2_t a, int32_t b)Signed saturating doubling multiply long

Description

A64 Instruction

SQDMULL Vd.2D,Vn.2S,Vm.S[0]

Argument Preparation

a → Vn.2S 

b → Vm.S[0]

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize);
    Elem[result, e, 2*esize] = product;
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vqdmull_high_n_s16 (int16x8_t a, int16_t b)Signed saturating doubling multiply long

Description

A64 Instruction

SQDMULL2 Vd.4S,Vn.8H,Vm.H[0]

Argument Preparation

a → Vn.8H 

b → Vm.H[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize);
    Elem[result, e, 2*esize] = product;
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64x2_t vqdmull_high_n_s32 (int32x4_t a, int32_t b)Signed saturating doubling multiply long

Description

A64 Instruction

SQDMULL2 Vd.2D,Vn.4S,Vm.S[0]

Argument Preparation

a → Vn.4S 

b → Vm.S[0]

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize);
    Elem[result, e, 2*esize] = product;
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32x4_t vqdmull_lane_s16 (int16x4_t a, int16x4_t v, const int lane)Signed saturating doubling multiply long

Description

A64 Instruction

SQDMULL Vd.4S,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vn.4H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize);
    Elem[result, e, 2*esize] = product;
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
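
The saturation step in the pseudocode above can be sketched in portable C. This scalar model of vqdmull_lane_s16 doubles each widened product and clamps it to the 32-bit signed range, returning a flag where the architecture would set FPSR.QC. The name model_vqdmull_lane_s16, the array-based signature and the returned flag are assumptions for this example, not part of the intrinsic's interface.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vqdmull_lane_s16 (not the real intrinsic).
   Returns 1 if any element saturated, mirroring "if sat then FPSR.QC = '1'". */
static int model_vqdmull_lane_s16(const int16_t a[4], const int16_t v[4],
                                  int lane, int32_t out[4])
{
    int sat = 0;
    assert(lane >= 0 && lane <= 3);   /* documented constraint: 0 <= lane <= 3 */
    for (int e = 0; e < 4; e++) {
        /* Compute 2 * element1 * element2 in 64 bits so it cannot overflow. */
        int64_t doubled = 2 * (int64_t)a[e] * (int64_t)v[lane];
        /* SignedSatQ(doubled, 32): clamp to the int32_t range. */
        if (doubled > INT32_MAX) {
            doubled = INT32_MAX;
            sat = 1;
        } else if (doubled < INT32_MIN) {
            doubled = INT32_MIN;
            sat = 1;
        }
        out[e] = (int32_t)doubled;
    }
    return sat;
}
```

With 16-bit inputs the only saturating case is (-32768) * (-32768): doubling 2^30 gives 2^31, one more than INT32_MAX. The lower clamp is kept only to mirror SignedSatQ.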

int64x2_t vqdmull_lane_s32 (int32x2_t a, int32x2_t v, const int lane)Signed saturating doubling multiply long

Description

A64 Instruction

SQDMULL Vd.2D,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vn.2S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize);
    Elem[result, e, 2*esize] = product;
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32_t vqdmullh_lane_s16 (int16_t a, int16x4_t v, const int lane)Signed saturating doubling multiply long

Description

A64 Instruction

SQDMULL Sd,Hn,Vm.H[lane]

Argument Preparation

a → Hn 

v → Vm.4H 

0 <= lane <= 3

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize);
    Elem[result, e, 2*esize] = product;
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64_t vqdmulls_lane_s32 (int32_t a, int32x2_t v, const int lane)Signed saturating doubling multiply long

Description

A64 Instruction

SQDMULL Dd,Sn,Vm.S[lane]

Argument Preparation

a → Sn 

v → Vm.2S 

0 <= lane <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize);
    Elem[result, e, 2*esize] = product;
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64
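
For the scalar 32-bit form, the doubled product needs 64-bit saturation, which takes a little care in C because int64_t itself can overflow. The following sketch models vqdmulls_lane_s32 under that constraint. The name model_vqdmulls_lane_s32 and the sat output parameter (standing in for FPSR.QC) are assumptions for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vqdmulls_lane_s32 (not the real intrinsic).
   a models Sn, v models Vm.2S; *sat is set to 1 when the result saturates. */
static int64_t model_vqdmulls_lane_s32(int32_t a, const int32_t v[2],
                                       int lane, int *sat)
{
    assert(lane >= 0 && lane <= 1);   /* documented constraint: 0 <= lane <= 1 */
    int64_t product = (int64_t)a * (int64_t)v[lane];   /* |product| <= 2^62 */
    /* Doubling overflows int64_t only when product == 2^62, which happens
       only for INT32_MIN * INT32_MIN. Check before doubling. */
    if (product > INT64_MAX / 2) {
        *sat = 1;
        return INT64_MAX;
    }
    *sat = 0;
    return 2 * product;
}
```

The most negative doubled product, 2 * INT32_MIN * INT32_MAX, is still above INT64_MIN, so no lower clamp is needed in this model.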

int32x4_t vqdmull_high_lane_s16 (int16x8_t a, int16x4_t v, const int lane)Signed saturating doubling multiply long

Description

A64 Instruction

SQDMULL2 Vd.4S,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vn.8H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize);
    Elem[result, e, 2*esize] = product;
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64x2_t vqdmull_high_lane_s32 (int32x4_t a, int32x2_t v, const int lane)Signed saturating doubling multiply long

Description

A64 Instruction

SQDMULL2 Vd.2D,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vn.4S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize);
    Elem[result, e, 2*esize] = product;
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32x4_t vqdmull_laneq_s16 (int16x4_t a, int16x8_t v, const int lane)Signed saturating doubling multiply long

Description

A64 Instruction

SQDMULL Vd.4S,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vn.4H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize);
    Elem[result, e, 2*esize] = product;
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64x2_t vqdmull_laneq_s32 (int32x2_t a, int32x4_t v, const int lane)Signed saturating doubling multiply long

Description

A64 Instruction

SQDMULL Vd.2D,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vn.2S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize);
    Elem[result, e, 2*esize] = product;
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32_t vqdmullh_laneq_s16 (int16_t a, int16x8_t v, const int lane)Signed saturating doubling multiply long

Description

A64 Instruction

SQDMULL Sd,Hn,Vm.H[lane]

Argument Preparation

a → Hn 

v → Vm.8H 

0 <= lane <= 7

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize);
    Elem[result, e, 2*esize] = product;
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64_t vqdmulls_laneq_s32 (int32_t a, int32x4_t v, const int lane)Signed saturating doubling multiply long

Description

A64 Instruction

SQDMULL Dd,Sn,Vm.S[lane]

Argument Preparation

a → Sn 

v → Vm.4S 

0 <= lane <= 3

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize);
    Elem[result, e, 2*esize] = product;
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32x4_t vqdmull_high_laneq_s16 (int16x8_t a, int16x8_t v, const int lane)Signed saturating doubling multiply long

Description

A64 Instruction

SQDMULL2 Vd.4S,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vn.8H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize);
    Elem[result, e, 2*esize] = product;
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64x2_t vqdmull_high_laneq_s32 (int32x4_t a, int32x4_t v, const int lane)Signed saturating doubling multiply long

Description

A64 Instruction

SQDMULL2 Vd.2D,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vn.4S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat) = SignedSatQ(2 * element1 * element2, 2 * esize);
    Elem[result, e, 2*esize] = product;
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int16x4_t vqdmulh_n_s16 (int16x4_t a, int16_t b)Signed saturating doubling multiply returning high half

Description

A64 Instruction

SQDMULH Vd.4H,Vn.4H,Vm.H[0]

Argument Preparation

a → Vn.4H 

b → Vm.H[0]

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vqdmulhq_n_s16 (int16x8_t a, int16_t b)Signed saturating doubling multiply returning high half

Description

A64 Instruction

SQDMULH Vd.8H,Vn.8H,Vm.H[0]

Argument Preparation

a → Vn.8H 

b → Vm.H[0]

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vqdmulh_n_s32 (int32x2_t a, int32_t b)Signed saturating doubling multiply returning high half

Description

A64 Instruction

SQDMULH Vd.2S,Vn.2S,Vm.S[0]

Argument Preparation

a → Vn.2S 

b → Vm.S[0]

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vqdmulhq_n_s32 (int32x4_t a, int32_t b)Signed saturating doubling multiply returning high half

Description

A64 Instruction

SQDMULH Vd.4S,Vn.4S,Vm.S[0]

Argument Preparation

a → Vn.4S 

b → Vm.S[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vqdmulh_lane_s16 (int16x4_t a, int16x4_t v, const int lane)Signed saturating doubling multiply returning high half

Description

A64 Instruction

SQDMULH Vd.4H,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vn.4H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
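
The "returning high half" step can be sketched in portable C as follows. This scalar model of vqdmulh_lane_s16 takes the non-rounding path of the pseudocode above (round_const = 0): it doubles the product, keeps the upper 16 bits with floor semantics, and saturates. The name model_vqdmulh_lane_s16, the array-based signature and the returned flag (standing in for FPSR.QC) are assumptions for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vqdmulh_lane_s16 (not the real intrinsic).
   Returns 1 if any element saturated. */
static int model_vqdmulh_lane_s16(const int16_t a[4], const int16_t v[4],
                                  int lane, int16_t out[4])
{
    int sat = 0;
    assert(lane >= 0 && lane <= 3);   /* documented constraint: 0 <= lane <= 3 */
    for (int e = 0; e < 4; e++) {
        int64_t product = 2 * (int64_t)a[e] * (int64_t)v[lane];
        /* product >> esize on an unbounded integer is a floor division by
           2^16. C division truncates toward zero, so adjust negatives. */
        int64_t high = product / 65536;
        if (product % 65536 < 0) {
            high -= 1;
        }
        /* SignedSatQ(high, 16) */
        if (high > INT16_MAX) {
            high = INT16_MAX;
            sat = 1;
        } else if (high < INT16_MIN) {
            high = INT16_MIN;
            sat = 1;
        }
        out[e] = (int16_t)high;
    }
    return sat;
}
```

Read as Q15 fixed point, this is a fractional multiply: 0.5 * 0.5 (16384 * 16384) yields 0.25 (8192), and only (-1.0) * (-1.0) saturates.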

int16x8_t vqdmulhq_lane_s16 (int16x8_t a, int16x4_t v, const int lane)Signed saturating doubling multiply returning high half

Description

A64 Instruction

SQDMULH Vd.8H,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vn.8H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vqdmulh_lane_s32 (int32x2_t a, int32x2_t v, const int lane)Signed saturating doubling multiply returning high half

Description

A64 Instruction

SQDMULH Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vn.2S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vqdmulhq_lane_s32 (int32x4_t a, int32x2_t v, const int lane)Signed saturating doubling multiply returning high half

Description

A64 Instruction

SQDMULH Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vn.4S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int16_t vqdmulhh_lane_s16 (int16_t a, int16x4_t v, const int lane)Signed saturating doubling multiply returning high half

Description

A64 Instruction

SQDMULH Hd,Hn,Vm.H[lane]

Argument Preparation

a → Hn 

v → Vm.4H 

0 <= lane <= 3

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32_t vqdmulhs_lane_s32 (int32_t a, int32x2_t v, const int lane)Signed saturating doubling multiply returning high half

Description

A64 Instruction

SQDMULH Sd,Sn,Vm.S[lane]

Argument Preparation

a → Sn 

v → Vm.2S 

0 <= lane <= 1

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int16x4_t vqdmulh_laneq_s16 (int16x4_t a, int16x8_t v, const int lane)Signed saturating doubling multiply returning high half

Description

A64 Instruction

SQDMULH Vd.4H,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vn.4H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int16x8_t vqdmulhq_laneq_s16 (int16x8_t a, int16x8_t v, const int lane)Signed saturating doubling multiply returning high half

Description

A64 Instruction

SQDMULH Vd.8H,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vn.8H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32x2_t vqdmulh_laneq_s32 (int32x2_t a, int32x4_t v, const int lane)Signed saturating doubling multiply returning high half

Description

A64 Instruction

SQDMULH Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vn.2S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32x4_t vqdmulhq_laneq_s32 (int32x4_t a, int32x4_t v, const int lane)Signed saturating doubling multiply returning high half

Description

A64 Instruction

SQDMULH Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vn.4S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int16_t vqdmulhh_laneq_s16 (int16_t a, int16x8_t v, const int lane)Signed saturating doubling multiply returning high half

Description

A64 Instruction

SQDMULH Hd,Hn,Vm.H[lane]

Argument Preparation

a → Hn 

v → Vm.8H 

0 <= lane <= 7

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32_t vqdmulhs_laneq_s32 (int32_t a, int32x4_t v, const int lane)Signed saturating doubling multiply returning high half

Description

A64 Instruction

SQDMULH Sd,Sn,Vm.S[lane]

Argument Preparation

a → Sn 

v → Vm.4S 

0 <= lane <= 3

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int16x4_t vqrdmulh_n_s16 (int16x4_t a, int16_t b)Signed saturating rounding doubling multiply returning high half

Description

A64 Instruction

SQRDMULH Vd.4H,Vn.4H,Vm.H[0]

Argument Preparation

a → Vn.4H 

b → Vm.H[0]

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
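
The effect of the rounding constant in the pseudocode above can be seen in a portable scalar sketch. This model of vqrdmulh_n_s16 takes the rounding path (round_const = 1 << (esize - 1), with esize = 16) before keeping the high half. The name model_vqrdmulh_n_s16, the array-based signature and the returned flag (standing in for FPSR.QC) are assumptions for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vqrdmulh_n_s16 (not the real intrinsic).
   Every element of a is multiplied by the same scalar b.
   Returns 1 if any element saturated. */
static int model_vqrdmulh_n_s16(const int16_t a[4], int16_t b, int16_t out[4])
{
    int sat = 0;
    assert(a && out);   /* both arrays must be valid in this model */
    for (int e = 0; e < 4; e++) {
        /* Doubled product plus the rounding constant 1 << 15. */
        int64_t product = 2 * (int64_t)a[e] * (int64_t)b + (1 << 15);
        /* Floor division by 2^16, matching >> on an unbounded integer. */
        int64_t high = product / 65536;
        if (product % 65536 < 0) {
            high -= 1;
        }
        /* SignedSatQ(high, 16) */
        if (high > INT16_MAX) {
            high = INT16_MAX;
            sat = 1;
        } else if (high < INT16_MIN) {
            high = INT16_MIN;
            sat = 1;
        }
        out[e] = (int16_t)high;
    }
    return sat;
}
```

For example, with b = 16384 an input of 3 gives 2 here (1.5 rounded up), whereas the non-rounding SQDMULH form would truncate the same product to 1.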

int16x8_t vqrdmulhq_n_s16 (int16x8_t a, int16_t b)Signed saturating rounding doubling multiply returning high half

Description

A64 Instruction

SQRDMULH Vd.8H,Vn.8H,Vm.H[0]

Argument Preparation

a → Vn.8H 

b → Vm.H[0]

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vqrdmulh_n_s32 (int32x2_t a, int32_t b)Signed saturating rounding doubling multiply returning high half

Description

A64 Instruction

SQRDMULH Vd.2S,Vn.2S,Vm.S[0]

Argument Preparation

a → Vn.2S 

b → Vm.S[0]

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vqrdmulhq_n_s32 (int32x4_t a, int32_t b)Signed saturating rounding doubling multiply returning high half

Description

A64 Instruction

SQRDMULH Vd.4S,Vn.4S,Vm.S[0]

Argument Preparation

a → Vn.4S 

b → Vm.S[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vqrdmulh_lane_s16 (int16x4_t a, int16x4_t v, const int lane)Signed saturating rounding doubling multiply returning high half

Description

A64 Instruction

SQRDMULH Vd.4H,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vn.4H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vqrdmulhq_lane_s16 (int16x8_t a, int16x4_t v, const int lane)Signed saturating rounding doubling multiply returning high half

Description

A64 Instruction

SQRDMULH Vd.8H,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vn.8H 

v → Vm.4H 

0 <= lane <= 3

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vqrdmulh_lane_s32 (int32x2_t a, int32x2_t v, const int lane)Signed saturating rounding doubling multiply returning high half

Description

A64 Instruction

SQRDMULH Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vn.2S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
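
For 32-bit elements the pseudocode's `product` can reach 2^63 + 2^31, which does not fit in a 64-bit C integer. One way to model it, sketched here under the assumption that halving both the dividend and the divisor is acceptable, is to compute (a*b + 2^30) >> 31 instead of (2*a*b + 2^31) >> 32. The function names and the array representation of Vn and Vm are hypothetical.

```c
#include <stdint.h>

/* Floor division by 2^n, avoiding implementation-defined >> on negative values. */
static int64_t floor_shr(int64_t x, unsigned n)
{
    int64_t d = (int64_t)1 << n;
    int64_t q = x / d;
    return (x % d < 0) ? q - 1 : q;
}

/* Hypothetical model: a[] and v[] stand in for the two 32-bit lanes of Vn and Vm.
   Returns 1 if any lane saturated (standing in for FPSR.QC). */
static int model_vqrdmulh_lane_s32(int32_t result[2], const int32_t a[2],
                                   const int32_t v[2], int lane)
{
    int qc = 0;
    for (int e = 0; e < 2; e++) {
        /* a*b + 2^30 is at most 2^62 + 2^30, so it cannot overflow int64_t. */
        int64_t sum = (int64_t)a[e] * (int64_t)v[lane] + ((int64_t)1 << 30);
        int64_t high = floor_shr(sum, 31);
        /* Only INT32_MIN * INT32_MIN exceeds the range, and only upwards. */
        if (high > INT32_MAX) { high = INT32_MAX; qc = 1; }
        result[e] = (int32_t)high;
    }
    return qc;
}
```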

int32x4_t vqrdmulhq_lane_s32 (int32x4_t a, int32x2_t v, const int lane)Signed saturating rounding doubling multiply returning high half

Description

A64 Instruction

SQRDMULH Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vn.4S 

v → Vm.2S 

0 <= lane <= 1

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int16_t vqrdmulhh_lane_s16 (int16_t a, int16x4_t v, const int lane)Signed saturating rounding doubling multiply returning high half

Description

A64 Instruction

SQRDMULH Hd,Hn,Vm.H[lane]

Argument Preparation

a → Hn 

v → Vm.4H 

0 <= lane <= 3

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32_t vqrdmulhs_lane_s32 (int32_t a, int32x2_t v, const int lane)Signed saturating rounding doubling multiply returning high half

Description

A64 Instruction

SQRDMULH Sd,Sn,Vm.S[lane]

Argument Preparation

a → Sn 

v → Vm.2S 

0 <= lane <= 1

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int16x4_t vqrdmulh_laneq_s16 (int16x4_t a, int16x8_t v, const int lane)Signed saturating rounding doubling multiply returning high half

Description

A64 Instruction

SQRDMULH Vd.4H,Vn.4H,Vm.H[lane]

Argument Preparation

a → Vn.4H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64
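
The lane bound listed under Argument Preparation depends on the type of `v`, not of `a`: a 64-bit `v` (the `_lane` forms) holds half as many elements as a 128-bit `v` (the `_laneq` forms). The helpers below are hypothetical and only restate that arithmetic; the intrinsics themselves take `lane` as a constant, so a runtime check like this is purely illustrative.

```c
/* Hypothetical helper: highest valid lane index for a vector that is
   vector_bits wide and holds elements of element_bits each. */
static int max_lane(int vector_bits, int element_bits)
{
    return vector_bits / element_bits - 1;
}

/* Returns 1 if lane is a valid index into such a vector, otherwise 0. */
static int lane_in_range(int lane, int vector_bits, int element_bits)
{
    return lane >= 0 && lane <= max_lane(vector_bits, element_bits);
}
```

For example, `max_lane(128, 16)` is 7 for an `int16x8_t` and `max_lane(64, 32)` is 1 for an `int32x2_t`.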

int16x8_t vqrdmulhq_laneq_s16 (int16x8_t a, int16x8_t v, const int lane)Signed saturating rounding doubling multiply returning high half

Description

A64 Instruction

SQRDMULH Vd.8H,Vn.8H,Vm.H[lane]

Argument Preparation

a → Vn.8H 

v → Vm.8H 

0 <= lane <= 7

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32x2_t vqrdmulh_laneq_s32 (int32x2_t a, int32x4_t v, const int lane)Signed saturating rounding doubling multiply returning high half

Description

A64 Instruction

SQRDMULH Vd.2S,Vn.2S,Vm.S[lane]

Argument Preparation

a → Vn.2S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32x4_t vqrdmulhq_laneq_s32 (int32x4_t a, int32x4_t v, const int lane)Signed saturating rounding doubling multiply returning high half

Description

A64 Instruction

SQRDMULH Vd.4S,Vn.4S,Vm.S[lane]

Argument Preparation

a → Vn.4S 

v → Vm.4S 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int16_t vqrdmulhh_laneq_s16 (int16_t a, int16x8_t v, const int lane)Signed saturating rounding doubling multiply returning high half

Description

A64 Instruction

SQRDMULH Hd,Hn,Vm.H[lane]

Argument Preparation

a → Hn 

v → Vm.8H 

0 <= lane <= 7

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32_t vqrdmulhs_laneq_s32 (int32_t a, int32x4_t v, const int lane)Signed saturating rounding doubling multiply returning high half

Description

A64 Instruction

SQRDMULH Sd,Sn,Vm.S[lane]

Argument Preparation

a → Sn 

v → Vm.4S 

0 <= lane <= 3

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
integer round_const = if rounding then 1 << (esize - 1) else 0;
integer element1;
integer element2;
integer product;
boolean sat;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    product = (2 * element1 * element2) + round_const;
    (Elem[result, e, esize], sat) = SignedSatQ(product >> esize, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int16x4_t vmla_n_s16 (int16x4_t a, int16x4_t b, int16_t c)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.4H,Vn.4H,Vm.H[0]

Argument Preparation

a → Vd.4H 

b → Vn.4H 

c → Vm.H[0]

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64
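
The MLA pseudocode keeps only the low esize bits of the product and of the sum, so the result wraps rather than saturates, and the same bit pattern is produced whether the lanes are read as signed or unsigned. A small sketch on raw 16-bit lanes, with a hypothetical function name and arrays standing in for the registers:

```c
#include <stdint.h>

/* Hypothetical model of MLA Vd.4H,Vn.4H,Vm.H[0] on raw 16-bit lanes. */
static void model_mla_n_16(uint16_t result[4], const uint16_t a[4],
                           const uint16_t b[4], uint16_t c)
{
    for (int e = 0; e < 4; e++) {
        /* Widen before multiplying: uint16_t operands would otherwise promote
           to int, where 65535 * 65535 overflows. */
        uint32_t product = (uint32_t)b[e] * (uint32_t)c;
        /* Keep only the low 16 bits of the sum. */
        result[e] = (uint16_t)(a[e] + product);
    }
}
```

With b = 65535 (the bit pattern of -1) and c = 3, the lane result is 65533, which is the bit pattern of -3.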

int16x8_t vmlaq_n_s16 (int16x8_t a, int16x8_t b, int16_t c)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.8H,Vn.8H,Vm.H[0]

Argument Preparation

a → Vd.8H 

b → Vn.8H 

c → Vm.H[0]

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vmla_n_s32 (int32x2_t a, int32x2_t b, int32_t c)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.2S,Vn.2S,Vm.S[0]

Argument Preparation

a → Vd.2S 

b → Vn.2S 

c → Vm.S[0]

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vmlaq_n_s32 (int32x4_t a, int32x4_t b, int32_t c)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.4S,Vn.4S,Vm.S[0]

Argument Preparation

a → Vd.4S 

b → Vn.4S 

c → Vm.S[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vmla_n_u16 (uint16x4_t a, uint16x4_t b, uint16_t c)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.4H,Vn.4H,Vm.H[0]

Argument Preparation

a → Vd.4H 

b → Vn.4H 

c → Vm.H[0]

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vmlaq_n_u16 (uint16x8_t a, uint16x8_t b, uint16_t c)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.8H,Vn.8H,Vm.H[0]

Argument Preparation

a → Vd.8H 

b → Vn.8H 

c → Vm.H[0]

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vmla_n_u32 (uint32x2_t a, uint32x2_t b, uint32_t c)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.2S,Vn.2S,Vm.S[0]

Argument Preparation

a → Vd.2S 

b → Vn.2S 

c → Vm.S[0]

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vmlaq_n_u32 (uint32x4_t a, uint32x4_t b, uint32_t c)Multiply-add to accumulator

Description

A64 Instruction

MLA Vd.4S,Vn.4S,Vm.S[0]

Argument Preparation

a → Vd.4S 

b → Vn.4S 

c → Vm.S[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vmla_n_f32 (float32x2_t a, float32x2_t b, float32_t c)Undefined

Description

A64 Instruction

RESULT[i] = a[i] + (b[i] * c) for i = 0 to 1

Argument Preparation

a → N/A 

b → N/A 

c → N/A

Results

N/A → result

Supported architectures

v7/A32/A64

float32x4_t vmlaq_n_f32 (float32x4_t a, float32x4_t b, float32_t c)Undefined

Description

A64 Instruction

RESULT[i] = a[i] + (b[i] * c) for i = 0 to 3

Argument Preparation

a → N/A 

b → N/A 

c → N/A

Results

N/A → result

Supported architectures

v7/A32/A64
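
The two floating-point entries above give only a per-lane formula. A direct transcription in C, assuming a hypothetical `model_vmla_n_f32` with an explicit lane count `n` (2 for the 64-bit form, 4 for the q form), could look as follows. Whether the multiply and the add are rounded separately or fused is not stated by the formula, and this sketch does not try to settle it.

```c
#include <stddef.h>

/* Hypothetical model: n is 2 for float32x2_t and 4 for float32x4_t. */
static void model_vmla_n_f32(float *result, const float *a, const float *b,
                             float c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        result[i] = a[i] + (b[i] * c);
    }
}
```

For a = {1.0, 2.0}, b = {3.0, -4.0} and c = 0.5, every intermediate value is exactly representable and the result is {2.5, 0.0}.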

int32x4_t vmlal_n_s16 (int32x4_t a, int16x4_t b, int16_t c)Signed multiply-add long

Description

A64 Instruction

SMLAL Vd.4S,Vn.4H,Vm.H[0]

Argument Preparation

a → Vd.4S 

b → Vn.4H 

c → Vm.H[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64
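
In the long form each 16-bit operand is extended before the multiply, so the product is exact in 2*esize bits and only the accumulation can wrap. A minimal sketch of the signed case, with a hypothetical function name and arrays standing in for Vd and Vn:

```c
#include <stdint.h>

/* Hypothetical model of SMLAL Vd.4S,Vn.4H,Vm.H[0]. */
static void model_vmlal_n_s16(int32_t result[4], const int32_t a[4],
                              const int16_t b[4], int16_t c)
{
    for (int e = 0; e < 4; e++) {
        /* Sign-extend both 16-bit operands; the 32-bit product is exact. */
        int32_t product = (int32_t)b[e] * (int32_t)c;
        /* Accumulate modulo 2^32, as the pseudocode does on bits(2*esize).
           Unsigned arithmetic avoids signed-overflow UB; converting back to
           int32_t is assumed to wrap (true of GCC and Clang). */
        result[e] = (int32_t)((uint32_t)a[e] + (uint32_t)product);
    }
}
```

The product -32768 * -32768 = 1073741824 cannot be held in a 16-bit lane but is represented exactly here.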

int64x2_t vmlal_n_s32 (int64x2_t a, int32x2_t b, int32_t c)Signed multiply-add long

Description

A64 Instruction

SMLAL Vd.2D,Vn.2S,Vm.S[0]

Argument Preparation

a → Vd.2D 

b → Vn.2S 

c → Vm.S[0]

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vmlal_n_u16 (uint32x4_t a, uint16x4_t b, uint16_t c)Unsigned multiply-add long

Description

A64 Instruction

UMLAL Vd.4S,Vn.4H,Vm.H[0]

Argument Preparation

a → Vd.4S 

b → Vn.4H 

c → Vm.H[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64
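
The signed and unsigned long forms share one pseudocode body and differ only in how `Int(x, unsigned)` reads each element. The sketch below, with hypothetical names and an `is_unsigned` parameter that mirrors the pseudocode's `unsigned`, shows the same bit patterns giving different accumulations for one lane.

```c
#include <stdint.h>

/* Hypothetical counterpart of the pseudocode's Int(bits, unsigned) for esize = 16. */
static int64_t int_of_bits16(uint16_t bits, int is_unsigned)
{
    if (is_unsigned || bits < 0x8000) {
        return (int64_t)bits;
    }
    return (int64_t)bits - 0x10000; /* two's-complement value of the bit pattern */
}

/* One lane of a widening multiply-add: UMLAL when is_unsigned = 1, SMLAL when 0.
   The return value is the 32-bit pattern of the accumulator. */
static uint32_t model_mlal_lane(uint32_t acc, uint16_t b, uint16_t c, int is_unsigned)
{
    int64_t product = int_of_bits16(b, is_unsigned) * int_of_bits16(c, is_unsigned);
    return (uint32_t)((int64_t)acc + product); /* reduced modulo 2^32 */
}
```

With both operands 0xFFFF and a zero accumulator, the unsigned reading gives 65535 * 65535 = 4294836225, while the signed reading gives (-1) * (-1) = 1.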

uint64x2_t vmlal_n_u32 (uint64x2_t a, uint32x2_t b, uint32_t c)Unsigned multiply-add long

Description

A64 Instruction

UMLAL Vd.2D,Vn.2S,Vm.S[0]

Argument Preparation

a → Vd.2D 

b → Vn.2S 

c → Vm.S[0]

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vmlal_high_n_s16 (int32x4_t a, int16x8_t b, int16_t c)Signed multiply-add long

Description

A64 Instruction

SMLAL2 Vd.4S,Vn.8H,Vm.H[0]

Argument Preparation

a → Vd.4S 

b → Vn.8H 

c → Vm.H[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64
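
The `_high` variants map to the `2` form of the instruction, where `Vpart[n, part]` selects the upper 64 bits of Vn. In terms of lanes, only elements 4 to 7 of `b` take part. A sketch under that reading, with a hypothetical function name:

```c
#include <stdint.h>

/* Hypothetical model of SMLAL2 Vd.4S,Vn.8H,Vm.H[0]: only lanes 4..7 of b are used. */
static void model_vmlal_high_n_s16(int32_t result[4], const int32_t a[4],
                                   const int16_t b[8], int16_t c)
{
    const int16_t *upper = &b[4]; /* the upper half of Vn */
    for (int e = 0; e < 4; e++) {
        int32_t product = (int32_t)upper[e] * (int32_t)c;
        /* Accumulate modulo 2^32; conversion back to int32_t is assumed to wrap. */
        result[e] = (int32_t)((uint32_t)a[e] + (uint32_t)product);
    }
}
```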

int64x2_t vmlal_high_n_s32 (int64x2_t a, int32x4_t b, int32_t c)Signed multiply-add long

Description

A64 Instruction

SMLAL2 Vd.2D,Vn.4S,Vm.S[0]

Argument Preparation

a → Vd.2D 

b → Vn.4S 

c → Vm.S[0]

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

uint32x4_t vmlal_high_n_u16 (uint32x4_t a, uint16x8_t b, uint16_t c)Unsigned multiply-add long

Description

A64 Instruction

UMLAL2 Vd.4S,Vn.8H,Vm.H[0]

Argument Preparation

a → Vd.4S 

b → Vn.8H 

c → Vm.H[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

uint64x2_t vmlal_high_n_u32 (uint64x2_t a, uint32x4_t b, uint32_t c)Unsigned multiply-add long

Description

A64 Instruction

UMLAL2 Vd.2D,Vn.4S,Vm.S[0]

Argument Preparation

a → Vd.2D 

b → Vn.4S 

c → Vm.S[0]

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

int32x4_t vqdmlal_n_s16 (int32x4_t a, int16x4_t b, int16_t c)Signed saturating doubling multiply-add long

Description

A64 Instruction

SQDMLAL Vd.4S,Vn.4H,Vm.H[0]

Argument Preparation

a → Vd.4S 

b → Vn.4H 

c → Vm.H[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
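
The pseudocode above saturates twice: once on the doubled product (`sat1`) and once on the accumulation (`sat2`). A single-lane sketch for esize = 16 follows; the helper names and the `qc` out-parameter standing in for FPSR.QC are assumptions of this example.

```c
#include <stdint.h>

/* Clamp to the int32_t range, setting *sat when clamping happens. */
static int32_t signed_sat32(int64_t x, int *sat)
{
    if (x > INT32_MAX) { *sat = 1; return INT32_MAX; }
    if (x < INT32_MIN) { *sat = 1; return INT32_MIN; }
    return (int32_t)x;
}

/* Hypothetical single-lane model of SQDMLAL (esize = 16). */
static int32_t model_sqdmlal_lane_s16(int32_t acc, int16_t b, int16_t c, int *qc)
{
    int sat1 = 0, sat2 = 0;
    /* First saturation: the doubled product overflows only for -32768 * -32768. */
    int32_t product = signed_sat32(2 * (int64_t)b * (int64_t)c, &sat1);
    /* Second saturation: the accumulation. */
    int32_t result = signed_sat32((int64_t)acc + (int64_t)product, &sat2);
    if (sat1 || sat2) *qc = 1;
    return result;
}
```

Because the product is clamped before it is added, an accumulator of -1 with b = c = -32768 yields 2147483646 rather than 2147483647, and the flag is set even though the final sum is in range.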

int64x2_t vqdmlal_n_s32 (int64x2_t a, int32x2_t b, int32_t c)Signed saturating doubling multiply-add long

Description

A64 Instruction

SQDMLAL Vd.2D,Vn.2S,Vm.S[0]

Argument Preparation

a → Vd.2D 

b → Vn.2S 

c → Vm.S[0]

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
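
With 32-bit inputs the accumulator is 64 bits wide, so a C model has no wider standard integer type to clamp from. The sketch below checks for overflow before adding instead; `sat_add_s64` and `model_sqdmlal_lane_s32` are hypothetical names, and `qc` again stands in for FPSR.QC.

```c
#include <stdint.h>

/* Saturating 64-bit add that tests the bounds before adding. */
static int64_t sat_add_s64(int64_t x, int64_t y, int *qc)
{
    if (y > 0 && x > INT64_MAX - y) { *qc = 1; return INT64_MAX; }
    if (y < 0 && x < INT64_MIN - y) { *qc = 1; return INT64_MIN; }
    return x + y;
}

/* Hypothetical single-lane model of SQDMLAL (esize = 32). */
static int64_t model_sqdmlal_lane_s32(int64_t acc, int32_t b, int32_t c, int *qc)
{
    int64_t product;
    if (b == INT32_MIN && c == INT32_MIN) {
        /* 2 * 2^62 = 2^63 is the only doubled product outside the int64_t range. */
        product = INT64_MAX;
        *qc = 1;
    } else {
        product = 2 * ((int64_t)b * (int64_t)c);
    }
    return sat_add_s64(acc, product, qc);
}
```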

int32x4_t vqdmlal_high_n_s16 (int32x4_t a, int16x8_t b, int16_t c)Signed saturating doubling multiply-add long

Description

A64 Instruction

SQDMLAL2 Vd.4S,Vn.8H,Vm.H[0]

Argument Preparation

a → Vd.4S 

b → Vn.8H 

c → Vm.H[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64x2_t vqdmlal_high_n_s32 (int64x2_t a, int32x4_t b, int32_t c)Signed saturating doubling multiply-add long

Description

A64 Instruction

SQDMLAL2 Vd.2D,Vn.4S,Vm.S[0]

Argument Preparation

a → Vd.2D 

b → Vn.4S 

c → Vm.S[0]

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int16x4_t vmls_n_s16 (int16x4_t a, int16x4_t b, int16_t c)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.4H,Vn.4H,Vm.H[0]

Argument Preparation

a → Vd.4H 

b → Vn.4H 

c → Vm.H[0]

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64
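
In the pseudocode above the product is truncated to esize bits and the subtraction wraps modulo 2^esize, which is why the same UInt-based code serves both signed and unsigned element types. A scalar sketch for the 4 x 16-bit signed form is shown below. The names wrap_to_s16 and model_vmls_n_s16 are hypothetical, plain arrays stand in for vector registers, and int is assumed to be at least 32 bits wide.

```c
#include <stdint.h>

/* Hypothetical helper: reinterpret a 16-bit pattern as a signed value
   without relying on implementation-defined narrowing. */
static int16_t wrap_to_s16(uint16_t bits)
{
    return bits > INT16_MAX ? (int16_t)(bits - 65536) : (int16_t)bits;
}

/* Scalar model: a[e] = a[e] - (b[e] * c), every step kept to 16 bits. */
void model_vmls_n_s16(int16_t a[4], const int16_t b[4], int16_t c)
{
    for (int e = 0; e < 4; e++) {
        /* Multiply the bit patterns in 32-bit unsigned arithmetic, keep the low 16 bits. */
        uint16_t product = (uint16_t)((uint32_t)(uint16_t)b[e] * (uint32_t)(uint16_t)c);
        uint16_t diff = (uint16_t)((uint16_t)a[e] - product);
        a[e] = wrap_to_s16(diff);
    }
}
```

For example, a lane holding -32768 with b = 1 and c = 10 wraps to 32758 rather than saturating.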

int16x8_t vmlsq_n_s16 (int16x8_t a, int16x8_t b, int16_t c)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.8H,Vn.8H,Vm.H[0]

Argument Preparation

a → Vd.8H 

b → Vn.8H 

c → Vm.H[0]

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vmls_n_s32 (int32x2_t a, int32x2_t b, int32_t c)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.2S,Vn.2S,Vm.S[0]

Argument Preparation

a → Vd.2S 

b → Vn.2S 

c → Vm.S[0]

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vmlsq_n_s32 (int32x4_t a, int32x4_t b, int32_t c)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.4S,Vn.4S,Vm.S[0]

Argument Preparation

a → Vd.4S 

b → Vn.4S 

c → Vm.S[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vmls_n_u16 (uint16x4_t a, uint16x4_t b, uint16_t c)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.4H,Vn.4H,Vm.H[0]

Argument Preparation

a → Vd.4H 

b → Vn.4H 

c → Vm.H[0]

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vmlsq_n_u16 (uint16x8_t a, uint16x8_t b, uint16_t c)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.8H,Vn.8H,Vm.H[0]

Argument Preparation

a → Vd.8H 

b → Vn.8H 

c → Vm.H[0]

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vmls_n_u32 (uint32x2_t a, uint32x2_t b, uint32_t c)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.2S,Vn.2S,Vm.S[0]

Argument Preparation

a → Vd.2S 

b → Vn.2S 

c → Vm.S[0]

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vmlsq_n_u32 (uint32x4_t a, uint32x4_t b, uint32_t c)Multiply-subtract from accumulator

Description

A64 Instruction

MLS Vd.4S,Vn.4S,Vm.S[0]

Argument Preparation

a → Vd.4S 

b → Vn.4S 

c → Vm.S[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;
bits(esize) product;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    product = (UInt(element1)*UInt(element2))<esize-1:0>;
    if sub_op then
        Elem[result, e, esize] = Elem[operand3, e, esize] - product;
    else
        Elem[result, e, esize] = Elem[operand3, e, esize] + product;

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vmls_n_f32 (float32x2_t a, float32x2_t b, float32_t c)Multiply-subtract from accumulator

Description

A64 Instruction

RESULT[i] = a[i] - (b[i] * c) for i = 0 to 1

Argument Preparation

a → N/A 

b → N/A 

c → N/A

Results

N/A → result

Supported architectures

v7/A32/A64

float32x4_t vmlsq_n_f32 (float32x4_t a, float32x4_t b, float32_t c)Multiply-subtract from accumulator

Description

A64 Instruction

RESULT[i] = a[i] - (b[i] * c) for i = 0 to 3

Argument Preparation

a → N/A 

b → N/A 

c → N/A

Results

N/A → result

Supported architectures

v7/A32/A64
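
The two floating-point entries above give a per-lane formula instead of a single instruction mapping. A scalar sketch of that formula follows. The function name model_vmls_n_f32 and its lanes parameter are hypothetical (lanes = 2 for vmls_n_f32, lanes = 4 for vmlsq_n_f32). The sketch does not say whether the multiply and subtract are fused or rounded separately, because the entries above do not specify it.

```c
/* Scalar model of RESULT[i] = a[i] - (b[i] * c), updating a in place. */
void model_vmls_n_f32(float *a, const float *b, float c, int lanes)
{
    for (int i = 0; i < lanes; i++) {
        a[i] = a[i] - (b[i] * c);
    }
}
```

With a = {10.0, 1.0}, b = {2.0, 0.5} and c = 3.0, all intermediate values are exactly representable, so the result is {4.0, -0.5} regardless of rounding strategy.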

int32x4_t vmlsl_n_s16 (int32x4_t a, int16x4_t b, int16_t c)Signed multiply-subtract long

Description

A64 Instruction

SMLSL Vd.4S,Vn.4H,Vm.H[0]

Argument Preparation

a → Vd.4S 

b → Vn.4H 

c → Vm.H[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64
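
Unlike MLS, the long form widens before multiplying: each 16-bit element produces a full 32-bit product, and only the final subtraction can wrap (modulo 2^32). A scalar sketch for the signed 16-bit form is given below. The names wrap_to_s32 and model_vmlsl_n_s16 are hypothetical, and plain arrays stand in for vector registers.

```c
#include <stdint.h>

/* Hypothetical helper: reinterpret a 32-bit pattern as a signed value. */
static int32_t wrap_to_s32(uint32_t bits)
{
    return bits > (uint32_t)INT32_MAX
               ? (int32_t)((int64_t)bits - INT64_C(4294967296))
               : (int32_t)bits;
}

/* Scalar model: a[e] = a[e] - (widen(b[e]) * widen(c)), accumulating in 32 bits. */
void model_vmlsl_n_s16(int32_t a[4], const int16_t b[4], int16_t c)
{
    for (int e = 0; e < 4; e++) {
        /* A product of two 16-bit values always fits in 32 bits. */
        int32_t product = (int32_t)b[e] * (int32_t)c;
        /* The subtraction wraps modulo 2^32; do it on unsigned bit patterns. */
        a[e] = wrap_to_s32((uint32_t)a[e] - (uint32_t)product);
    }
}
```

With b = -300 and c = 200 the product is -60000, which a 16-bit MLS could not hold; here it is subtracted in full, so an accumulator of 1000 becomes 61000.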

int64x2_t vmlsl_n_s32 (int64x2_t a, int32x2_t b, int32_t c)Signed multiply-subtract long

Description

A64 Instruction

SMLSL Vd.2D,Vn.2S,Vm.S[0]

Argument Preparation

a → Vd.2D 

b → Vn.2S 

c → Vm.S[0]

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vmlsl_n_u16 (uint32x4_t a, uint16x4_t b, uint16_t c)Unsigned multiply-subtract long

Description

A64 Instruction

UMLSL Vd.4S,Vn.4H,Vm.H[0]

Argument Preparation

a → Vd.4S 

b → Vn.4H 

c → Vm.H[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vmlsl_n_u32 (uint64x2_t a, uint32x2_t b, uint32_t c)Unsigned multiply-subtract long

Description

A64 Instruction

UMLSL Vd.2D,Vn.2S,Vm.S[0]

Argument Preparation

a → Vd.2D 

b → Vn.2S 

c → Vm.S[0]

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vmlsl_high_n_s16 (int32x4_t a, int16x8_t b, int16_t c)Signed multiply-subtract long

Description

A64 Instruction

SMLSL2 Vd.4S,Vn.8H,Vm.H[0]

Argument Preparation

a → Vd.4S 

b → Vn.8H 

c → Vm.H[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

int64x2_t vmlsl_high_n_s32 (int64x2_t a, int32x4_t b, int32_t c)Signed multiply-subtract long

Description

A64 Instruction

SMLSL2 Vd.2D,Vn.4S,Vm.S[0]

Argument Preparation

a → Vd.2D 

b → Vn.4S 

c → Vm.S[0]

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

uint32x4_t vmlsl_high_n_u16 (uint32x4_t a, uint16x8_t b, uint16_t c)Unsigned multiply-subtract long

Description

A64 Instruction

UMLSL2 Vd.4S,Vn.8H,Vm.H[0]

Argument Preparation

a → Vd.4S 

b → Vn.8H 

c → Vm.H[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64
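
The "2" suffix (UMLSL2, SMLSL2) selects the upper half of the first source vector, which the pseudocode expresses as Vpart[n, part]. A scalar sketch of the unsigned 16-bit form follows. The function name model_vmlsl_high_n_u16 is hypothetical and plain arrays stand in for vector registers.

```c
#include <stdint.h>

/* Scalar model: a[e] = a[e] - (b[4 + e] * c), using only the upper four lanes of b. */
void model_vmlsl_high_n_u16(uint32_t a[4], const uint16_t b[8], uint16_t c)
{
    for (int e = 0; e < 4; e++) {
        /* Cast before multiplying: uint16_t operands would otherwise promote
           to int, and 65535 * 65535 overflows a 32-bit int. */
        uint32_t product = (uint32_t)b[4 + e] * (uint32_t)c;
        a[e] -= product;   /* unsigned subtraction wraps modulo 2^32 */
    }
}
```

Lanes 0 to 3 of b never influence the result; only b[4] to b[7] are read.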

uint64x2_t vmlsl_high_n_u32 (uint64x2_t a, uint32x4_t b, uint32_t c)Unsigned multiply-subtract long

Description

A64 Instruction

UMLSL2 Vd.2D,Vn.4S,Vm.S[0]

Argument Preparation

a → Vd.2D 

b → Vn.4S 

c → Vm.S[0]

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
bits(2*esize) accum;

for e = 0 to elements-1
    element1 = Int(Elem[operand1, e, esize], unsigned);
    element2 = Int(Elem[operand2, e, esize], unsigned);
    product = (element1*element2)<2*esize-1:0>;
    if sub_op then
        accum = Elem[operand3, e, 2*esize] - product;
    else
        accum = Elem[operand3, e, 2*esize] + product;
    Elem[result, e, 2*esize] = accum;

V[d] = result;

Supported architectures

A64

int32x4_t vqdmlsl_n_s16 (int32x4_t a, int16x4_t b, int16_t c)Signed saturating doubling multiply-subtract long

Description

A64 Instruction

SQDMLSL Vd.4S,Vn.4H,Vm.H[0]

Argument Preparation

a → Vd.4S 

b → Vn.4H 

c → Vm.H[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
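
The subtracting form saturates at the same two points as the accumulating form: the doubled product, then the difference. A scalar sketch of the 16-bit to 32-bit case follows, using a 64-bit intermediate so that every step is exact before saturation. The names sat_s32 and model_vqdmlsl_n_s16 are hypothetical, and the qc flag stands in for FPSR.QC.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: clamp a 64-bit value to the int32_t range and report clamping. */
static int32_t sat_s32(int64_t v, bool *sat)
{
    if (v > INT32_MAX) { *sat = true; return INT32_MAX; }
    if (v < INT32_MIN) { *sat = true; return INT32_MIN; }
    return (int32_t)v;
}

/* Scalar model: a[e] = sat(a[e] - sat(2 * b[e] * c)). */
void model_vqdmlsl_n_s16(int32_t a[4], const int16_t b[4], int16_t c, bool *qc)
{
    for (int e = 0; e < 4; e++) {
        bool sat1 = false, sat2 = false;
        int32_t product = sat_s32(2 * (int64_t)b[e] * (int64_t)c, &sat1);
        a[e] = sat_s32((int64_t)a[e] - (int64_t)product, &sat2);
        if (sat1 || sat2) *qc = true;   /* sticky, like FPSR.QC */
    }
}
```

The only product that saturates is INT16_MIN * INT16_MIN, whose doubled value 2^31 is clamped to INT32_MAX before the subtraction.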

int64x2_t vqdmlsl_n_s32 (int64x2_t a, int32x2_t b, int32_t c)Signed saturating doubling multiply-subtract long

Description

A64 Instruction

SQDMLSL Vd.2D,Vn.2S,Vm.S[0]

Argument Preparation

a → Vd.2D 

b → Vn.2S 

c → Vm.S[0]

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vqdmlsl_high_n_s16 (int32x4_t a, int16x8_t b, int16_t c)Signed saturating doubling multiply-subtract long

Description

A64 Instruction

SQDMLSL2 Vd.4S,Vn.8H,Vm.H[0]

Argument Preparation

a → Vd.4S 

b → Vn.8H 

c → Vm.H[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64x2_t vqdmlsl_high_n_s32 (int64x2_t a, int32x4_t b, int32_t c)Signed saturating doubling multiply-subtract long

Description

A64 Instruction

SQDMLSL2 Vd.2D,Vn.4S,Vm.S[0]

Argument Preparation

a → Vd.2D 

b → Vn.4S 

c → Vm.S[0]

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;

for e = 0 to elements-1
    element1 = SInt(Elem[operand1, e, esize]);
    element2 = SInt(Elem[operand2, e, esize]);
    (product, sat1) = SignedSatQ(2 * element1 * element2, 2 * esize);
    if sub_op then
        accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
    else
        accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
    (Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2 * esize);
    if sat1 || sat2 then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int8x8_t vabs_s8 (int8x8_t a)Absolute value

Description

Absolute value (vector). This instruction calculates the absolute value of each vector element in the source SIMD&FP register, puts the result into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

ABS Vd.8B,Vn.8B

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
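
Because the pseudocode keeps only the low esize bits of Abs(element), the most negative value maps to itself: for 8-bit lanes, |-128| = 128 truncates back to -128. A scalar sketch follows. The function name model_vabs_s8 is hypothetical and plain arrays stand in for vector registers.

```c
#include <stdint.h>

/* Scalar model of ABS on eight signed 8-bit lanes (non-saturating). */
void model_vabs_s8(int8_t r[8], const int8_t a[8])
{
    for (int e = 0; e < 8; e++) {
        int v = a[e];
        if (v < 0) v = -v;   /* exact in int: result is 0..128 */
        /* Keep the low 8 bits: 128 reads back as -128. */
        r[e] = (v > INT8_MAX) ? (int8_t)(v - 256) : (int8_t)v;
    }
}
```

Callers that need |x| to be non-negative for every input can use the saturating vqabs_s8 family instead.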

int8x16_t vabsq_s8 (int8x16_t a)Absolute value

Description

A64 Instruction

ABS Vd.16B,Vn.16B

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vabs_s16 (int16x4_t a)Absolute value

Description

A64 Instruction

ABS Vd.4H,Vn.4H

Argument Preparation

a → Vn.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vabsq_s16 (int16x8_t a)Absolute value

Description

A64 Instruction

ABS Vd.8H,Vn.8H

Argument Preparation

a → Vn.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vabs_s32 (int32x2_t a)Absolute value

Description

A64 Instruction

ABS Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vabsq_s32 (int32x4_t a)Absolute value

Description

A64 Instruction

ABS Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vabs_f32 (float32x2_t a)Floating-point absolute value

Description

Floating-point Absolute value (vector). This instruction calculates the absolute value of each vector element in the source SIMD&FP register, writes the result to a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

FABS Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    if neg then
        element = FPNeg(element);
    else
        element = FPAbs(element);
    Elem[result, e, esize] = element;

V[d] = result;

Supported architectures

v7/A32/A64
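
A scalar sketch of the floating-point case follows, assuming FPAbs clears only the sign bit and leaves the exponent and fraction untouched (alternate floating-point modes are not modelled) and that float is IEEE 754 binary32. The names f32_bits, model_fpabs_f32 and model_vabs_f32 are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the 32-bit pattern of a float. */
uint32_t f32_bits(float x)
{
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Model of FPAbs on one element: clear bit 31, keep everything else. */
float model_fpabs_f32(float x)
{
    uint32_t u = f32_bits(x) & UINT32_C(0x7FFFFFFF);
    float r;
    memcpy(&r, &u, sizeof r);
    return r;
}

/* Scalar model of FABS on two 32-bit lanes. */
void model_vabs_f32(float r[2], const float a[2])
{
    for (int e = 0; e < 2; e++) {
        r[e] = model_fpabs_f32(a[e]);
    }
}
```

Under this model -0.0 becomes +0.0, and no rounding or exception behaviour is involved because the operation is purely bitwise.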

float32x4_t vabsq_f32 (float32x4_t a)Floating-point absolute value

Description

A64 Instruction

FABS Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    if neg then
        element = FPNeg(element);
    else
        element = FPAbs(element);
    Elem[result, e, esize] = element;

V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vabs_s64 (int64x1_t a)Absolute value

Description

A64 Instruction

ABS Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

int64_t vabsd_s64 (int64_t a)Absolute value

Description

A64 Instruction

ABS Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64
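
For the 64-bit scalar form there is no wider C integer type to compute in, and the C library function llabs has undefined behaviour for the most negative value. A sketch that reproduces the wrapping result of the pseudocode without undefined behaviour is shown below. The function name model_vabsd_s64 is hypothetical.

```c
#include <stdint.h>

/* Scalar model of ABS Dd,Dn: |a| truncated to 64 bits. */
int64_t model_vabsd_s64(int64_t a)
{
    /* |INT64_MIN| = 2^63 does not fit; its low 64 bits read back as INT64_MIN. */
    if (a == INT64_MIN) return INT64_MIN;
    return a < 0 ? -a : a;
}
```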

int64x2_t vabsq_s64 (int64x2_t a)Absolute value

Description

A64 Instruction

ABS Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

float64x1_t vabs_f64 (float64x1_t a)Floating-point absolute value

Description

A64 Instruction

FABS Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    if neg then
        element = FPNeg(element);
    else
        element = FPAbs(element);
    Elem[result, e, esize] = element;

V[d] = result;

Supported architectures

A64

float64x2_t vabsq_f64 (float64x2_t a)Floating-point absolute value

Description

A64 Instruction

FABS Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    if neg then
        element = FPNeg(element);
    else
        element = FPAbs(element);
    Elem[result, e, esize] = element;

V[d] = result;

Supported architectures

A64

int8x8_t vqabs_s8 (int8x8_t a)Signed saturating absolute value

Description

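Signed saturating Absolute value. This instruction reads each vector element from the source SIMD&FP register, puts the absolute value of each element into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are signed integer values. If the absolute value of an element does not fit in the element size, that result is saturated and the cumulative saturation bit FPSR.QC is set.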

A64 Instruction

SQABS Vd.8B,Vn.8B

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
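
The saturating form differs from ABS only for the most negative element value, which is clamped to the maximum instead of wrapping. A scalar sketch for 8-bit lanes follows. The function name model_vqabs_s8 is hypothetical, and the qc flag stands in for FPSR.QC.

```c
#include <stdbool.h>
#include <stdint.h>

/* Scalar model of SQABS on eight signed 8-bit lanes. */
void model_vqabs_s8(int8_t r[8], const int8_t a[8], bool *qc)
{
    for (int e = 0; e < 8; e++) {
        int v = a[e];
        if (v < 0) v = -v;        /* exact in int: result is 0..128 */
        if (v > INT8_MAX) {       /* only reachable for an input of -128 */
            v = INT8_MAX;
            *qc = true;           /* sticky, like FPSR.QC */
        }
        r[e] = (int8_t)v;
    }
}
```

An input lane of -128 therefore yields 127 and sets qc; every other input gives the same result as the non-saturating form.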

int8x16_t vqabsq_s8 (int8x16_t a)Signed saturating absolute value

Description

A64 Instruction

SQABS Vd.16B,Vn.16B

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vqabs_s16 (int16x4_t a)Signed saturating absolute value

Description

A64 Instruction

SQABS Vd.4H,Vn.4H

Argument Preparation

a → Vn.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vqabsq_s16 (int16x8_t a)Signed saturating absolute value

Description

A64 Instruction

SQABS Vd.8H,Vn.8H

Argument Preparation

a → Vn.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vqabs_s32 (int32x2_t a)Signed saturating absolute value

Description

A64 Instruction

SQABS Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vqabsq_s32 (int32x4_t a)Signed saturating absolute value

Description

A64 Instruction

SQABS Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vqabs_s64 (int64x1_t a)Signed saturating absolute value

Description

A64 Instruction

SQABS Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64x2_t vqabsq_s64 (int64x2_t a)Signed saturating absolute value

Description

A64 Instruction

SQABS Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int8_t vqabsb_s8 (int8_t a)Signed saturating absolute value

Description

A64 Instruction

SQABS Bd,Bn

Argument Preparation

a → Bn

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int16_t vqabsh_s16 (int16_t a)Signed saturating absolute value

Description

A64 Instruction

SQABS Hd,Hn

Argument Preparation

a → Hn

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32_t vqabss_s32 (int32_t a)Signed saturating absolute value

Description

A64 Instruction

SQABS Sd,Sn

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64_t vqabsd_s64 (int64_t a)Signed saturating absolute value

Description

A64 Instruction

SQABS Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int8x8_t vneg_s8 (int8x8_t a)Negate

Description

Negate (vector). This instruction reads each vector element from the source SIMD&FP register, negates each value, puts the result into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

NEG Vd.8B,Vn.8B

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
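
The last step of the pseudocode above, `element<esize-1:0>`, keeps only the low bits of the negated value, so NEG wraps instead of saturating. The following portable C sketch models that for 8-bit lanes. The helper names are hypothetical and are not part of `arm_neon.h`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one 8-bit lane of NEG. */
static int8_t neg8_model(int8_t a)
{
    /* Two's-complement negate, keeping only the low 8 bits. */
    uint8_t bits = (uint8_t)(0u - (uint8_t)a);
    /* Convert the bit pattern back to a signed value without relying on
       implementation-defined narrowing. */
    return (bits & 0x80u) ? (int8_t)((int)bits - 256) : (int8_t)bits;
}

/* Applies the lane model to `lanes` elements, like the loop over e above. */
static void neg8_lanes_model(const int8_t *src, int8_t *dst, size_t lanes)
{
    for (size_t e = 0; e < lanes; e++)
        dst[e] = neg8_model(src[e]);
}
```

In this model the most negative value, -128, maps to itself. Code that needs clamping to 127 instead should look at the saturating negate entries (vqneg).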

int8x16_t vnegq_s8 (int8x16_t a)Negate

Description

A64 Instruction

NEG Vd.16B,Vn.16B

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vneg_s16 (int16x4_t a)Negate

Description

A64 Instruction

NEG Vd.4H,Vn.4H

Argument Preparation

a → Vn.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vnegq_s16 (int16x8_t a)Negate

Description

A64 Instruction

NEG Vd.8H,Vn.8H

Argument Preparation

a → Vn.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vneg_s32 (int32x2_t a)Negate

Description

A64 Instruction

NEG Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vnegq_s32 (int32x4_t a)Negate

Description

A64 Instruction

NEG Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vneg_f32 (float32x2_t a)Floating-point negate

Description

Floating-point Negate (vector). This instruction negates the value of each vector element in the source SIMD&FP register, writes the result to a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

FNEG Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    if neg then
        element = FPNeg(element);
    else
        element = FPAbs(element);
    Elem[result, e, esize] = element;

V[d] = result;

Supported architectures

v7/A32/A64
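
FPNeg in the pseudocode above operates on the raw bits of each element. A hedged portable C sketch for one single-precision lane follows, assuming `float` is IEEE 754 binary32 and modelling FPNeg as an inversion of the sign bit. The name `fneg32_model` is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one 32-bit lane of FNEG.
   Assumes float uses the IEEE 754 binary32 encoding. */
static float fneg32_model(float a)
{
    uint32_t bits;
    memcpy(&bits, &a, sizeof bits);   /* reinterpret the bits without aliasing issues */
    bits ^= UINT32_C(0x80000000);     /* invert the sign bit, leave everything else */
    memcpy(&a, &bits, sizeof a);
    return a;
}
```

Because only the sign bit changes, the model turns +0.0 into -0.0 and leaves exponent and fraction bits untouched.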

float32x4_t vnegq_f32 (float32x4_t a)Floating-point negate

Description

A64 Instruction

FNEG Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    if neg then
        element = FPNeg(element);
    else
        element = FPAbs(element);
    Elem[result, e, esize] = element;

V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vneg_s64 (int64x1_t a)Negate

Description

A64 Instruction

NEG Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

int64_t vnegd_s64 (int64_t a)Negate

Description

A64 Instruction

NEG Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

int64x2_t vnegq_s64 (int64x2_t a)Negate

Description

A64 Instruction

NEG Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    Elem[result, e, esize] = element<esize-1:0>;

V[d] = result;

Supported architectures

A64

float64x1_t vneg_f64 (float64x1_t a)Floating-point negate

Description

A64 Instruction

FNEG Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    if neg then
        element = FPNeg(element);
    else
        element = FPAbs(element);
    Elem[result, e, esize] = element;

V[d] = result;

Supported architectures

A64

float64x2_t vnegq_f64 (float64x2_t a)Floating-point negate

Description

A64 Instruction

FNEG Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    if neg then
        element = FPNeg(element);
    else
        element = FPAbs(element);
    Elem[result, e, esize] = element;

V[d] = result;

Supported architectures

A64

int8x8_t vqneg_s8 (int8x8_t a)Signed saturating negate

Description

Signed saturating Negate. This instruction reads each vector element from the source SIMD&FP register, negates each value, places the result into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are signed integer values. If a result overflows the element range, it saturates, and the cumulative saturation bit FPSR.QC is set.

A64 Instruction

SQNEG Vd.8B,Vn.8B

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64
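
The pseudocode above differs from plain NEG only in its final step, which clamps through SignedSatQ and records saturation in FPSR.QC. A minimal portable C sketch over 8-bit lanes follows. The name `sqneg8_lanes_model` is hypothetical, and its boolean return value only stands in for the QC bit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of SQNEG over `lanes` 8-bit elements.
   Returns true if any lane saturated (the condition that would set FPSR.QC). */
static bool sqneg8_lanes_model(const int8_t *src, int8_t *dst, size_t lanes)
{
    bool saturated = false;
    for (size_t e = 0; e < lanes; e++) {
        int element = -(int)src[e];   /* negate in a wider type */
        if (element > INT8_MAX) {     /* only -(-128) = 128 is out of range */
            element = INT8_MAX;
            saturated = true;
        }
        dst[e] = (int8_t)element;
    }
    return saturated;
}
```

On real hardware the QC bit is sticky and lives in a status register, so the per-call return value here is a simplification.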

int8x16_t vqnegq_s8 (int8x16_t a)Signed saturating negate

Description

A64 Instruction

SQNEG Vd.16B,Vn.16B

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vqneg_s16 (int16x4_t a)Signed saturating negate

Description

A64 Instruction

SQNEG Vd.4H,Vn.4H

Argument Preparation

a → Vn.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vqnegq_s16 (int16x8_t a)Signed saturating negate

Description

A64 Instruction

SQNEG Vd.8H,Vn.8H

Argument Preparation

a → Vn.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vqneg_s32 (int32x2_t a)Signed saturating negate

Description

A64 Instruction

SQNEG Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vqnegq_s32 (int32x4_t a)Signed saturating negate

Description

A64 Instruction

SQNEG Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vqneg_s64 (int64x1_t a)Signed saturating negate

Description

A64 Instruction

SQNEG Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64x2_t vqnegq_s64 (int64x2_t a)Signed saturating negate

Description

A64 Instruction

SQNEG Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int8_t vqnegb_s8 (int8_t a)Signed saturating negate

Description

A64 Instruction

SQNEG Bd,Bn

Argument Preparation

a → Bn

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int16_t vqnegh_s16 (int16_t a)Signed saturating negate

Description

A64 Instruction

SQNEG Hd,Hn

Argument Preparation

a → Hn

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int32_t vqnegs_s32 (int32_t a)Signed saturating negate

Description

A64 Instruction

SQNEG Sd,Sn

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int64_t vqnegd_s64 (int64_t a)Signed saturating negate

Description

A64 Instruction

SQNEG Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element;
boolean sat;

for e = 0 to elements-1
    element = SInt(Elem[operand, e, esize]);
    if neg then
        element = -element;
    else
        element = Abs(element);
    (Elem[result, e, esize], sat) = SignedSatQ(element, esize);
    if sat then FPSR.QC = '1';

V[d] = result;

Supported architectures

A64

int8x8_t vcls_s8 (int8x8_t a)Count leading sign bits

Description

Count Leading Sign bits (vector). This instruction counts the number of consecutive bits following the most significant bit that are the same as the most significant bit in each vector element in the source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP register. The count does not include the most significant bit itself.

A64 Instruction

CLS Vd.8B,Vn.8B

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    if countop == CountOp_CLS then
        count = CountLeadingSignBits(Elem[operand, e, esize]);
    else
        count = CountLeadingZeroBits(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64
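
As the description above states, the count excludes the most significant bit itself, so an 8-bit element can yield at most 7. A small portable C sketch of CountLeadingSignBits for one 8-bit element follows. The name `cls8_model` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of CLS on one 8-bit element: counts the bits below the
   sign bit that are equal to the sign bit, stopping at the first mismatch. */
static int cls8_model(int8_t a)
{
    uint8_t bits = (uint8_t)a;
    unsigned sign = bits >> 7;        /* most significant bit */
    int count = 0;
    for (int i = 6; i >= 0; i--) {    /* start just below the sign bit */
        if (((bits >> i) & 1u) != sign)
            break;
        count++;
    }
    return count;
}
```

In this model both 0 and -1 give 7, while -128 (0x80) and 64 (0x40) give 0 because the bit below the sign bit already differs.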

int8x16_t vclsq_s8 (int8x16_t a)Count leading sign bits

Description

A64 Instruction

CLS Vd.16B,Vn.16B

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    if countop == CountOp_CLS then
        count = CountLeadingSignBits(Elem[operand, e, esize]);
    else
        count = CountLeadingZeroBits(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vcls_s16 (int16x4_t a)Count leading sign bits

Description

A64 Instruction

CLS Vd.4H,Vn.4H

Argument Preparation

a → Vn.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    if countop == CountOp_CLS then
        count = CountLeadingSignBits(Elem[operand, e, esize]);
    else
        count = CountLeadingZeroBits(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vclsq_s16 (int16x8_t a)Count leading sign bits

Description

A64 Instruction

CLS Vd.8H,Vn.8H

Argument Preparation

a → Vn.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    if countop == CountOp_CLS then
        count = CountLeadingSignBits(Elem[operand, e, esize]);
    else
        count = CountLeadingZeroBits(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vcls_s32 (int32x2_t a)Count leading sign bits

Description

A64 Instruction

CLS Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    if countop == CountOp_CLS then
        count = CountLeadingSignBits(Elem[operand, e, esize]);
    else
        count = CountLeadingZeroBits(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vclsq_s32 (int32x4_t a)Count leading sign bits

Description

A64 Instruction

CLS Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    if countop == CountOp_CLS then
        count = CountLeadingSignBits(Elem[operand, e, esize]);
    else
        count = CountLeadingZeroBits(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64

int8x8_t vclz_s8 (int8x8_t a)Count leading zero bits

Description

Count Leading Zero bits (vector). This instruction counts the number of consecutive zeros, starting from the most significant bit, in each vector element in the source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

CLZ Vd.8B,Vn.8B

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    if countop == CountOp_CLS then
        count = CountLeadingSignBits(Elem[operand, e, esize]);
    else
        count = CountLeadingZeroBits(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64
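
Unlike CLS, the CLZ count includes the most significant bit, so an all-zero 8-bit element yields 8. A minimal portable C sketch of CountLeadingZeroBits for one 8-bit element follows. The name `clz8_model` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of CLZ on one 8-bit element: counts zero bits from the
   most significant bit down to the first set bit. */
static int clz8_model(uint8_t a)
{
    int count = 0;
    for (int i = 7; i >= 0 && ((a >> i) & 1) == 0; i--)
        count++;
    return count;
}
```

The bit pattern alone determines the result, which is why the signed and unsigned intrinsics in this group map to the same CLZ instruction.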

int8x16_t vclzq_s8 (int8x16_t a)Count leading zero bits

Description

A64 Instruction

CLZ Vd.16B,Vn.16B

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    if countop == CountOp_CLS then
        count = CountLeadingSignBits(Elem[operand, e, esize]);
    else
        count = CountLeadingZeroBits(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vclz_s16 (int16x4_t a)Count leading zero bits

Description

A64 Instruction

CLZ Vd.4H,Vn.4H

Argument Preparation

a → Vn.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    if countop == CountOp_CLS then
        count = CountLeadingSignBits(Elem[operand, e, esize]);
    else
        count = CountLeadingZeroBits(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vclzq_s16 (int16x8_t a)Count leading zero bits

Description

A64 Instruction

CLZ Vd.8H,Vn.8H

Argument Preparation

a → Vn.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    if countop == CountOp_CLS then
        count = CountLeadingSignBits(Elem[operand, e, esize]);
    else
        count = CountLeadingZeroBits(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vclz_s32 (int32x2_t a)Count leading zero bits

Description

A64 Instruction

CLZ Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    if countop == CountOp_CLS then
        count = CountLeadingSignBits(Elem[operand, e, esize]);
    else
        count = CountLeadingZeroBits(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vclzq_s32 (int32x4_t a)Count leading zero bits

Description

A64 Instruction

CLZ Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    if countop == CountOp_CLS then
        count = CountLeadingSignBits(Elem[operand, e, esize]);
    else
        count = CountLeadingZeroBits(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vclz_u8 (uint8x8_t a)Count leading zero bits

Description

A64 Instruction

CLZ Vd.8B,Vn.8B

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    if countop == CountOp_CLS then
        count = CountLeadingSignBits(Elem[operand, e, esize]);
    else
        count = CountLeadingZeroBits(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vclzq_u8 (uint8x16_t a)Count leading zero bits

Description

A64 Instruction

CLZ Vd.16B,Vn.16B

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    if countop == CountOp_CLS then
        count = CountLeadingSignBits(Elem[operand, e, esize]);
    else
        count = CountLeadingZeroBits(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vclz_u16 (uint16x4_t a)Count leading zero bits

Description

A64 Instruction

CLZ Vd.4H,Vn.4H

Argument Preparation

a → Vn.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    if countop == CountOp_CLS then
        count = CountLeadingSignBits(Elem[operand, e, esize]);
    else
        count = CountLeadingZeroBits(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vclzq_u16 (uint16x8_t a)Count leading zero bits

Description

A64 Instruction

CLZ Vd.8H,Vn.8H

Argument Preparation

a → Vn.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    if countop == CountOp_CLS then
        count = CountLeadingSignBits(Elem[operand, e, esize]);
    else
        count = CountLeadingZeroBits(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vclz_u32 (uint32x2_t a)Count leading zero bits

Description

A64 Instruction

CLZ Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    if countop == CountOp_CLS then
        count = CountLeadingSignBits(Elem[operand, e, esize]);
    else
        count = CountLeadingZeroBits(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vclzq_u32 (uint32x4_t a)Count leading zero bits

Description

A64 Instruction

CLZ Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    if countop == CountOp_CLS then
        count = CountLeadingSignBits(Elem[operand, e, esize]);
    else
        count = CountLeadingZeroBits(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64

int8x8_t vcnt_s8 (int8x8_t a)Population count per byte

Description

Population Count per byte. This instruction counts the number of bits that have a value of one in each vector element in the source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

CNT Vd.8B,Vn.8B

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    count = BitCount(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64
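
BitCount in the pseudocode above is applied to each byte independently, so every result lies in the range 0 to 8. A portable C sketch for one byte follows. It uses the clear-lowest-set-bit loop as one possible way to count and is not a claim about how hardware implements CNT. The name `cnt8_model` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of CNT on one byte: number of bits set to one. */
static uint8_t cnt8_model(uint8_t a)
{
    uint8_t count = 0;
    while (a) {
        a &= (uint8_t)(a - 1);   /* clear the lowest set bit */
        count++;
    }
    return count;
}
```

A population count over a wider value can be obtained by summing the per-byte results.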

int8x16_t vcntq_s8 (int8x16_t a)Population count per byte

Description

A64 Instruction

CNT Vd.16B,Vn.16B

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    count = BitCount(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vcnt_u8 (uint8x8_t a)Population count per byte

Description

A64 Instruction

CNT Vd.8B,Vn.8B

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    count = BitCount(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vcntq_u8 (uint8x16_t a)Population count per byte

Description

A64 Instruction

CNT Vd.16B,Vn.16B

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    count = BitCount(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64

poly8x8_t vcnt_p8 (poly8x8_t a)Population count per byte

Description

A64 Instruction

CNT Vd.8B,Vn.8B

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    count = BitCount(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64

poly8x16_t vcntq_p8 (poly8x16_t a)Population count per byte

Description

A64 Instruction

CNT Vd.16B,Vn.16B

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

integer count;
for e = 0 to elements-1
    count = BitCount(Elem[operand, e, esize]);
    Elem[result, e, esize] = count<esize-1:0>;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vrecpe_u32 (uint32x2_t a)Unsigned reciprocal estimate

Description

Unsigned Reciprocal Estimate. This instruction reads each vector element from the source SIMD&FP register, calculates an approximate inverse for the unsigned integer value, places the result into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

URECPE Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(32) element;

for e = 0 to elements-1
    element = Elem[operand, e, 32];
    Elem[result, e, 32] = UnsignedRecipEstimate(element);

V[d] = result;

Supported architectures

v7/A32/A64
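
The pseudocode above delegates to UnsignedRecipEstimate without showing its body. The sketch below models it in portable C, written from recollection of the Arm architecture shared pseudocode; treat the exact rounding steps as an assumption to check against the Arm Architecture Reference Manual. The function names are hypothetical. The input is read as a fixed-point value in [0.5, 1.0) when its top bit is set, and the result represents a value in [1.0, 2.0).

```c
#include <stdint.h>

/* Assumed form of RecipEstimate: a in 256..511 represents [0.5, 1.0),
   the result in 256..511 represents [1.0, 2.0). */
static uint32_t recip_estimate_model(uint32_t a)
{
    a = a * 2 + 1;                            /* round to nearest */
    uint32_t b = (UINT32_C(1) << 19) / a;
    return (b + 1) / 2;                       /* round to nearest */
}

/* Hypothetical model of URECPE on one 32-bit element. */
static uint32_t urecpe32_model(uint32_t operand)
{
    if ((operand & UINT32_C(0x80000000)) == 0)
        return UINT32_MAX;                    /* inputs below 0.5 give all ones */
    /* Use the top 9 bits of the operand; place the 9-bit estimate in the top
       9 bits of the result. */
    return recip_estimate_model(operand >> 23) << 23;
}
```

Under these assumptions an input of 0xC0000000 (0.75) yields 0xAA800000, which is 341/256, close to 1/0.75.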

uint32x4_t vrecpeq_u32 (uint32x4_t a)Unsigned reciprocal estimate

Description

A64 Instruction

URECPE Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(32) element;

for e = 0 to elements-1
    element = Elem[operand, e, 32];
    Elem[result, e, 32] = UnsignedRecipEstimate(element);

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vrecpe_f32 (float32x2_t a)Floating-point reciprocal estimate

Description

Floating-point Reciprocal Estimate. This instruction finds an approximate reciprocal estimate for each vector element in the source SIMD&FP register, places the result in a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

FRECPE Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRecipEstimate(element, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64

float32x4_t vrecpeq_f32 (float32x4_t a)Floating-point reciprocal estimate

Description

A64 Instruction

FRECPE Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRecipEstimate(element, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64

float64x1_t vrecpe_f64 (float64x1_t a)Floating-point reciprocal estimate

Description

A64 Instruction

FRECPE Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRecipEstimate(element, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vrecpeq_f64 (float64x2_t a)Floating-point reciprocal estimate

Description

A64 Instruction

FRECPE Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRecipEstimate(element, FPCR);

V[d] = result;

Supported architectures

A64

float32_t vrecpes_f32 (float32_t a)Floating-point reciprocal estimate

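Description

Floating-point Reciprocal Estimate. This instruction finds an approximate reciprocal estimate for each vector element in the source SIMD&FP register, places the result in a vector, and writes the vector to the destination SIMD&FP register.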

A64 Instruction

FRECPE Sd,Sn

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRecipEstimate(element, FPCR);

V[d] = result;

Supported architectures

A64

float64_t vrecped_f64 (float64_t a)Floating-point reciprocal estimate

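Description

Floating-point Reciprocal Estimate. This instruction finds an approximate reciprocal estimate for each vector element in the source SIMD&FP register, places the result in a vector, and writes the vector to the destination SIMD&FP register.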

A64 Instruction

FRECPE Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRecipEstimate(element, FPCR);

V[d] = result;

Supported architectures

A64

float32x2_t vrecps_f32 (float32x2_t a, float32x2_t b)Floating-point reciprocal step

Description

Floating-point Reciprocal Step. This instruction multiplies the corresponding floating-point values in the vectors of the two source SIMD&FP registers, subtracts each of the products from 2.0, places the resulting floating-point values in a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

FRECPS Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPRecipStepFused(element1, element2);

V[d] = result;

Supported architectures

v7/A32/A64
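
The step instruction is typically paired with the estimate in a Newton-Raphson iteration, x' = x * (2 - d * x), where `vrecps_f32(d, x)` supplies the (2 - d * x) factor. The following is a minimal sketch under that assumption; the helper name `reciprocal2` and the choice of two refinement steps are illustrative, and the `#else` branch is a scalar stand-in (with a deliberately perturbed starting value) that is not bit-exact with the NEON path.

```c
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] ~= 1 / d[i] for two floats, using an
 * estimate followed by two Newton-Raphson refinements. */
void reciprocal2(const float d[2], float out[2])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    float32x2_t vd = vld1_f32(d);
    float32x2_t x  = vrecpe_f32(vd);        /* low-precision estimate  */
    x = vmul_f32(x, vrecps_f32(vd, x));     /* x * (2 - d*x), step one */
    x = vmul_f32(x, vrecps_f32(vd, x));     /* x * (2 - d*x), step two */
    vst1_f32(out, x);
#else
    /* Assumption: scalar stand-in; starts about 0.4% off on purpose. */
    for (size_t i = 0; i < 2; ++i) {
        float x = (1.0f / d[i]) * 0.996f;
        x = x * (2.0f - d[i] * x);
        x = x * (2.0f - d[i] * x);
        out[i] = x;
    }
#endif
}
```

Each step roughly squares the relative error, so two steps from an 8-bit estimate are usually enough for single precision; whether that is sufficient depends on the application.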

float32x4_t vrecpsq_f32 (float32x4_t a, float32x4_t b)Floating-point reciprocal step

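Description

Floating-point Reciprocal Step. This instruction multiplies the corresponding floating-point values in the vectors of the two source SIMD&FP registers, subtracts each of the products from 2.0, places the resulting floating-point values in a vector, and writes the vector to the destination SIMD&FP register.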

A64 Instruction

FRECPS Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPRecipStepFused(element1, element2);

V[d] = result;

Supported architectures

v7/A32/A64

float64x1_t vrecps_f64 (float64x1_t a, float64x1_t b)Floating-point reciprocal step

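Description

Floating-point Reciprocal Step. This instruction multiplies the corresponding floating-point values in the vectors of the two source SIMD&FP registers, subtracts each of the products from 2.0, places the resulting floating-point values in a vector, and writes the vector to the destination SIMD&FP register.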

A64 Instruction

FRECPS Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPRecipStepFused(element1, element2);

V[d] = result;

Supported architectures

A64

float64x2_t vrecpsq_f64 (float64x2_t a, float64x2_t b)Floating-point reciprocal step

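Description

Floating-point Reciprocal Step. This instruction multiplies the corresponding floating-point values in the vectors of the two source SIMD&FP registers, subtracts each of the products from 2.0, places the resulting floating-point values in a vector, and writes the vector to the destination SIMD&FP register.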

A64 Instruction

FRECPS Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPRecipStepFused(element1, element2);

V[d] = result;

Supported architectures

A64

float32_t vrecpss_f32 (float32_t a, float32_t b)Floating-point reciprocal step

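Description

Floating-point Reciprocal Step. This instruction multiplies the corresponding floating-point values in the vectors of the two source SIMD&FP registers, subtracts each of the products from 2.0, places the resulting floating-point values in a vector, and writes the vector to the destination SIMD&FP register.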

A64 Instruction

FRECPS Sd,Sn,Sm

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPRecipStepFused(element1, element2);

V[d] = result;

Supported architectures

A64

float64_t vrecpsd_f64 (float64_t a, float64_t b)Floating-point reciprocal step

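Description

Floating-point Reciprocal Step. This instruction multiplies the corresponding floating-point values in the vectors of the two source SIMD&FP registers, subtracts each of the products from 2.0, places the resulting floating-point values in a vector, and writes the vector to the destination SIMD&FP register.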

A64 Instruction

FRECPS Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPRecipStepFused(element1, element2);

V[d] = result;

Supported architectures

A64

float32x2_t vsqrt_f32 (float32x2_t a)Floating-point square root

Description

Floating-point Square Root (vector). This instruction calculates the square root for each vector element in the source SIMD&FP register, places the result in a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

FSQRT Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPSqrt(element, FPCR);

V[d] = result;

Supported architectures

A64
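
A short sketch of calling the square root intrinsic from C follows. Because `vsqrt_f32` is listed as A64 only, the NEON path is guarded on `__aarch64__`. The helper name `sqrt2` is hypothetical, and the `#else` branch is an assumed scalar stand-in based on Heron's iteration, adequate only for moderate non-negative inputs and not bit-exact with FSQRT.

```c
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: dst[i] = sqrt(src[i]) for two non-negative floats. */
void sqrt2(const float src[2], float dst[2])
{
#if defined(__aarch64__)
    float32x2_t v = vld1_f32(src);     /* load two lanes       */
    vst1_f32(dst, vsqrt_f32(v));       /* FSQRT Vd.2S,Vn.2S    */
#else
    /* Assumption: scalar stand-in, Heron's method from above. */
    for (int i = 0; i < 2; ++i) {
        float a = src[i];
        float x = a > 1.0f ? a : 1.0f; /* start at or above sqrt(a) */
        for (int k = 0; k < 40; ++k)
            x = 0.5f * (x + a / x);
        dst[i] = x;
    }
#endif
}
```

Unlike the estimate instructions, FSQRT is not an approximation that needs refinement, so no follow-up step is shown here.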

float32x4_t vsqrtq_f32 (float32x4_t a)Floating-point square root

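Description

Floating-point Square Root (vector). This instruction calculates the square root for each vector element in the source SIMD&FP register, places the result in a vector, and writes the vector to the destination SIMD&FP register.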

A64 Instruction

FSQRT Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPSqrt(element, FPCR);

V[d] = result;

Supported architectures

A64

float64x1_t vsqrt_f64 (float64x1_t a)Floating-point square root

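Description

Floating-point Square Root (vector). This instruction calculates the square root for each vector element in the source SIMD&FP register, places the result in a vector, and writes the vector to the destination SIMD&FP register.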

A64 Instruction

FSQRT Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPSqrt(element, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vsqrtq_f64 (float64x2_t a)Floating-point square root

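Description

Floating-point Square Root (vector). This instruction calculates the square root for each vector element in the source SIMD&FP register, places the result in a vector, and writes the vector to the destination SIMD&FP register.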

A64 Instruction

FSQRT Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPSqrt(element, FPCR);

V[d] = result;

Supported architectures

A64

uint32x2_t vrsqrte_u32 (uint32x2_t a)Unsigned reciprocal square root estimate

Description

Unsigned Reciprocal Square Root Estimate. This instruction reads each vector element from the source SIMD&FP register, calculates an approximate inverse square root for each value, places the result into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are unsigned integer values.

A64 Instruction

URSQRTE Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(32) element;

for e = 0 to elements-1
    element = Elem[operand, e, 32];
    Elem[result, e, 32] = UnsignedRSqrtEstimate(element);

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vrsqrteq_u32 (uint32x4_t a)Unsigned reciprocal square root estimate

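Description

Unsigned Reciprocal Square Root Estimate. This instruction reads each vector element from the source SIMD&FP register, calculates an approximate inverse square root for each value, places the result into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are unsigned integer values.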

A64 Instruction

URSQRTE Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(32) element;

for e = 0 to elements-1
    element = Elem[operand, e, 32];
    Elem[result, e, 32] = UnsignedRSqrtEstimate(element);

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vrsqrte_f32 (float32x2_t a)Floating-point reciprocal square root estimate

Description

Floating-point Reciprocal Square Root Estimate. This instruction calculates an approximate reciprocal square root for each vector element in the source SIMD&FP register, places the result in a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

FRSQRTE Vd.2S,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRSqrtEstimate(element, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64
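
The sketch below shows one way the reciprocal square root estimate might be called from C. The helper name `rsqrt_estimate2` is hypothetical. The `#else` branch is an assumed scalar stand-in that uses the well-known bit-level initial guess (magic constant `0x5f3759df`) so the code builds without NEON; it is coarser than FRSQRTE and not bit-exact with it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: dst[i] ~= 1 / sqrt(src[i]) for two positive floats. */
void rsqrt_estimate2(const float src[2], float dst[2])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    float32x2_t v = vld1_f32(src);      /* load two lanes          */
    vst1_f32(dst, vrsqrte_f32(v));      /* FRSQRTE Vd.2S,Vn.2S     */
#else
    /* Assumption: scalar stand-in, bit-trick guess within a few percent. */
    for (size_t i = 0; i < 2; ++i) {
        uint32_t bits;
        memcpy(&bits, &src[i], sizeof bits);   /* well-defined type pun */
        bits = 0x5f3759dfu - (bits >> 1);
        memcpy(&dst[i], &bits, sizeof bits);
    }
#endif
}
```

For an input a, the product dst * dst * a should be close to 1.0; the result is an estimate and is normally refined before use.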

float32x4_t vrsqrteq_f32 (float32x4_t a)Floating-point reciprocal square root estimate

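Description

Floating-point Reciprocal Square Root Estimate. This instruction calculates an approximate reciprocal square root for each vector element in the source SIMD&FP register, places the result in a vector, and writes the vector to the destination SIMD&FP register.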

A64 Instruction

FRSQRTE Vd.4S,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRSqrtEstimate(element, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64

float64x1_t vrsqrte_f64 (float64x1_t a)Floating-point reciprocal square root estimate

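Description

Floating-point Reciprocal Square Root Estimate. This instruction calculates an approximate reciprocal square root for each vector element in the source SIMD&FP register, places the result in a vector, and writes the vector to the destination SIMD&FP register.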

A64 Instruction

FRSQRTE Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRSqrtEstimate(element, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vrsqrteq_f64 (float64x2_t a)Floating-point reciprocal square root estimate

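Description

Floating-point Reciprocal Square Root Estimate. This instruction calculates an approximate reciprocal square root for each vector element in the source SIMD&FP register, places the result in a vector, and writes the vector to the destination SIMD&FP register.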

A64 Instruction

FRSQRTE Vd.2D,Vn.2D

Argument Preparation

a → Vn.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRSqrtEstimate(element, FPCR);

V[d] = result;

Supported architectures

A64

float32_t vrsqrtes_f32 (float32_t a)Floating-point reciprocal square root estimate

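Description

Floating-point Reciprocal Square Root Estimate. This instruction calculates an approximate reciprocal square root for each vector element in the source SIMD&FP register, places the result in a vector, and writes the vector to the destination SIMD&FP register.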

A64 Instruction

FRSQRTE Sd,Sn

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRSqrtEstimate(element, FPCR);

V[d] = result;

Supported architectures

A64

float64_t vrsqrted_f64 (float64_t a)Floating-point reciprocal square root estimate

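Description

Floating-point Reciprocal Square Root Estimate. This instruction calculates an approximate reciprocal square root for each vector element in the source SIMD&FP register, places the result in a vector, and writes the vector to the destination SIMD&FP register.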

A64 Instruction

FRSQRTE Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRSqrtEstimate(element, FPCR);

V[d] = result;

Supported architectures

A64

float32x2_t vrsqrts_f32 (float32x2_t a, float32x2_t b)Floating-point reciprocal square root step

Description

Floating-point Reciprocal Square Root Step. This instruction multiplies corresponding floating-point values in the vectors of the two source SIMD&FP registers, subtracts each of the products from 3.0, divides these results by 2.0, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

FRSQRTS Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPRSqrtStepFused(element1, element2);

V[d] = result;

Supported architectures

v7/A32/A64
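
Since `vrsqrts_f32(p, q)` yields (3 - p * q) / 2, it fits the Newton-Raphson update for 1/sqrt(a), x' = x * (3 - a * x * x) / 2, when called with p = a * x and q = x. The following is a minimal sketch under that reading; the helper name `rsqrt2` and the two refinement steps are illustrative choices, and the `#else` branch is an assumed scalar stand-in (bit-trick initial guess) that is not bit-exact with the NEON path.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] ~= 1 / sqrt(a[i]) for two positive floats. */
void rsqrt2(const float a[2], float out[2])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    float32x2_t va = vld1_f32(a);
    float32x2_t x  = vrsqrte_f32(va);                   /* estimate */
    /* x * (3 - (a*x) * x) / 2, applied twice */
    x = vmul_f32(x, vrsqrts_f32(vmul_f32(va, x), x));
    x = vmul_f32(x, vrsqrts_f32(vmul_f32(va, x), x));
    vst1_f32(out, x);
#else
    /* Assumption: scalar stand-in with a bit-trick starting guess. */
    for (size_t i = 0; i < 2; ++i) {
        uint32_t bits;
        float x;
        memcpy(&bits, &a[i], sizeof bits);
        bits = 0x5f3759dfu - (bits >> 1);
        memcpy(&x, &bits, sizeof x);
        x = x * (3.0f - (a[i] * x) * x) * 0.5f;
        x = x * (3.0f - (a[i] * x) * x) * 0.5f;
        out[i] = x;
    }
#endif
}
```

Multiplying the result by the input gives an approximate square root for positive inputs, which is a common reason to use this pair on targets where `vsqrt_f32` is not available.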

float32x4_t vrsqrtsq_f32 (float32x4_t a, float32x4_t b)Floating-point reciprocal square root step

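Description

Floating-point Reciprocal Square Root Step. This instruction multiplies corresponding floating-point values in the vectors of the two source SIMD&FP registers, subtracts each of the products from 3.0, divides these results by 2.0, places the results into a vector, and writes the vector to the destination SIMD&FP register.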

A64 Instruction

FRSQRTS Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPRSqrtStepFused(element1, element2);

V[d] = result;

Supported architectures

v7/A32/A64

float64x1_t vrsqrts_f64 (float64x1_t a, float64x1_t b)Floating-point reciprocal square root step

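Description

Floating-point Reciprocal Square Root Step. This instruction multiplies corresponding floating-point values in the vectors of the two source SIMD&FP registers, subtracts each of the products from 3.0, divides these results by 2.0, places the results into a vector, and writes the vector to the destination SIMD&FP register.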

A64 Instruction

FRSQRTS Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPRSqrtStepFused(element1, element2);

V[d] = result;

Supported architectures

A64

float64x2_t vrsqrtsq_f64 (float64x2_t a, float64x2_t b)Floating-point reciprocal square root step

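Description

Floating-point Reciprocal Square Root Step. This instruction multiplies corresponding floating-point values in the vectors of the two source SIMD&FP registers, subtracts each of the products from 3.0, divides these results by 2.0, places the results into a vector, and writes the vector to the destination SIMD&FP register.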

A64 Instruction

FRSQRTS Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPRSqrtStepFused(element1, element2);

V[d] = result;

Supported architectures

A64

float32_t vrsqrtss_f32 (float32_t a, float32_t b)Floating-point reciprocal square root step

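Description

Floating-point Reciprocal Square Root Step. This instruction multiplies corresponding floating-point values in the vectors of the two source SIMD&FP registers, subtracts each of the products from 3.0, divides these results by 2.0, places the results into a vector, and writes the vector to the destination SIMD&FP register.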

A64 Instruction

FRSQRTS Sd,Sn,Sm

Argument Preparation

a → Sn 

b → Sm

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPRSqrtStepFused(element1, element2);

V[d] = result;

Supported architectures

A64

float64_t vrsqrtsd_f64 (float64_t a, float64_t b)Floating-point reciprocal square root step

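Description

Floating-point Reciprocal Square Root Step. This instruction multiplies corresponding floating-point values in the vectors of the two source SIMD&FP registers, subtracts each of the products from 3.0, divides these results by 2.0, places the results into a vector, and writes the vector to the destination SIMD&FP register.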

A64 Instruction

FRSQRTS Dd,Dn,Dm

Argument Preparation

a → Dn 

b → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPRSqrtStepFused(element1, element2);

V[d] = result;

Supported architectures

A64

int8x8_t vmvn_s8 (int8x8_t a)Bitwise NOT

Description

Bitwise NOT (vector). This instruction reads each vector element from the source SIMD&FP register, places the bitwise inverse of each value into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

MVN Vd.8B,Vn.8B

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

MVN is an alias of the NOT (vector) instruction. The description of NOT gives the operational pseudocode for this instruction.

Supported architectures

v7/A32/A64
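
A minimal sketch of the bitwise NOT intrinsic in use is given below. The helper name `not8` is hypothetical; the `#else` branch is a scalar equivalent so the example builds without NEON.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: dst[i] = ~src[i] for eight bytes. */
void not8(const uint8_t src[8], uint8_t dst[8])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint8x8_t v = vld1_u8(src);    /* load eight bytes   */
    vst1_u8(dst, vmvn_u8(v));      /* MVN Vd.8B,Vn.8B    */
#else
    /* Scalar equivalent for non-NEON builds. */
    for (size_t i = 0; i < 8; ++i)
        dst[i] = (uint8_t)~src[i];
#endif
}
```

Because the operation is purely bitwise, the 16-bit and 32-bit variants map to the same byte-arranged instruction, as the entries that follow show.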

int8x16_t vmvnq_s8 (int8x16_t a)Bitwise NOT

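Description

Bitwise NOT (vector). This instruction reads each vector element from the source SIMD&FP register, places the bitwise inverse of each value into a vector, and writes the vector to the destination SIMD&FP register.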

A64 Instruction

MVN Vd.16B,Vn.16B

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

The description of NOT gives the operational pseudocode for this instruction.

Supported architectures

v7/A32/A64

int16x4_t vmvn_s16 (int16x4_t a)Bitwise NOT

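Description

Bitwise NOT (vector). This instruction reads each vector element from the source SIMD&FP register, places the bitwise inverse of each value into a vector, and writes the vector to the destination SIMD&FP register.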

A64 Instruction

MVN Vd.8B,Vn.8B

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

The description of NOT gives the operational pseudocode for this instruction.

Supported architectures

v7/A32/A64

int16x8_t vmvnq_s16 (int16x8_t a)Bitwise NOT

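Description

Bitwise NOT (vector). This instruction reads each vector element from the source SIMD&FP register, places the bitwise inverse of each value into a vector, and writes the vector to the destination SIMD&FP register.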

A64 Instruction

MVN Vd.16B,Vn.16B

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

The description of NOT gives the operational pseudocode for this instruction.

Supported architectures

v7/A32/A64

int32x2_t vmvn_s32 (int32x2_t a)Bitwise NOT

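Description

Bitwise NOT (vector). This instruction reads each vector element from the source SIMD&FP register, places the bitwise inverse of each value into a vector, and writes the vector to the destination SIMD&FP register.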

A64 Instruction

MVN Vd.8B,Vn.8B

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

The description of NOT gives the operational pseudocode for this instruction.

Supported architectures

v7/A32/A64

int32x4_t vmvnq_s32 (int32x4_t a)Bitwise NOT

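Description

Bitwise NOT (vector). This instruction reads each vector element from the source SIMD&FP register, places the bitwise inverse of each value into a vector, and writes the vector to the destination SIMD&FP register.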

A64 Instruction

MVN Vd.16B,Vn.16B

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

The description of NOT gives the operational pseudocode for this instruction.

Supported architectures

v7/A32/A64

uint8x8_t vmvn_u8 (uint8x8_t a)Bitwise NOT

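Description

Bitwise NOT (vector). This instruction reads each vector element from the source SIMD&FP register, places the bitwise inverse of each value into a vector, and writes the vector to the destination SIMD&FP register.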

A64 Instruction

MVN Vd.8B,Vn.8B

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

The description of NOT gives the operational pseudocode for this instruction.

Supported architectures

v7/A32/A64

uint8x16_t vmvnq_u8 (uint8x16_t a)Bitwise NOT

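Description

Bitwise NOT (vector). This instruction reads each vector element from the source SIMD&FP register, places the bitwise inverse of each value into a vector, and writes the vector to the destination SIMD&FP register.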

A64 Instruction

MVN Vd.16B,Vn.16B

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

The description of NOT gives the operational pseudocode for this instruction.

Supported architectures

v7/A32/A64

uint16x4_t vmvn_u16 (uint16x4_t a)Bitwise NOT

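Description

Bitwise NOT (vector). This instruction reads each vector element from the source SIMD&FP register, places the bitwise inverse of each value into a vector, and writes the vector to the destination SIMD&FP register.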

A64 Instruction

MVN Vd.8B,Vn.8B

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

The description of NOT gives the operational pseudocode for this instruction.

Supported architectures

v7/A32/A64

uint16x8_t vmvnq_u16 (uint16x8_t a)Bitwise NOT

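Description

Bitwise NOT (vector). This instruction reads each vector element from the source SIMD&FP register, places the bitwise inverse of each value into a vector, and writes the vector to the destination SIMD&FP register.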

A64 Instruction

MVN Vd.16B,Vn.16B

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

The description of NOT gives the operational pseudocode for this instruction.

Supported architectures

v7/A32/A64

uint32x2_t vmvn_u32 (uint32x2_t a)Bitwise NOT

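Description

Bitwise NOT (vector). This instruction reads each vector element from the source SIMD&FP register, places the bitwise inverse of each value into a vector, and writes the vector to the destination SIMD&FP register.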

A64 Instruction

MVN Vd.8B,Vn.8B

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

The description of NOT gives the operational pseudocode for this instruction.

Supported architectures

v7/A32/A64

uint32x4_t vmvnq_u32 (uint32x4_t a)Bitwise NOT

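Description

Bitwise NOT (vector). This instruction reads each vector element from the source SIMD&FP register, places the bitwise inverse of each value into a vector, and writes the vector to the destination SIMD&FP register.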

A64 Instruction

MVN Vd.16B,Vn.16B

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

The description of NOT gives the operational pseudocode for this instruction.

Supported architectures

v7/A32/A64

poly8x8_t vmvn_p8 (poly8x8_t a)Bitwise NOT

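Description

Bitwise NOT (vector). This instruction reads each vector element from the source SIMD&FP register, places the bitwise inverse of each value into a vector, and writes the vector to the destination SIMD&FP register.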

A64 Instruction

MVN Vd.8B,Vn.8B

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

The description of NOT gives the operational pseudocode for this instruction.

Supported architectures

v7/A32/A64

poly8x16_t vmvnq_p8 (poly8x16_t a)Bitwise NOT

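Description

Bitwise NOT (vector). This instruction reads each vector element from the source SIMD&FP register, places the bitwise inverse of each value into a vector, and writes the vector to the destination SIMD&FP register.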

A64 Instruction

MVN Vd.16B,Vn.16B

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

The description of NOT gives the operational pseudocode for this instruction.

Supported architectures

v7/A32/A64

int8x8_t vand_s8 (int8x8_t a, int8x8_t b)Bitwise AND

Description

Bitwise AND (vector). This instruction performs a bitwise AND between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

AND Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64
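
One common use of the AND intrinsics is masking. The sketch below keeps only the low nibble of each byte; the helper name `low_nibbles8` and the mask value are illustrative, and the `#else` branch is a scalar equivalent for builds without NEON.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: dst[i] = src[i] & 0x0F for eight bytes. */
void low_nibbles8(const uint8_t src[8], uint8_t dst[8])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint8x8_t v    = vld1_u8(src);       /* load eight bytes          */
    uint8x8_t mask = vdup_n_u8(0x0F);    /* same mask in every lane   */
    vst1_u8(dst, vand_u8(v, mask));      /* AND Vd.8B,Vn.8B,Vm.8B     */
#else
    /* Scalar equivalent for non-NEON builds. */
    for (size_t i = 0; i < 8; ++i)
        dst[i] = (uint8_t)(src[i] & 0x0F);
#endif
}
```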

int8x16_t vandq_s8 (int8x16_t a, int8x16_t b)Bitwise AND

Description

Bitwise AND (vector). This instruction performs a bitwise AND between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

AND Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vand_s16 (int16x4_t a, int16x4_t b)Bitwise AND

Description

Bitwise AND (vector). This instruction performs a bitwise AND between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

AND Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vandq_s16 (int16x8_t a, int16x8_t b)Bitwise AND

Description

Bitwise AND (vector). This instruction performs a bitwise AND between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

AND Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vand_s32 (int32x2_t a, int32x2_t b)Bitwise AND

Description

Bitwise AND (vector). This instruction performs a bitwise AND between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

AND Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vandq_s32 (int32x4_t a, int32x4_t b)Bitwise AND

Description

Bitwise AND (vector). This instruction performs a bitwise AND between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

AND Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vand_s64 (int64x1_t a, int64x1_t b)Bitwise AND

Description

Bitwise AND (vector). This instruction performs a bitwise AND between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

AND Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vandq_s64 (int64x2_t a, int64x2_t b)Bitwise AND

Description

Bitwise AND (vector). This instruction performs a bitwise AND between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

AND Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vand_u8 (uint8x8_t a, uint8x8_t b)Bitwise AND

Description

Bitwise AND (vector). This instruction performs a bitwise AND between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

AND Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vandq_u8 (uint8x16_t a, uint8x16_t b)Bitwise AND

Description

Bitwise AND (vector). This instruction performs a bitwise AND between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

AND Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vand_u16 (uint16x4_t a, uint16x4_t b)Bitwise AND

Description

Bitwise AND (vector). This instruction performs a bitwise AND between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

AND Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vandq_u16 (uint16x8_t a, uint16x8_t b)Bitwise AND

Description

Bitwise AND (vector). This instruction performs a bitwise AND between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

AND Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vand_u32 (uint32x2_t a, uint32x2_t b)Bitwise AND

Description

Bitwise AND (vector). This instruction performs a bitwise AND between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

AND Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vandq_u32 (uint32x4_t a, uint32x4_t b)Bitwise AND

Description

Bitwise AND (vector). This instruction performs a bitwise AND between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

AND Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vand_u64 (uint64x1_t a, uint64x1_t b)Bitwise AND

Description

Bitwise AND (vector). This instruction performs a bitwise AND between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

AND Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vandq_u64 (uint64x2_t a, uint64x2_t b)Bitwise AND

Description

Bitwise AND (vector). This instruction performs a bitwise AND between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

AND Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

int8x8_t vorr_s8 (int8x8_t a, int8x8_t b)Bitwise inclusive OR

Description

Bitwise inclusive OR (vector, register). This instruction performs a bitwise OR between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORR Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64
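
The OR intrinsics are often used to merge bit fields held in separate vectors. As a small sketch, the code below combines a vector of high nibbles with a vector of low nibbles; the helper name `merge_nibbles8` is hypothetical and it assumes the two inputs occupy disjoint bits. The `#else` branch is a scalar equivalent for builds without NEON.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: dst[i] = hi[i] | lo[i] for eight bytes. */
void merge_nibbles8(const uint8_t hi[8], const uint8_t lo[8], uint8_t dst[8])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint8x8_t vhi = vld1_u8(hi);         /* e.g. 0xA0, 0xB0, ...  */
    uint8x8_t vlo = vld1_u8(lo);         /* e.g. 0x01, 0x02, ...  */
    vst1_u8(dst, vorr_u8(vhi, vlo));     /* ORR Vd.8B,Vn.8B,Vm.8B */
#else
    /* Scalar equivalent for non-NEON builds. */
    for (size_t i = 0; i < 8; ++i)
        dst[i] = (uint8_t)(hi[i] | lo[i]);
#endif
}
```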

int8x16_t vorrq_s8 (int8x16_t a, int8x16_t b)Bitwise inclusive OR

Description

Bitwise inclusive OR (vector, register). This instruction performs a bitwise OR between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORR Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vorr_s16 (int16x4_t a, int16x4_t b)Bitwise inclusive OR

Description

Bitwise inclusive OR (vector, register). This instruction performs a bitwise OR between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORR Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vorrq_s16 (int16x8_t a, int16x8_t b)Bitwise inclusive OR

Description

Bitwise inclusive OR (vector, register). This instruction performs a bitwise OR between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORR Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vorr_s32 (int32x2_t a, int32x2_t b)Bitwise inclusive OR

Description

Bitwise inclusive OR (vector, register). This instruction performs a bitwise OR between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORR Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vorrq_s32 (int32x4_t a, int32x4_t b)Bitwise inclusive OR

Description

Bitwise inclusive OR (vector, register). This instruction performs a bitwise OR between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORR Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vorr_s64 (int64x1_t a, int64x1_t b)Bitwise inclusive OR

Description

Bitwise inclusive OR (vector, register). This instruction performs a bitwise OR between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORR Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vorrq_s64 (int64x2_t a, int64x2_t b)Bitwise inclusive OR

Description

Bitwise inclusive OR (vector, register). This instruction performs a bitwise OR between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORR Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vorr_u8 (uint8x8_t a, uint8x8_t b)Bitwise inclusive OR

Description

Bitwise inclusive OR (vector, register). This instruction performs a bitwise OR between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORR Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vorrq_u8 (uint8x16_t a, uint8x16_t b)Bitwise inclusive OR

Description

Bitwise inclusive OR (vector, register). This instruction performs a bitwise OR between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORR Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vorr_u16 (uint16x4_t a, uint16x4_t b)Bitwise inclusive OR

Description

Bitwise inclusive OR (vector, register). This instruction performs a bitwise OR between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORR Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vorrq_u16 (uint16x8_t a, uint16x8_t b)Bitwise inclusive OR

Description

Bitwise inclusive OR (vector, register). This instruction performs a bitwise OR between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORR Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vorr_u32 (uint32x2_t a, uint32x2_t b)Bitwise inclusive OR

Description

Bitwise inclusive OR (vector, register). This instruction performs a bitwise OR between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORR Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vorrq_u32 (uint32x4_t a, uint32x4_t b)Bitwise inclusive OR

Description

Bitwise inclusive OR (vector, register). This instruction performs a bitwise OR between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORR Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vorr_u64 (uint64x1_t a, uint64x1_t b)Bitwise inclusive OR

Description

Bitwise inclusive OR (vector, register). This instruction performs a bitwise OR between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORR Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vorrq_u64 (uint64x2_t a, uint64x2_t b)Bitwise inclusive OR

Description

Bitwise inclusive OR (vector, register). This instruction performs a bitwise OR between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORR Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64
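
The vorr entries above all map to the same byte-wise ORR instruction, so the element type only affects the C type of the operands, not the result bits. The following is a minimal sketch of calling the 128-bit unsigned byte form. The helper names `or_bytes16` and `or_bytes16_selfcheck` are hypothetical, and the scalar fallback is an assumption added so that the sketch also builds on hosts without NEON.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] = a[i] | b[i] for 16 bytes. */
void or_bytes16(const uint8_t a[16], const uint8_t b[16], uint8_t out[16])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint8x16_t va = vld1q_u8(a);       /* load 16 byte lanes from a */
    uint8x16_t vb = vld1q_u8(b);       /* load 16 byte lanes from b */
    vst1q_u8(out, vorrq_u8(va, vb));   /* ORR Vd.16B,Vn.16B,Vm.16B */
#else
    /* Scalar fallback (assumption: used only when NEON is unavailable). */
    for (int i = 0; i < 16; ++i)
        out[i] = (uint8_t)(a[i] | b[i]);
#endif
}

/* Self-check: the low nibble comes from a, the high nibble from b. */
void or_bytes16_selfcheck(void)
{
    uint8_t a[16], b[16], out[16];
    for (int i = 0; i < 16; ++i) {
        a[i] = 0x0F;
        b[i] = (uint8_t)(i << 4);
    }
    or_bytes16(a, b, out);
    for (int i = 0; i < 16; ++i)
        assert(out[i] == (uint8_t)(0x0F | (i << 4)));
}
```

Because the operation is purely bitwise, the same pattern should carry over to the other vorr and vorrq variants by swapping the vector type and the matching load and store intrinsics.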

int8x8_t veor_s8 (int8x8_t a, int8x8_t b)Bitwise exclusive OR

Description

Bitwise Exclusive OR (vector). This instruction performs a bitwise Exclusive OR operation between the two source SIMD&FP registers, and places the result in the destination SIMD&FP register.

A64 Instruction

EOR Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand2;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand2 = Zeros();
operand3 = Ones();
V[d] = operand1 EOR ((operand2 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64
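
The Operation pseudocode for EOR is written in a more general form than a plain exclusive OR. With operand2 set to Zeros() and operand3 set to Ones(), the expression operand1 EOR ((operand2 EOR operand4) AND operand3) reduces to V[m] EOR V[n]. The sketch below models that reduction on a single 64-bit value; the function name `eor_pseudocode_model` is hypothetical and the model is only an illustration of the arithmetic, not of the hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of the Operation pseudocode on one 64-bit register image. */
uint64_t eor_pseudocode_model(uint64_t vn, uint64_t vm)
{
    uint64_t operand1 = vm;           /* operand1 = V[m]  */
    uint64_t operand2 = 0;            /* operand2 = Zeros() */
    uint64_t operand3 = UINT64_MAX;   /* operand3 = Ones()  */
    uint64_t operand4 = vn;           /* operand4 = V[n]  */

    /* (0 ^ vn) & all-ones == vn, so the result is vm ^ vn. */
    return operand1 ^ ((operand2 ^ operand4) & operand3);
}

/* Self-check: the model agrees with a plain XOR for a few inputs. */
void eor_pseudocode_selfcheck(void)
{
    const uint64_t samples[] = { 0, 1, 0xF0F0F0F0F0F0F0F0u, UINT64_MAX };
    const int count = (int)(sizeof samples / sizeof samples[0]);
    for (int i = 0; i < count; ++i)
        for (int j = 0; j < count; ++j)
            assert(eor_pseudocode_model(samples[i], samples[j])
                   == (samples[i] ^ samples[j]));
}
```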

int8x16_t veorq_s8 (int8x16_t a, int8x16_t b)Bitwise exclusive OR

Description

Bitwise Exclusive OR (vector). This instruction performs a bitwise Exclusive OR operation between the two source SIMD&FP registers, and places the result in the destination SIMD&FP register.

A64 Instruction

EOR Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand2;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand2 = Zeros();
operand3 = Ones();
V[d] = operand1 EOR ((operand2 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

int16x4_t veor_s16 (int16x4_t a, int16x4_t b)Bitwise exclusive OR

Description

Bitwise Exclusive OR (vector). This instruction performs a bitwise Exclusive OR operation between the two source SIMD&FP registers, and places the result in the destination SIMD&FP register.

A64 Instruction

EOR Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand2;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand2 = Zeros();
operand3 = Ones();
V[d] = operand1 EOR ((operand2 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

int16x8_t veorq_s16 (int16x8_t a, int16x8_t b)Bitwise exclusive OR

Description

Bitwise Exclusive OR (vector). This instruction performs a bitwise Exclusive OR operation between the two source SIMD&FP registers, and places the result in the destination SIMD&FP register.

A64 Instruction

EOR Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand2;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand2 = Zeros();
operand3 = Ones();
V[d] = operand1 EOR ((operand2 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

int32x2_t veor_s32 (int32x2_t a, int32x2_t b)Bitwise exclusive OR

Description

Bitwise Exclusive OR (vector). This instruction performs a bitwise Exclusive OR operation between the two source SIMD&FP registers, and places the result in the destination SIMD&FP register.

A64 Instruction

EOR Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand2;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand2 = Zeros();
operand3 = Ones();
V[d] = operand1 EOR ((operand2 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

int32x4_t veorq_s32 (int32x4_t a, int32x4_t b)Bitwise exclusive OR

Description

Bitwise Exclusive OR (vector). This instruction performs a bitwise Exclusive OR operation between the two source SIMD&FP registers, and places the result in the destination SIMD&FP register.

A64 Instruction

EOR Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand2;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand2 = Zeros();
operand3 = Ones();
V[d] = operand1 EOR ((operand2 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

int64x1_t veor_s64 (int64x1_t a, int64x1_t b)Bitwise exclusive OR

Description

Bitwise Exclusive OR (vector). This instruction performs a bitwise Exclusive OR operation between the two source SIMD&FP registers, and places the result in the destination SIMD&FP register.

A64 Instruction

EOR Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand2;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand2 = Zeros();
operand3 = Ones();
V[d] = operand1 EOR ((operand2 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

int64x2_t veorq_s64 (int64x2_t a, int64x2_t b)Bitwise exclusive OR

Description

Bitwise Exclusive OR (vector). This instruction performs a bitwise Exclusive OR operation between the two source SIMD&FP registers, and places the result in the destination SIMD&FP register.

A64 Instruction

EOR Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand2;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand2 = Zeros();
operand3 = Ones();
V[d] = operand1 EOR ((operand2 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

uint8x8_t veor_u8 (uint8x8_t a, uint8x8_t b)Bitwise exclusive OR

Description

Bitwise Exclusive OR (vector). This instruction performs a bitwise Exclusive OR operation between the two source SIMD&FP registers, and places the result in the destination SIMD&FP register.

A64 Instruction

EOR Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand2;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand2 = Zeros();
operand3 = Ones();
V[d] = operand1 EOR ((operand2 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

uint8x16_t veorq_u8 (uint8x16_t a, uint8x16_t b)Bitwise exclusive OR

Description

Bitwise Exclusive OR (vector). This instruction performs a bitwise Exclusive OR operation between the two source SIMD&FP registers, and places the result in the destination SIMD&FP register.

A64 Instruction

EOR Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand2;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand2 = Zeros();
operand3 = Ones();
V[d] = operand1 EOR ((operand2 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

uint16x4_t veor_u16 (uint16x4_t a, uint16x4_t b)Bitwise exclusive OR

Description

Bitwise Exclusive OR (vector). This instruction performs a bitwise Exclusive OR operation between the two source SIMD&FP registers, and places the result in the destination SIMD&FP register.

A64 Instruction

EOR Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand2;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand2 = Zeros();
operand3 = Ones();
V[d] = operand1 EOR ((operand2 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

uint16x8_t veorq_u16 (uint16x8_t a, uint16x8_t b)Bitwise exclusive OR

Description

Bitwise Exclusive OR (vector). This instruction performs a bitwise Exclusive OR operation between the two source SIMD&FP registers, and places the result in the destination SIMD&FP register.

A64 Instruction

EOR Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand2;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand2 = Zeros();
operand3 = Ones();
V[d] = operand1 EOR ((operand2 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

uint32x2_t veor_u32 (uint32x2_t a, uint32x2_t b)Bitwise exclusive OR

Description

Bitwise Exclusive OR (vector). This instruction performs a bitwise Exclusive OR operation between the two source SIMD&FP registers, and places the result in the destination SIMD&FP register.

A64 Instruction

EOR Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand2;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand2 = Zeros();
operand3 = Ones();
V[d] = operand1 EOR ((operand2 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

uint32x4_t veorq_u32 (uint32x4_t a, uint32x4_t b)Bitwise exclusive OR

Description

Bitwise Exclusive OR (vector). This instruction performs a bitwise Exclusive OR operation between the two source SIMD&FP registers, and places the result in the destination SIMD&FP register.

A64 Instruction

EOR Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand2;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand2 = Zeros();
operand3 = Ones();
V[d] = operand1 EOR ((operand2 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

uint64x1_t veor_u64 (uint64x1_t a, uint64x1_t b)Bitwise exclusive OR

Description

Bitwise Exclusive OR (vector). This instruction performs a bitwise Exclusive OR operation between the two source SIMD&FP registers, and places the result in the destination SIMD&FP register.

A64 Instruction

EOR Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand2;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand2 = Zeros();
operand3 = Ones();
V[d] = operand1 EOR ((operand2 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

uint64x2_t veorq_u64 (uint64x2_t a, uint64x2_t b)Bitwise exclusive OR

Description

Bitwise Exclusive OR (vector). This instruction performs a bitwise Exclusive OR operation between the two source SIMD&FP registers, and places the result in the destination SIMD&FP register.

A64 Instruction

EOR Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand2;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand2 = Zeros();
operand3 = Ones();
V[d] = operand1 EOR ((operand2 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64
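
As an illustration of the veor family, the sketch below XORs two 8-byte buffers with the 64-bit unsigned byte form and checks that applying the same mask twice restores the input. The helper names `xor_bytes8` and `xor_roundtrip_selfcheck` are hypothetical, and the scalar fallback is an assumption added so that the sketch also builds on hosts without NEON.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] = a[i] ^ b[i] for 8 bytes. */
void xor_bytes8(const uint8_t a[8], const uint8_t b[8], uint8_t out[8])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint8x8_t va = vld1_u8(a);        /* load 8 byte lanes from a */
    uint8x8_t vb = vld1_u8(b);        /* load 8 byte lanes from b */
    vst1_u8(out, veor_u8(va, vb));    /* EOR Vd.8B,Vn.8B,Vm.8B */
#else
    /* Scalar fallback (assumption: used only when NEON is unavailable). */
    for (int i = 0; i < 8; ++i)
        out[i] = (uint8_t)(a[i] ^ b[i]);
#endif
}

/* Self-check: XOR with the same mask twice gives back the original data. */
void xor_roundtrip_selfcheck(void)
{
    const uint8_t data[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
    const uint8_t mask[8] = { 0xA5, 0xA5, 0xA5, 0xA5, 0xA5, 0xA5, 0xA5, 0xA5 };
    uint8_t masked[8], restored[8];

    xor_bytes8(data, mask, masked);
    xor_bytes8(masked, mask, restored);
    assert(memcmp(restored, data, sizeof data) == 0);
}
```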

int8x8_t vbic_s8 (int8x8_t a, int8x8_t b)Bitwise bit clear

Description

Bitwise bit Clear (vector, register). This instruction performs a bitwise AND between the first source SIMD&FP register and the complement of the second source SIMD&FP register, and writes the result to the destination SIMD&FP register.

A64 Instruction

BIC Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

int8x16_t vbicq_s8 (int8x16_t a, int8x16_t b)Bitwise bit clear

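Description

Bitwise bit Clear (vector, register). This instruction performs a bitwise AND between the first source SIMD&FP register and the complement of the second source SIMD&FP register, and writes the result to the destination SIMD&FP register.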

A64 Instruction

BIC Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vbic_s16 (int16x4_t a, int16x4_t b)Bitwise bit clear

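Description

Bitwise bit Clear (vector, register). This instruction performs a bitwise AND between the first source SIMD&FP register and the complement of the second source SIMD&FP register, and writes the result to the destination SIMD&FP register.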

A64 Instruction

BIC Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vbicq_s16 (int16x8_t a, int16x8_t b)Bitwise bit clear

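Description

Bitwise bit Clear (vector, register). This instruction performs a bitwise AND between the first source SIMD&FP register and the complement of the second source SIMD&FP register, and writes the result to the destination SIMD&FP register.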

A64 Instruction

BIC Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vbic_s32 (int32x2_t a, int32x2_t b)Bitwise bit clear

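Description

Bitwise bit Clear (vector, register). This instruction performs a bitwise AND between the first source SIMD&FP register and the complement of the second source SIMD&FP register, and writes the result to the destination SIMD&FP register.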

A64 Instruction

BIC Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vbicq_s32 (int32x4_t a, int32x4_t b)Bitwise bit clear

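Description

Bitwise bit Clear (vector, register). This instruction performs a bitwise AND between the first source SIMD&FP register and the complement of the second source SIMD&FP register, and writes the result to the destination SIMD&FP register.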

A64 Instruction

BIC Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vbic_s64 (int64x1_t a, int64x1_t b)Bitwise bit clear

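Description

Bitwise bit Clear (vector, register). This instruction performs a bitwise AND between the first source SIMD&FP register and the complement of the second source SIMD&FP register, and writes the result to the destination SIMD&FP register.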

A64 Instruction

BIC Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vbicq_s64 (int64x2_t a, int64x2_t b)Bitwise bit clear

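Description

Bitwise bit Clear (vector, register). This instruction performs a bitwise AND between the first source SIMD&FP register and the complement of the second source SIMD&FP register, and writes the result to the destination SIMD&FP register.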

A64 Instruction

BIC Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vbic_u8 (uint8x8_t a, uint8x8_t b)Bitwise bit clear

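Description

Bitwise bit Clear (vector, register). This instruction performs a bitwise AND between the first source SIMD&FP register and the complement of the second source SIMD&FP register, and writes the result to the destination SIMD&FP register.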

A64 Instruction

BIC Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vbicq_u8 (uint8x16_t a, uint8x16_t b)Bitwise bit clear

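Description

Bitwise bit Clear (vector, register). This instruction performs a bitwise AND between the first source SIMD&FP register and the complement of the second source SIMD&FP register, and writes the result to the destination SIMD&FP register.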

A64 Instruction

BIC Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vbic_u16 (uint16x4_t a, uint16x4_t b)Bitwise bit clear

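Description

Bitwise bit Clear (vector, register). This instruction performs a bitwise AND between the first source SIMD&FP register and the complement of the second source SIMD&FP register, and writes the result to the destination SIMD&FP register.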

A64 Instruction

BIC Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vbicq_u16 (uint16x8_t a, uint16x8_t b)Bitwise bit clear

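Description

Bitwise bit Clear (vector, register). This instruction performs a bitwise AND between the first source SIMD&FP register and the complement of the second source SIMD&FP register, and writes the result to the destination SIMD&FP register.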

A64 Instruction

BIC Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vbic_u32 (uint32x2_t a, uint32x2_t b)Bitwise bit clear

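Description

Bitwise bit Clear (vector, register). This instruction performs a bitwise AND between the first source SIMD&FP register and the complement of the second source SIMD&FP register, and writes the result to the destination SIMD&FP register.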

A64 Instruction

BIC Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vbicq_u32 (uint32x4_t a, uint32x4_t b)Bitwise bit clear

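Description

Bitwise bit Clear (vector, register). This instruction performs a bitwise AND between the first source SIMD&FP register and the complement of the second source SIMD&FP register, and writes the result to the destination SIMD&FP register.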

A64 Instruction

BIC Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vbic_u64 (uint64x1_t a, uint64x1_t b)Bitwise bit clear

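Description

Bitwise bit Clear (vector, register). This instruction performs a bitwise AND between the first source SIMD&FP register and the complement of the second source SIMD&FP register, and writes the result to the destination SIMD&FP register.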

A64 Instruction

BIC Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vbicq_u64 (uint64x2_t a, uint64x2_t b)Bitwise bit clear

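Description

Bitwise bit Clear (vector, register). This instruction performs a bitwise AND between the first source SIMD&FP register and the complement of the second source SIMD&FP register, and writes the result to the destination SIMD&FP register.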

A64 Instruction

BIC Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 AND operand2;
V[d] = result;

Supported architectures

v7/A32/A64
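
For the vbic family the operand order matters: per the Operation pseudocode, the second argument is complemented, so the bits set in b are the ones cleared from a. The sketch below shows this with the 128-bit unsigned 32-bit form. The helper names `clear_bits4` and `clear_bits4_selfcheck` are hypothetical, and the scalar fallback is an assumption added so that the sketch also builds on hosts without NEON.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] = value[i] & ~mask[i] for 4 words. */
void clear_bits4(const uint32_t value[4], const uint32_t mask[4],
                 uint32_t out[4])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint32x4_t vv = vld1q_u32(value);     /* a: bits to keep or clear */
    uint32x4_t vm = vld1q_u32(mask);      /* b: set bits select what is cleared */
    vst1q_u32(out, vbicq_u32(vv, vm));    /* BIC Vd.16B,Vn.16B,Vm.16B */
#else
    /* Scalar fallback (assumption: used only when NEON is unavailable). */
    for (int i = 0; i < 4; ++i)
        out[i] = value[i] & ~mask[i];
#endif
}

/* Self-check: clearing the low byte of each word leaves the upper 24 bits. */
void clear_bits4_selfcheck(void)
{
    const uint32_t value[4] = { 0x12345678u, 0xFFFFFFFFu, 0x000000FFu, 0u };
    const uint32_t mask[4]  = { 0xFFu, 0xFFu, 0xFFu, 0xFFu };
    uint32_t out[4];

    clear_bits4(value, mask, out);
    assert(out[0] == 0x12345600u);
    assert(out[1] == 0xFFFFFF00u);
    assert(out[2] == 0u);
    assert(out[3] == 0u);
}
```

Swapping the two arguments gives a different result in general, which is the main way this intrinsic differs from the commutative OR and exclusive OR entries.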

int8x8_t vorn_s8 (int8x8_t a, int8x8_t b)Bitwise inclusive OR NOT

Description

Bitwise inclusive OR NOT (vector). This instruction performs a bitwise OR NOT between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORN Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

int8x16_t vornq_s8 (int8x16_t a, int8x16_t b)Bitwise inclusive OR NOT

Description

Bitwise inclusive OR NOT (vector). This instruction performs a bitwise OR NOT between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORN Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vorn_s16 (int16x4_t a, int16x4_t b)Bitwise inclusive OR NOT

Description

Bitwise inclusive OR NOT (vector). This instruction performs a bitwise OR NOT between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORN Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vornq_s16 (int16x8_t a, int16x8_t b)Bitwise inclusive OR NOT

Description

Bitwise inclusive OR NOT (vector). This instruction performs a bitwise OR NOT between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORN Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vorn_s32 (int32x2_t a, int32x2_t b)Bitwise inclusive OR NOT

Description

Bitwise inclusive OR NOT (vector). This instruction performs a bitwise OR NOT between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORN Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vornq_s32 (int32x4_t a, int32x4_t b)Bitwise inclusive OR NOT

Description

Bitwise inclusive OR NOT (vector). This instruction performs a bitwise OR NOT between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORN Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vorn_s64 (int64x1_t a, int64x1_t b)Bitwise inclusive OR NOT

Description

Bitwise inclusive OR NOT (vector). This instruction performs a bitwise OR NOT between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORN Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vornq_s64 (int64x2_t a, int64x2_t b)Bitwise inclusive OR NOT

Description

Bitwise inclusive OR NOT (vector). This instruction performs a bitwise OR NOT between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORN Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vorn_u8 (uint8x8_t a, uint8x8_t b)Bitwise inclusive OR NOT

Description

Bitwise inclusive OR NOT (vector). This instruction performs a bitwise OR NOT between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORN Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vornq_u8 (uint8x16_t a, uint8x16_t b)Bitwise inclusive OR NOT

Description

Bitwise inclusive OR NOT (vector). This instruction performs a bitwise OR NOT between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORN Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vorn_u16 (uint16x4_t a, uint16x4_t b)Bitwise inclusive OR NOT

Description

Bitwise inclusive OR NOT (vector). This instruction performs a bitwise OR NOT between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORN Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vornq_u16 (uint16x8_t a, uint16x8_t b)Bitwise inclusive OR NOT

Description

Bitwise inclusive OR NOT (vector). This instruction performs a bitwise OR NOT between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORN Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vorn_u32 (uint32x2_t a, uint32x2_t b)Bitwise inclusive OR NOT

Description

Bitwise inclusive OR NOT (vector). This instruction performs a bitwise OR NOT between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORN Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vornq_u32 (uint32x4_t a, uint32x4_t b)Bitwise inclusive OR NOT

Description

Bitwise inclusive OR NOT (vector). This instruction performs a bitwise OR NOT between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORN Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vorn_u64 (uint64x1_t a, uint64x1_t b)Bitwise inclusive OR NOT

Description

Bitwise inclusive OR NOT (vector). This instruction performs a bitwise OR NOT between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORN Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vornq_u64 (uint64x2_t a, uint64x2_t b)Bitwise inclusive OR NOT

Description

Bitwise inclusive OR NOT (vector). This instruction performs a bitwise OR NOT between the two source SIMD&FP registers, and writes the result to the destination SIMD&FP register.

A64 Instruction

ORN Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

operand2 = NOT(operand2);

result = operand1 OR operand2;

V[d] = result;

Supported architectures

v7/A32/A64

int8x8_t vbsl_s8 (uint8x8_t a, int8x8_t b, int8x8_t c)Bitwise select

Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.

A64 Instruction

BSL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8B 

b → Vn.8B 

c → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64
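
The EOR/AND expression in the Operation pseudocode is a branch-free way of writing "take the bit from b where the mask a is 1, otherwise from c". The sketch below is a scalar model on a single 64-bit value, assuming the argument mapping shown under Argument Preparation (a to Vd, b to Vn, c to Vm). The function names bsl_model, bsl_select_form and bsl_model_self_check are hypothetical and not part of arm_neon.h.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of BSL on one 64-bit register. Argument names follow the
   intrinsic: a is the mask (Vd), b is Vn, c is Vm. */
uint64_t bsl_model(uint64_t a, uint64_t b, uint64_t c) {
    /* operand1 = V[m] = c, operand4 = V[n] = b, operand3 = V[d] = a */
    return c ^ ((c ^ b) & a);
}

/* The same selection written as an explicit mask-and-merge. */
uint64_t bsl_select_form(uint64_t a, uint64_t b, uint64_t c) {
    return (b & a) | (c & ~a);
}

/* Checks that both forms agree on an alternating byte mask. */
void bsl_model_self_check(void) {
    uint64_t mask = 0xFF00FF00FF00FF00ULL;
    uint64_t b = 0x1111111111111111ULL;
    uint64_t c = 0x2222222222222222ULL;
    assert(bsl_model(mask, b, c) == 0x1122112211221122ULL);
    assert(bsl_model(mask, b, c) == bsl_select_form(mask, b, c));
}
```

The selection is per bit, not per lane, which is why every vbsl variant maps to the same byte-arranged BSL instruction regardless of element type.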

int8x16_t vbslq_s8 (uint8x16_t a, int8x16_t b, int8x16_t c)Bitwise select

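Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.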

A64 Instruction

BSL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vd.16B 

b → Vn.16B 

c → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

int16x4_t vbsl_s16 (uint16x4_t a, int16x4_t b, int16x4_t c)Bitwise select

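Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.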

A64 Instruction

BSL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8B 

b → Vn.8B 

c → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

int16x8_t vbslq_s16 (uint16x8_t a, int16x8_t b, int16x8_t c)Bitwise select

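Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.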

A64 Instruction

BSL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vd.16B 

b → Vn.16B 

c → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

int32x2_t vbsl_s32 (uint32x2_t a, int32x2_t b, int32x2_t c)Bitwise select

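Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.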

A64 Instruction

BSL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8B 

b → Vn.8B 

c → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

int32x4_t vbslq_s32 (uint32x4_t a, int32x4_t b, int32x4_t c)Bitwise select

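Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.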

A64 Instruction

BSL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vd.16B 

b → Vn.16B 

c → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

int64x1_t vbsl_s64 (uint64x1_t a, int64x1_t b, int64x1_t c)Bitwise select

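Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.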

A64 Instruction

BSL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8B 

b → Vn.8B 

c → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

int64x2_t vbslq_s64 (uint64x2_t a, int64x2_t b, int64x2_t c)Bitwise select

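Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.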

A64 Instruction

BSL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vd.16B 

b → Vn.16B 

c → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

uint8x8_t vbsl_u8 (uint8x8_t a, uint8x8_t b, uint8x8_t c)Bitwise select

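Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.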

A64 Instruction

BSL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8B 

b → Vn.8B 

c → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

uint8x16_t vbslq_u8 (uint8x16_t a, uint8x16_t b, uint8x16_t c)Bitwise select

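Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.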

A64 Instruction

BSL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vd.16B 

b → Vn.16B 

c → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

uint16x4_t vbsl_u16 (uint16x4_t a, uint16x4_t b, uint16x4_t c)Bitwise select

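Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.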

A64 Instruction

BSL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8B 

b → Vn.8B 

c → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

uint16x8_t vbslq_u16 (uint16x8_t a, uint16x8_t b, uint16x8_t c)Bitwise select

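Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.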

A64 Instruction

BSL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vd.16B 

b → Vn.16B 

c → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

uint32x2_t vbsl_u32 (uint32x2_t a, uint32x2_t b, uint32x2_t c)Bitwise select

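Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.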

A64 Instruction

BSL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8B 

b → Vn.8B 

c → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

uint32x4_t vbslq_u32 (uint32x4_t a, uint32x4_t b, uint32x4_t c)Bitwise select

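Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.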

A64 Instruction

BSL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vd.16B 

b → Vn.16B 

c → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

uint64x1_t vbsl_u64 (uint64x1_t a, uint64x1_t b, uint64x1_t c)Bitwise select

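Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.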

A64 Instruction

BSL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8B 

b → Vn.8B 

c → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

uint64x2_t vbslq_u64 (uint64x2_t a, uint64x2_t b, uint64x2_t c)Bitwise select

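Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.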

A64 Instruction

BSL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vd.16B 

b → Vn.16B 

c → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

poly64x1_t vbsl_p64 (poly64x1_t a, poly64x1_t b, poly64x1_t c)Bitwise select

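Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.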

A64 Instruction

BSL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8B 

b → Vn.8B 

c → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

A32/A64

poly64x2_t vbslq_p64 (poly64x2_t a, poly64x2_t b, poly64x2_t c)Bitwise select

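Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.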

A64 Instruction

BSL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vd.16B 

b → Vn.16B 

c → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

A32/A64

float32x2_t vbsl_f32 (uint32x2_t a, float32x2_t b, float32x2_t c)Bitwise select

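Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.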

A64 Instruction

BSL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8B 

b → Vn.8B 

c → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

float32x4_t vbslq_f32 (uint32x4_t a, float32x4_t b, float32x4_t c)Bitwise select

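Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.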

A64 Instruction

BSL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vd.16B 

b → Vn.16B 

c → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

poly8x8_t vbsl_p8 (uint8x8_t a, poly8x8_t b, poly8x8_t c)Bitwise select

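Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.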

A64 Instruction

BSL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8B 

b → Vn.8B 

c → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

poly8x16_t vbslq_p8 (uint8x16_t a, poly8x16_t b, poly8x16_t c)Bitwise select

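Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.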

A64 Instruction

BSL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vd.16B 

b → Vn.16B 

c → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

poly16x4_t vbsl_p16 (uint16x4_t a, poly16x4_t b, poly16x4_t c)Bitwise select

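Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.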

A64 Instruction

BSL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8B 

b → Vn.8B 

c → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

poly16x8_t vbslq_p16 (uint16x8_t a, poly16x8_t b, poly16x8_t c)Bitwise select

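Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.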

A64 Instruction

BSL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vd.16B 

b → Vn.16B 

c → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

float64x1_t vbsl_f64 (uint64x1_t a, float64x1_t b, float64x1_t c)Bitwise select

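Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.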

A64 Instruction

BSL Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vd.8B 

b → Vn.8B 

c → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

A64

float64x2_t vbslq_f64 (uint64x2_t a, float64x2_t b, float64x2_t c)Bitwise select

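Description

Bitwise Select. This instruction sets each bit in the destination SIMD&FP register to the corresponding bit from the first source SIMD&FP register when the original destination bit was 1, otherwise from the second source SIMD&FP register.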

A64 Instruction

BSL Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vd.16B 

b → Vn.16B 

c → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[m];
operand3 = V[d];
V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

A64

int8x8_t vcopy_lane_s8 (int8x8_t a, const int lane1, int8x8_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.B[lane1],Vn.B[lane2]

Argument Preparation

a → Vd.8B

0 <= lane1 <= 7

b → Vn.8B

0 <= lane2 <= 7

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A64
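
Going by the instruction form INS Vd.B[lane1],Vn.B[lane2] and the argument list, the intrinsic reads lane lane2 of b and writes it into lane lane1 of a, leaving the other lanes of a unchanged. The following is a minimal scalar sketch of that behavior in portable C. The type s8x8_model and the functions copy_lane_s8_model and copy_lane_s8_self_check are hypothetical names for illustration, not part of arm_neon.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for int8x8_t: eight 8-bit lanes. */
typedef struct { int8_t lane[8]; } s8x8_model;

/* Scalar model: write lane lane2 of b into lane lane1 of a. */
s8x8_model copy_lane_s8_model(s8x8_model a, int lane1, s8x8_model b, int lane2) {
    assert(lane1 >= 0 && lane1 <= 7);   /* destination lane range */
    assert(lane2 >= 0 && lane2 <= 7);   /* source lane range */
    a.lane[lane1] = b.lane[lane2];      /* a is a by-value copy of the input */
    return a;
}

/* Copies lane 6 of b into lane 1 of a and checks the neighbors. */
void copy_lane_s8_self_check(void) {
    s8x8_model a = {{0, 1, 2, 3, 4, 5, 6, 7}};
    s8x8_model b = {{10, 11, 12, 13, 14, 15, 16, 17}};
    s8x8_model r = copy_lane_s8_model(a, 1, b, 6);
    assert(r.lane[1] == 16);
    assert(r.lane[0] == 0 && r.lane[2] == 2 && r.lane[7] == 7);
}
```

In the real intrinsic both lane arguments are declared const int and must be integer constant expressions; the model checks the ranges at run time instead.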

int8x16_t vcopyq_lane_s8 (int8x16_t a, const int lane1, int8x8_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.B[lane1],Vn.B[lane2]

Argument Preparation

a → Vd.16B

0 <= lane1 <= 15

b → Vn.8B

0 <= lane2 <= 7

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A64

int16x4_t vcopy_lane_s16 (int16x4_t a, const int lane1, int16x4_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.H[lane1],Vn.H[lane2]

Argument Preparation

a → Vd.4H

0 <= lane1 <= 3

b → Vn.4H

0 <= lane2 <= 3

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A64

int16x8_t vcopyq_lane_s16 (int16x8_t a, const int lane1, int16x4_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.H[lane1],Vn.H[lane2]

Argument Preparation

a → Vd.8H

0 <= lane1 <= 7

b → Vn.4H

0 <= lane2 <= 3

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A64

int32x2_t vcopy_lane_s32 (int32x2_t a, const int lane1, int32x2_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.S[lane1],Vn.S[lane2]

Argument Preparation

a → Vd.2S

0 <= lane1 <= 1

b → Vn.2S

0 <= lane2 <= 1

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A64

int32x4_t vcopyq_lane_s32 (int32x4_t a, const int lane1, int32x2_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.S[lane1],Vn.S[lane2]

Argument Preparation

a → Vd.4S

0 <= lane1 <= 3

b → Vn.2S

0 <= lane2 <= 1

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A64

int64x1_t vcopy_lane_s64 (int64x1_t a, const int lane1, int64x1_t b, const int lane2)Duplicate vector element to vector or scalar

Description

Duplicate vector element to vector or scalar. This instruction duplicates the vector element at the specified element index in the source SIMD&FP register into a scalar or each element in a vector, and writes the result to the destination SIMD&FP register.

A64 Instruction

DUP Dd,Vn.D[lane2]

Argument Preparation

a → UNUSED

0 <= lane1 <= 0

b → Vn.1D

0 <= lane2 <= 0

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(datasize) result;
bits(esize) element;

element = Elem[operand, index, esize];
for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64
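
With a single 64-bit lane on each side, lane1 and lane2 can only be 0, and the Argument Preparation marks a as UNUSED, so the result is simply the sole lane of b. A minimal scalar sketch of that degenerate case follows; s64x1_model, copy_lane_s64_model and copy_lane_s64_self_check are hypothetical names for illustration, not part of arm_neon.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for int64x1_t: one 64-bit lane. */
typedef struct { int64_t lane[1]; } s64x1_model;

/* Scalar model: the result is lane lane2 of b; a does not contribute. */
s64x1_model copy_lane_s64_model(s64x1_model a, int lane1, s64x1_model b, int lane2) {
    s64x1_model r;
    (void)a;                 /* listed as UNUSED in Argument Preparation */
    assert(lane1 == 0);      /* only one destination lane exists */
    assert(lane2 == 0);      /* only one source lane exists */
    r.lane[0] = b.lane[lane2];
    return r;
}

/* The output equals b regardless of what a holds. */
void copy_lane_s64_self_check(void) {
    s64x1_model a = {{-1}};
    s64x1_model b = {{42}};
    assert(copy_lane_s64_model(a, 0, b, 0).lane[0] == 42);
}
```

This explains why the entry lists DUP Dd,Vn.D[lane2] rather than INS: replacing the only lane of a is the same as duplicating the source lane into a fresh scalar.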

int64x2_t vcopyq_lane_s64 (int64x2_t a, const int lane1, int64x1_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.D[lane1],Vn.D[lane2]

Argument Preparation

a → Vd.2D

0 <= lane1 <= 1

b → Vn.1D

0 <= lane2 <= 0

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A64

uint8x8_t vcopy_lane_u8 (uint8x8_t a, const int lane1, uint8x8_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.B[lane1],Vn.B[lane2]

Argument Preparation

a → Vd.8B

0 <= lane1 <= 7

b → Vn.8B

0 <= lane2 <= 7

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A64

uint8x16_t vcopyq_lane_u8 (uint8x16_t a, const int lane1, uint8x8_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.B[lane1],Vn.B[lane2]

Argument Preparation

a → Vd.16B

0 <= lane1 <= 15

b → Vn.8B

0 <= lane2 <= 7

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A64

uint16x4_t vcopy_lane_u16 (uint16x4_t a, const int lane1, uint16x4_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.H[lane1],Vn.H[lane2]

Argument Preparation

a → Vd.4H

0 <= lane1 <= 3

b → Vn.4H

0 <= lane2 <= 3

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A64

uint16x8_t vcopyq_lane_u16 (uint16x8_t a, const int lane1, uint16x4_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.H[lane1],Vn.H[lane2]

Argument Preparation

a → Vd.8H

0 <= lane1 <= 7

b → Vn.4H

0 <= lane2 <= 3

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A64

uint32x2_t vcopy_lane_u32 (uint32x2_t a, const int lane1, uint32x2_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.S[lane1],Vn.S[lane2]

Argument Preparation

a → Vd.2S

0 <= lane1 <= 1

b → Vn.2S

0 <= lane2 <= 1

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A64

uint32x4_t vcopyq_lane_u32 (uint32x4_t a, const int lane1, uint32x2_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.S[lane1],Vn.S[lane2]

Argument Preparation

a → Vd.4S

0 <= lane1 <= 3

b → Vn.2S

0 <= lane2 <= 1

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A64

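uint64x1_t vcopy_lane_u64 (uint64x1_t a, const int lane1, uint64x1_t b, const int lane2)Duplicate vector element to vector or scalar

Description

Duplicate vector element to vector or scalar. This instruction duplicates the vector element at the specified element index in the source SIMD&FP register into a scalar or each element in a vector, and writes the result to the destination SIMD&FP register.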

A64 Instruction

DUP Dd,Vn.D[lane2]

Argument Preparation

a → UNUSED

0 <= lane1 <= 0

b → Vn.1D

0 <= lane2 <= 0

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(datasize) result;
bits(esize) element;

element = Elem[operand, index, esize];
for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

uint64x2_t vcopyq_lane_u64 (uint64x2_t a, const int lane1, uint64x1_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.D[lane1],Vn.D[lane2]

Argument Preparation

a → Vd.2D

0 <= lane1 <= 1

b → Vn.1D

0 <= lane2 <= 0

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A64

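poly64x1_t vcopy_lane_p64 (poly64x1_t a, const int lane1, poly64x1_t b, const int lane2)Duplicate vector element to vector or scalar

Description

Duplicate vector element to vector or scalar. This instruction duplicates the vector element at the specified element index in the source SIMD&FP register into a scalar or each element in a vector, and writes the result to the destination SIMD&FP register.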

A64 Instruction

DUP Dd,Vn.D[lane2]

Argument Preparation

a → UNUSED

0 <= lane1 <= 0

b → Vn.1D

0 <= lane2 <= 0

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(datasize) result;
bits(esize) element;

element = Elem[operand, index, esize];
for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A32/A64

poly64x2_t vcopyq_lane_p64 (poly64x2_t a, const int lane1, poly64x1_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.D[lane1],Vn.D[lane2]

Argument Preparation

a → Vd.2D

0 <= lane1 <= 1

b → Vn.1D

0 <= lane2 <= 0

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A32/A64

float32x2_t vcopy_lane_f32 (float32x2_t a, const int lane1, float32x2_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.S[lane1],Vn.S[lane2]

Argument Preparation

a → Vd.2S

0 <= lane1 <= 1

b → Vn.2S

0 <= lane2 <= 1

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A64

float32x4_t vcopyq_lane_f32 (float32x4_t a, const int lane1, float32x2_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.S[lane1],Vn.S[lane2]

Argument Preparation

a → Vd.4S

0 <= lane1 <= 3

b → Vn.2S

0 <= lane2 <= 1

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A64

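float64x1_t vcopy_lane_f64 (float64x1_t a, const int lane1, float64x1_t b, const int lane2)Duplicate vector element to vector or scalar

Description

Duplicate vector element to vector or scalar. This instruction duplicates the vector element at the specified element index in the source SIMD&FP register into a scalar or each element in a vector, and writes the result to the destination SIMD&FP register.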

A64 Instruction

DUP Dd,Vn.D[lane2]

Argument Preparation

a → UNUSED

0 <= lane1 <= 0

b → Vn.1D

0 <= lane2 <= 0

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(datasize) result;
bits(esize) element;

element = Elem[operand, index, esize];
for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

float64x2_t vcopyq_lane_f64 (float64x2_t a, const int lane1, float64x1_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.D[lane1],Vn.D[lane2]

Argument Preparation

a → Vd.2D

0 <= lane1 <= 1

b → Vn.1D

0 <= lane2 <= 0

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A64

poly8x8_t vcopy_lane_p8 (poly8x8_t a, const int lane1, poly8x8_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.B[lane1],Vn.B[lane2]

Argument Preparation

a → Vd.8B

0 <= lane1 <= 7

b → Vn.8B

0 <= lane2 <= 7

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A64

poly8x16_t vcopyq_lane_p8 (poly8x16_t a, const int lane1, poly8x8_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.B[lane1],Vn.B[lane2]

Argument Preparation

a → Vd.16B

0 <= lane1 <= 15

b → Vn.8B

0 <= lane2 <= 7

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A64

poly16x4_t vcopy_lane_p16 (poly16x4_t a, const int lane1, poly16x4_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.H[lane1],Vn.H[lane2]

Argument Preparation

a → Vd.4H

0 <= lane1 <= 3

b → Vn.4H

0 <= lane2 <= 3

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A64

poly16x8_t vcopyq_lane_p16 (poly16x8_t a, const int lane1, poly16x4_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.H[lane1],Vn.H[lane2]

Argument Preparation

a → Vd.8H

0 <= lane1 <= 7

b → Vn.4H

0 <= lane2 <= 3

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A64

int8x8_t vcopy_laneq_s8 (int8x8_t a, const int lane1, int8x16_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.B[lane1],Vn.B[lane2]

Argument Preparation

a → Vd.8B

0 <= lane1 <= 7

b → Vn.16B

0 <= lane2 <= 15

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

A64
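
The laneq form differs from the lane form only in the source: b is a 128-bit vector, so lane2 ranges over 16 byte lanes while lane1 still indexes the 8 lanes of a. A minimal scalar sketch under that reading follows; s8x8_model, s8x16_model, copy_laneq_s8_model and copy_laneq_s8_self_check are hypothetical names for illustration, not part of arm_neon.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for int8x8_t and int8x16_t. */
typedef struct { int8_t lane[8]; } s8x8_model;
typedef struct { int8_t lane[16]; } s8x16_model;

/* Scalar model: write lane lane2 of the 16-lane b into lane lane1 of a. */
s8x8_model copy_laneq_s8_model(s8x8_model a, int lane1, s8x16_model b, int lane2) {
    assert(lane1 >= 0 && lane1 <= 7);    /* destination has 8 lanes */
    assert(lane2 >= 0 && lane2 <= 15);   /* source has 16 lanes */
    a.lane[lane1] = b.lane[lane2];
    return a;
}

/* Pulls a lane from the upper half of b, which the lane form cannot reach. */
void copy_laneq_s8_self_check(void) {
    s8x8_model a = {{0, 0, 0, 0, 0, 0, 0, 0}};
    s8x16_model b = {{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}};
    s8x8_model r = copy_laneq_s8_model(a, 3, b, 12);
    assert(r.lane[3] == 12);
    assert(r.lane[2] == 0 && r.lane[4] == 0);
}
```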

int8x16_t vcopyq_laneq_s8 (int8x16_t a, const int lane1, int8x16_t b, const int lane2)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.B[lane1],Vn.B[lane2]

Argument Preparation

a → Vd.16B 

0 <= lane1 <= 15

b → Vn.16B

0 <= lane2 <= 15

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

int16x4_t vcopy_laneq_s16 (int16x4_t a, const int lane1, int16x8_t b, const int lane2)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.H[lane1],Vn.H[lane2]

Argument Preparation

a → Vd.4H 

0 <= lane1 <= 3

b → Vn.8H

0 <= lane2 <= 7

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

int16x8_t vcopyq_laneq_s16 (int16x8_t a, const int lane1, int16x8_t b, const int lane2)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.H[lane1],Vn.H[lane2]

Argument Preparation

a → Vd.8H 

0 <= lane1 <= 7

b → Vn.8H

0 <= lane2 <= 7

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

int32x2_t vcopy_laneq_s32 (int32x2_t a, const int lane1, int32x4_t b, const int lane2)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.S[lane1],Vn.S[lane2]

Argument Preparation

a → Vd.2S 

0 <= lane1 <= 1

b → Vn.4S

0 <= lane2 <= 3

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

int32x4_t vcopyq_laneq_s32 (int32x4_t a, const int lane1, int32x4_t b, const int lane2)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.S[lane1],Vn.S[lane2]

Argument Preparation

a → Vd.4S 

0 <= lane1 <= 3

b → Vn.4S

0 <= lane2 <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

int64x1_t vcopy_laneq_s64 (int64x1_t a, const int lane1, int64x2_t b, const int lane2)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Dd,Vn.D[lane2]

Argument Preparation

a → UNUSED 

0 <= lane1 <= 0

b → Vn.2D

0 <= lane2 <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64
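
Because the destination has a single 64-bit lane, `a` is listed as UNUSED and the entry maps to `DUP Dd,Vn.D[lane2]`. A minimal scalar sketch of that behaviour follows; the struct types and `model_vcopy_laneq_s64` are hypothetical names introduced only for illustration, not part of `arm_neon.h`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar stand-ins for int64x1_t and int64x2_t. */
typedef struct { int64_t lane[1]; } model_int64x1;
typedef struct { int64_t lane[2]; } model_int64x2;

/* Model of vcopy_laneq_s64: the single result lane comes from b.lane[lane2]. */
model_int64x1 model_vcopy_laneq_s64(model_int64x1 a, int lane1,
                                    model_int64x2 b, int lane2)
{
    (void)a;                           /* UNUSED: its only lane is overwritten */
    assert(lane1 == 0);                /* 0 <= lane1 <= 0 */
    assert(lane2 >= 0 && lane2 <= 1);  /* 0 <= lane2 <= 1 */
    model_int64x1 r = {{ b.lane[lane2] }};
    return r;
}
```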

int64x2_t vcopyq_laneq_s64 (int64x2_t a, const int lane1, int64x2_t b, const int lane2)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[lane1],Vn.D[lane2]

Argument Preparation

a → Vd.2D 

0 <= lane1 <= 1

b → Vn.2D

0 <= lane2 <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

uint8x8_t vcopy_laneq_u8 (uint8x8_t a, const int lane1, uint8x16_t b, const int lane2)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.B[lane1],Vn.B[lane2]

Argument Preparation

a → Vd.8B 

0 <= lane1 <= 7

b → Vn.16B

0 <= lane2 <= 15

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

uint8x16_t vcopyq_laneq_u8 (uint8x16_t a, const int lane1, uint8x16_t b, const int lane2)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.B[lane1],Vn.B[lane2]

Argument Preparation

a → Vd.16B 

0 <= lane1 <= 15

b → Vn.16B

0 <= lane2 <= 15

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

uint16x4_t vcopy_laneq_u16 (uint16x4_t a, const int lane1, uint16x8_t b, const int lane2)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.H[lane1],Vn.H[lane2]

Argument Preparation

a → Vd.4H 

0 <= lane1 <= 3

b → Vn.8H

0 <= lane2 <= 7

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

uint16x8_t vcopyq_laneq_u16 (uint16x8_t a, const int lane1, uint16x8_t b, const int lane2)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.H[lane1],Vn.H[lane2]

Argument Preparation

a → Vd.8H 

0 <= lane1 <= 7

b → Vn.8H

0 <= lane2 <= 7

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

uint32x2_t vcopy_laneq_u32 (uint32x2_t a, const int lane1, uint32x4_t b, const int lane2)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.S[lane1],Vn.S[lane2]

Argument Preparation

a → Vd.2S 

0 <= lane1 <= 1

b → Vn.4S

0 <= lane2 <= 3

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

uint32x4_t vcopyq_laneq_u32 (uint32x4_t a, const int lane1, uint32x4_t b, const int lane2)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.S[lane1],Vn.S[lane2]

Argument Preparation

a → Vd.4S 

0 <= lane1 <= 3

b → Vn.4S

0 <= lane2 <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

uint64x1_t vcopy_laneq_u64 (uint64x1_t a, const int lane1, uint64x2_t b, const int lane2)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Dd,Vn.D[lane2]

Argument Preparation

a → UNUSED 

0 <= lane1 <= 0

b → Vn.2D

0 <= lane2 <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

uint64x2_t vcopyq_laneq_u64 (uint64x2_t a, const int lane1, uint64x2_t b, const int lane2)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[lane1],Vn.D[lane2]

Argument Preparation

a → Vd.2D 

0 <= lane1 <= 1

b → Vn.2D

0 <= lane2 <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

poly64x1_t vcopy_laneq_p64 (poly64x1_t a, const int lane1, poly64x2_t b, const int lane2)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Dd,Vn.D[lane2]

Argument Preparation

a → UNUSED 

0 <= lane1 <= 0

b → Vn.2D

0 <= lane2 <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A32/A64

poly64x2_t vcopyq_laneq_p64 (poly64x2_t a, const int lane1, poly64x2_t b, const int lane2)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[lane1],Vn.D[lane2]

Argument Preparation

a → Vd.2D 

0 <= lane1 <= 1

b → Vn.2D

0 <= lane2 <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A32/A64

float32x2_t vcopy_laneq_f32 (float32x2_t a, const int lane1, float32x4_t b, const int lane2)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.S[lane1],Vn.S[lane2]

Argument Preparation

a → Vd.2S 

0 <= lane1 <= 1

b → Vn.4S

0 <= lane2 <= 3

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

float32x4_t vcopyq_laneq_f32 (float32x4_t a, const int lane1, float32x4_t b, const int lane2)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.S[lane1],Vn.S[lane2]

Argument Preparation

a → Vd.4S 

0 <= lane1 <= 3

b → Vn.4S

0 <= lane2 <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

float64x1_t vcopy_laneq_f64 (float64x1_t a, const int lane1, float64x2_t b, const int lane2)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Dd,Vn.D[lane2]

Argument Preparation

a → UNUSED 

0 <= lane1 <= 0

b → Vn.2D

0 <= lane2 <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

float64x2_t vcopyq_laneq_f64 (float64x2_t a, const int lane1, float64x2_t b, const int lane2)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[lane1],Vn.D[lane2]

Argument Preparation

a → Vd.2D 

0 <= lane1 <= 1

b → Vn.2D

0 <= lane2 <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

poly8x8_t vcopy_laneq_p8 (poly8x8_t a, const int lane1, poly8x16_t b, const int lane2)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.B[lane1],Vn.B[lane2]

Argument Preparation

a → Vd.8B 

0 <= lane1 <= 7

b → Vn.16B

0 <= lane2 <= 15

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

poly8x16_t vcopyq_laneq_p8 (poly8x16_t a, const int lane1, poly8x16_t b, const int lane2)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.B[lane1],Vn.B[lane2]

Argument Preparation

a → Vd.16B 

0 <= lane1 <= 15

b → Vn.16B

0 <= lane2 <= 15

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

poly16x4_t vcopy_laneq_p16 (poly16x4_t a, const int lane1, poly16x8_t b, const int lane2)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.H[lane1],Vn.H[lane2]

Argument Preparation

a → Vd.4H 

0 <= lane1 <= 3

b → Vn.8H

0 <= lane2 <= 7

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

poly16x8_t vcopyq_laneq_p16 (poly16x8_t a, const int lane1, poly16x8_t b, const int lane2)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.H[lane1],Vn.H[lane2]

Argument Preparation

a → Vd.8H 

0 <= lane1 <= 7

b → Vn.8H

0 <= lane2 <= 7

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

int8x8_t vrbit_s8 (int8x8_t a)Reverse bit order

Description

Reverse Bit order (vector). This instruction reads each vector element from the source SIMD&FP register, reverses the bits of the element, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

RBIT Vd.8B,Vn.8B

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;
bits(esize) rev;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    for i = 0 to esize-1
        rev<esize-1-i> = element<i>;
    Elem[result, e, esize] = rev;

V[d] = result;

Supported architectures

A64
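
The Operation pseudocode above reverses the bits inside each 8-bit element independently; it does not reorder the elements themselves. A scalar sketch of the same loop, written in portable C so it can be compiled without `arm_neon.h`, is shown below. The function names `model_rbit8` and `model_vrbit_u8` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the bits of one byte: rev<esize-1-i> = element<i>, with esize = 8. */
uint8_t model_rbit8(uint8_t element)
{
    uint8_t rev = 0;
    for (int i = 0; i < 8; i++) {
        if (element & (1u << i))
            rev |= (uint8_t)(1u << (7 - i));
    }
    return rev;
}

/* Model of vrbit_u8 on a plain array of 8 lanes (elements = 8). */
void model_vrbit_u8(const uint8_t operand[8], uint8_t result[8])
{
    assert(operand && result);
    for (int e = 0; e < 8; e++)
        result[e] = model_rbit8(operand[e]);
}
```

For example, the byte 0x01 becomes 0x80 and 0xF0 becomes 0x0F, while each byte stays in its original lane.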

int8x16_t vrbitq_s8 (int8x16_t a)Reverse bit order

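Description

Reverse Bit order (vector). This instruction reads each vector element from the source SIMD&FP register, reverses the bits of the element, places the results into a vector, and writes the vector to the destination SIMD&FP register.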

A64 Instruction

RBIT Vd.16B,Vn.16B

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;
bits(esize) rev;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    for i = 0 to esize-1
        rev<esize-1-i> = element<i>;
    Elem[result, e, esize] = rev;

V[d] = result;

Supported architectures

A64

uint8x8_t vrbit_u8 (uint8x8_t a)Reverse bit order

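Description

Reverse Bit order (vector). This instruction reads each vector element from the source SIMD&FP register, reverses the bits of the element, places the results into a vector, and writes the vector to the destination SIMD&FP register.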

A64 Instruction

RBIT Vd.8B,Vn.8B

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;
bits(esize) rev;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    for i = 0 to esize-1
        rev<esize-1-i> = element<i>;
    Elem[result, e, esize] = rev;

V[d] = result;

Supported architectures

A64

uint8x16_t vrbitq_u8 (uint8x16_t a)Reverse bit order

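Description

Reverse Bit order (vector). This instruction reads each vector element from the source SIMD&FP register, reverses the bits of the element, places the results into a vector, and writes the vector to the destination SIMD&FP register.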

A64 Instruction

RBIT Vd.16B,Vn.16B

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;
bits(esize) rev;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    for i = 0 to esize-1
        rev<esize-1-i> = element<i>;
    Elem[result, e, esize] = rev;

V[d] = result;

Supported architectures

A64

poly8x8_t vrbit_p8 (poly8x8_t a)Reverse bit order

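Description

Reverse Bit order (vector). This instruction reads each vector element from the source SIMD&FP register, reverses the bits of the element, places the results into a vector, and writes the vector to the destination SIMD&FP register.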

A64 Instruction

RBIT Vd.8B,Vn.8B

Argument Preparation

a → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;
bits(esize) rev;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    for i = 0 to esize-1
        rev<esize-1-i> = element<i>;
    Elem[result, e, esize] = rev;

V[d] = result;

Supported architectures

A64

poly8x16_t vrbitq_p8 (poly8x16_t a)Reverse bit order

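Description

Reverse Bit order (vector). This instruction reads each vector element from the source SIMD&FP register, reverses the bits of the element, places the results into a vector, and writes the vector to the destination SIMD&FP register.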

A64 Instruction

RBIT Vd.16B,Vn.16B

Argument Preparation

a → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;
bits(esize) rev;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    for i = 0 to esize-1
        rev<esize-1-i> = element<i>;
    Elem[result, e, esize] = rev;

V[d] = result;

Supported architectures

A64

int8x8_t vcreate_s8 (uint64_t a)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[0],Xn

Argument Preparation

a → Xn

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64
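
Since the 64-bit argument is inserted as `Vd.D[0]` and the result is then viewed as `Vd.8B`, the bytes of `a` become the lanes of the result. The sketch below models this for the unsigned 8-bit case, assuming the usual register layout in which lane 0 occupies the least significant 8 bits. `model_uint8x8`, `model_vcreate_u8` and `model_get_lane_u8` are hypothetical names used only here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for uint8x8_t. */
typedef struct { uint8_t lane[8]; } model_uint8x8;

/* Model of vcreate_u8: lane e holds bits [8*e+7 : 8*e] of a. */
model_uint8x8 model_vcreate_u8(uint64_t a)
{
    model_uint8x8 r;
    for (int e = 0; e < 8; e++)
        r.lane[e] = (uint8_t)(a >> (8 * e));
    return r;
}

/* Bounds-checked lane read, used to inspect the modelled vector. */
uint8_t model_get_lane_u8(model_uint8x8 v, int lane)
{
    assert(lane >= 0 && lane <= 7);
    return v.lane[lane];
}
```

Under that assumption, the pattern 0x0807060504030201 yields lanes 1, 2, ..., 8 in order.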

int16x4_t vcreate_s16 (uint64_t a)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[0],Xn

Argument Preparation

a → Xn

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vcreate_s32 (uint64_t a)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[0],Xn

Argument Preparation

a → Xn

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vcreate_s64 (uint64_t a)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[0],Xn

Argument Preparation

a → Xn

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vcreate_u8 (uint64_t a)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[0],Xn

Argument Preparation

a → Xn

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vcreate_u16 (uint64_t a)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[0],Xn

Argument Preparation

a → Xn

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vcreate_u32 (uint64_t a)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[0],Xn

Argument Preparation

a → Xn

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vcreate_u64 (uint64_t a)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[0],Xn

Argument Preparation

a → Xn

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly64x1_t vcreate_p64 (uint64_t a)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[0],Xn

Argument Preparation

a → Xn

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A32/A64

float16x4_t vcreate_f16 (uint64_t a)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[0],Xn

Argument Preparation

a → Xn

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vcreate_f32 (uint64_t a)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[0],Xn

Argument Preparation

a → Xn

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64
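
For the floating-point variant the argument is still a raw 64-bit pattern: the bits are reinterpreted, not converted. A scalar sketch is given below, assuming 32-bit IEEE 754 `float` and lane 0 in the least significant 32 bits; `model_float32x2` and `model_vcreate_f32` are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar stand-in for float32x2_t. */
typedef struct { float lane[2]; } model_float32x2;

/* Model of vcreate_f32: lane e takes bits [32*e+31 : 32*e] of a, unchanged. */
model_float32x2 model_vcreate_f32(uint64_t a)
{
    model_float32x2 r;
    assert(sizeof(float) == sizeof(uint32_t));  /* assumption: 32-bit float */
    for (int e = 0; e < 2; e++) {
        uint32_t bits = (uint32_t)(a >> (32 * e));
        memcpy(&r.lane[e], &bits, sizeof bits); /* reinterpret, do not convert */
    }
    return r;
}
```

With IEEE 754 single precision, 0x3F800000 encodes 1.0f and 0x40000000 encodes 2.0f, so the pattern 0x400000003F800000 gives lane 0 = 1.0f and lane 1 = 2.0f.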

poly8x8_t vcreate_p8 (uint64_t a)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[0],Xn

Argument Preparation

a → Xn

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly16x4_t vcreate_p16 (uint64_t a)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[0],Xn

Argument Preparation

a → Xn

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

float64x1_t vcreate_f64 (uint64_t a)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[0],Xn

Argument Preparation

a → Xn

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

int8x8_t vdup_n_s8 (int8_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8B,rn

Argument Preparation

value → rn

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int8x16_t vdupq_n_s8 (int8_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.16B,rn

Argument Preparation

value → rn

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64
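
The Operation loop for `DUP` writes the same scalar into every element. A portable scalar sketch of that loop covers both the 8-lane (`Vd.8B`) and 16-lane (`Vd.16B`) forms; `model_vdup_n_s8` and its `elements` parameter are hypothetical and stand in for the fixed lane counts of `vdup_n_s8` and `vdupq_n_s8`.

```c
#include <assert.h>
#include <stdint.h>

/* Model of vdup_n_s8 (elements = 8) and vdupq_n_s8 (elements = 16). */
void model_vdup_n_s8(int8_t value, int8_t result[], int elements)
{
    assert(result);
    assert(elements == 8 || elements == 16);
    for (int e = 0; e < elements; e++)
        result[e] = value;  /* Elem[result, e, esize] = element */
}
```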

int16x4_t vdup_n_s16 (int16_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4H,rn

Argument Preparation

value → rn

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vdupq_n_s16 (int16_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8H,rn

Argument Preparation

value → rn

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vdup_n_s32 (int32_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2S,rn

Argument Preparation

value → rn

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vdupq_n_s32 (int32_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4S,rn

Argument Preparation

value → rn

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vdup_n_s64 (int64_t value)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Dd.D[0],xn

Argument Preparation

value → rn

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vdupq_n_s64 (int64_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2D,rn

Argument Preparation

value → rn

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vdup_n_u8 (uint8_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8B,rn

Argument Preparation

value → rn

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vdupq_n_u8 (uint8_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.16B,rn

Argument Preparation

value → rn

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vdup_n_u16 (uint16_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4H,rn

Argument Preparation

value → rn

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vdupq_n_u16 (uint16_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8H,rn

Argument Preparation

value → rn

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vdup_n_u32 (uint32_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2S,rn

Argument Preparation

value → rn

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vdupq_n_u32 (uint32_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4S,rn

Argument Preparation

value → rn

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vdup_n_u64 (uint64_t value)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Dd.D[0],xn

Argument Preparation

value → rn

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vdupq_n_u64 (uint64_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2D,rn

Argument Preparation

value → rn

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly64x1_t vdup_n_p64 (poly64_t value)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Dd.D[0],xn

Argument Preparation

value → rn

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A32/A64

poly64x2_t vdupq_n_p64 (poly64_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2D,rn

Argument Preparation

value → rn

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A32/A64

float32x2_t vdup_n_f32 (float32_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2S,rn

Argument Preparation

value → rn

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

float32x4_t vdupq_n_f32 (float32_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4S,rn

Argument Preparation

value → rn

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64
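
A common pattern with the float form is to broadcast a scalar once and reuse it across a loop. The following is a minimal sketch under that assumption; scale_f32 is a hypothetical helper, and vld1q_f32, vmulq_f32 and vst1q_f32 are other NEON intrinsics used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: multiply n floats in place by `factor`. */
static void scale_f32(float *data, size_t n, float factor)
{
    assert(data != NULL || n == 0);
    size_t i = 0;
#if defined(__ARM_NEON)
    const float32x4_t vfactor = vdupq_n_f32(factor);   /* all four lanes = factor */
    for (; i + 4 <= n; i += 4) {
        float32x4_t v = vld1q_f32(data + i);
        vst1q_f32(data + i, vmulq_f32(v, vfactor));
    }
#endif
    for (; i < n; ++i)        /* scalar tail, or the whole array without NEON */
        data[i] *= factor;
}
```

With six input values, the NEON path would handle the first four in one vector operation and leave the last two to the scalar tail.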

poly8x8_t vdup_n_p8 (poly8_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8B,rn

Argument Preparation

value → rn

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly8x16_t vdupq_n_p8 (poly8_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.16B,rn

Argument Preparation

value → rn

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly16x4_t vdup_n_p16 (poly16_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4H,rn

Argument Preparation

value → rn

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly16x8_t vdupq_n_p16 (poly16_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8H,rn

Argument Preparation

value → rn

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

float64x1_t vdup_n_f64 (float64_t value)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Dd.D[0],xn

Argument Preparation

value → rn

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

float64x2_t vdupq_n_f64 (float64_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2D,rn

Argument Preparation

value → rn

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

int8x8_t vmov_n_s8 (int8_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8B,rn

Argument Preparation

value → rn

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64
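
The entry above lists the same instruction (DUP Vd.8B,rn) as vdup_n_s8, so the two spellings should produce identical lanes. A small sketch to check that, using hypothetical helper names and an assumed scalar stand-in where NEON is not available:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: store the eight lanes of vmov_n_s8(value). */
static void lanes_of_vmov_n_s8(int8_t value, int8_t out[8])
{
    assert(out != NULL);
#if defined(__ARM_NEON)
    vst1_s8(out, vmov_n_s8(value));
#else
    for (int e = 0; e < 8; ++e) out[e] = value;   /* scalar stand-in */
#endif
}

/* Hypothetical helper: store the eight lanes of vdup_n_s8(value). */
static void lanes_of_vdup_n_s8(int8_t value, int8_t out[8])
{
    assert(out != NULL);
#if defined(__ARM_NEON)
    vst1_s8(out, vdup_n_s8(value));
#else
    for (int e = 0; e < 8; ++e) out[e] = value;   /* scalar stand-in */
#endif
}
```

Comparing the two output arrays element by element is enough to confirm the equivalence on a given toolchain.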

int8x16_t vmovq_n_s8 (int8_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.16B,rn

Argument Preparation

value → rn

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vmov_n_s16 (int16_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4H,rn

Argument Preparation

value → rn

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vmovq_n_s16 (int16_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8H,rn

Argument Preparation

value → rn

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vmov_n_s32 (int32_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2S,rn

Argument Preparation

value → rn

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vmovq_n_s32 (int32_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4S,rn

Argument Preparation

value → rn

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vmov_n_s64 (int64_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,rn

Argument Preparation

value → rn

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vmovq_n_s64 (int64_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2D,rn

Argument Preparation

value → rn

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vmov_n_u8 (uint8_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8B,rn

Argument Preparation

value → rn

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vmovq_n_u8 (uint8_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.16B,rn

Argument Preparation

value → rn

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vmov_n_u16 (uint16_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4H,rn

Argument Preparation

value → rn

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vmovq_n_u16 (uint16_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8H,rn

Argument Preparation

value → rn

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vmov_n_u32 (uint32_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2S,rn

Argument Preparation

value → rn

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vmovq_n_u32 (uint32_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4S,rn

Argument Preparation

value → rn

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vmov_n_u64 (uint64_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,rn

Argument Preparation

value → rn

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vmovq_n_u64 (uint64_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2D,rn

Argument Preparation

value → rn

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vmov_n_f32 (float32_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2S,rn

Argument Preparation

value → rn

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

float32x4_t vmovq_n_f32 (float32_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4S,rn

Argument Preparation

value → rn

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly8x8_t vmov_n_p8 (poly8_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8B,rn

Argument Preparation

value → rn

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly8x16_t vmovq_n_p8 (poly8_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.16B,rn

Argument Preparation

value → rn

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly16x4_t vmov_n_p16 (poly16_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4H,rn

Argument Preparation

value → rn

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly16x8_t vmovq_n_p16 (poly16_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8H,rn

Argument Preparation

value → rn

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

float64x1_t vmov_n_f64 (float64_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,rn

Argument Preparation

value → rn

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

float64x2_t vmovq_n_f64 (float64_t value)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2D,rn

Argument Preparation

value → rn

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

int8x8_t vdup_lane_s8 (int8x8_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8B,Vn.B[lane]

Argument Preparation

vec → Vn.8B 

0 <= lane <= 7

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64
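
For the lane form, the instruction DUP Vd.8B,Vn.B[lane] takes its source element from a vector rather than a general-purpose register. The C sketch below models that reading: one lane of an 8-byte vector is copied to every result lane, with lane restricted to 0 through 7. The names dup_lane_s8_model and splat_lane3_s8 are hypothetical; the literal 3 is used because the intrinsic needs a compile-time constant lane.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical scalar model: broadcast vec[lane] to all eight lanes. */
static void dup_lane_s8_model(const int8_t vec[8], int lane, int8_t result[8])
{
    assert(lane >= 0 && lane <= 7);
    const int8_t element = vec[lane];   /* read first so vec and result may alias */
    for (int e = 0; e < 8; ++e)
        result[e] = element;
}

/* Hypothetical wrapper: broadcast lane 3 with the intrinsic when available. */
static void splat_lane3_s8(const int8_t vec[8], int8_t out[8])
{
#if defined(__ARM_NEON)
    vst1_s8(out, vdup_lane_s8(vld1_s8(vec), 3));   /* lane must be a constant */
#else
    dup_lane_s8_model(vec, 3, out);
#endif
}
```

An out-of-range lane trips the assertion in the model; with the intrinsic, compilers normally reject such a value at build time.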

int8x16_t vdupq_lane_s8 (int8x8_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.16B,Vn.B[lane]

Argument Preparation

vec → Vn.8B 

0 <= lane <= 7

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vdup_lane_s16 (int16x4_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4H,Vn.H[lane]

Argument Preparation

vec → Vn.4H 

0 <= lane <= 3

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vdupq_lane_s16 (int16x4_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8H,Vn.H[lane]

Argument Preparation

vec → Vn.4H 

0 <= lane <= 3

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64
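
In the q form of the lane intrinsic, the source is still a 64-bit vector (four 16-bit lanes, so lane runs from 0 to 3) while the result has eight lanes. A sketch of that widening broadcast, with hypothetical helper names and an assumed scalar fallback:

```c
#include <assert.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical scalar model: the lane index is bounded by the 4-lane
   source, while the loop covers the 8-lane result. */
static void dupq_lane_s16_model(const int16_t vec[4], int lane, int16_t result[8])
{
    assert(lane >= 0 && lane <= 3);
    const int16_t element = vec[lane];
    for (int e = 0; e < 8; ++e)
        result[e] = element;
}

/* Hypothetical wrapper: broadcast lane 2 of a 4-lane vector to 8 lanes. */
static void splatq_lane2_s16(const int16_t vec[4], int16_t out[8])
{
#if defined(__ARM_NEON)
    vst1q_s16(out, vdupq_lane_s16(vld1_s16(vec), 2));   /* constant lane */
#else
    dupq_lane_s16_model(vec, 2, out);
#endif
}
```

The lane bound follows the source vector, not the result, which is why 3 remains the upper limit even though eight elements are written.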

int32x2_t vdup_lane_s32 (int32x2_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2S,Vn.S[lane]

Argument Preparation

vec → Vn.2S 

0 <= lane <= 1

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vdupq_lane_s32 (int32x2_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4S,Vn.S[lane]

Argument Preparation

vec → Vn.2S 

0 <= lane <= 1

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vdup_lane_s64 (int64x1_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Dd,Vn.D[lane]

Argument Preparation

vec → Vn.1D 

0 <= lane <= 0

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64
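
With a one-lane source and a one-lane result, the only valid lane is 0 and the operation reduces to a copy. A minimal sketch under that reading; the helper names are hypothetical and the scalar branch is an assumed stand-in for non-NEON builds.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical scalar model: lane can only be 0 for a one-lane source. */
static void dup_lane_s64_model(const int64_t vec[1], int lane, int64_t result[1])
{
    assert(lane == 0);
    result[0] = vec[lane];
}

/* Hypothetical wrapper around vdup_lane_s64 with the only legal lane. */
static void copy_lane0_s64(const int64_t vec[1], int64_t out[1])
{
#if defined(__ARM_NEON)
    vst1_s64(out, vdup_lane_s64(vld1_s64(vec), 0));
#else
    dup_lane_s64_model(vec, 0, out);
#endif
}
```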

int64x2_t vdupq_lane_s64 (int64x1_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2D,Vn.D[lane]

Argument Preparation

vec → Vn.1D 

0 <= lane <= 0

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vdup_lane_u8 (uint8x8_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8B,Vn.B[lane]

Argument Preparation

vec → Vn.8B 

0 <= lane <= 7

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vdupq_lane_u8 (uint8x8_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.16B,Vn.B[lane]

Argument Preparation

vec → Vn.8B 

0 <= lane <= 7

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vdup_lane_u16 (uint16x4_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4H,Vn.H[lane]

Argument Preparation

vec → Vn.4H 

0 <= lane <= 3

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vdupq_lane_u16 (uint16x4_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8H,Vn.H[lane]

Argument Preparation

vec → Vn.4H 

0 <= lane <= 3

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vdup_lane_u32 (uint32x2_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2S,Vn.S[lane]

Argument Preparation

vec → Vn.2S 

0 <= lane <= 1

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vdupq_lane_u32 (uint32x2_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4S,Vn.S[lane]

Argument Preparation

vec → Vn.2S 

0 <= lane <= 1

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vdup_lane_u64 (uint64x1_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Dd,Vn.D[lane]

Argument Preparation

vec → Vn.1D 

0 <= lane <= 0

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vdupq_lane_u64 (uint64x1_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2D,Vn.D[lane]

Argument Preparation

vec → Vn.1D 

0 <= lane <= 0

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly64x1_t vdup_lane_p64 (poly64x1_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Dd,Vn.D[lane]

Argument Preparation

vec → Vn.1D 

0 <= lane <= 0

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A32/A64

poly64x2_t vdupq_lane_p64 (poly64x1_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2D,Vn.D[lane]

Argument Preparation

vec → Vn.1D 

0 <= lane <= 0

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A32/A64

float32x2_t vdup_lane_f32 (float32x2_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2S,Vn.S[lane]

Argument Preparation

vec → Vn.2S 

0 <= lane <= 1

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

float32x4_t vdupq_lane_f32 (float32x2_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4S,Vn.S[lane]

Argument Preparation

vec → Vn.2S 

0 <= lane <= 1

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly8x8_t vdup_lane_p8 (poly8x8_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8B,Vn.B[lane]

Argument Preparation

vec → Vn.8B 

0 <= lane <= 7

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly8x16_t vdupq_lane_p8 (poly8x8_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.16B,Vn.B[lane]

Argument Preparation

vec → Vn.8B 

0 <= lane <= 7

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly16x4_t vdup_lane_p16 (poly16x4_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4H,Vn.H[lane]

Argument Preparation

vec → Vn.4H 

0 <= lane <= 3

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly16x8_t vdupq_lane_p16 (poly16x4_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8H,Vn.H[lane]

Argument Preparation

vec → Vn.4H 

0 <= lane <= 3

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

float64x1_t vdup_lane_f64 (float64x1_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Dd,Vn.D[lane]

Argument Preparation

vec → Vn.1D 

0 <= lane <= 0

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

float64x2_t vdupq_lane_f64 (float64x1_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2D,Vn.D[lane]

Argument Preparation

vec → Vn.1D 

0 <= lane <= 0

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

int8x8_t vdup_laneq_s8 (int8x16_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8B,Vn.B[lane]

Argument Preparation

vec → Vn.16B 

0 <= lane <= 15

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64
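
The laneq form reverses the size relationship: the source is a 128-bit vector (sixteen 8-bit lanes, so lane runs from 0 to 15) and the result has eight lanes. The sketch below models that, with hypothetical helper names; the intrinsic path is compiled only for A64 targets with NEON, matching the architecture support listed above, and the scalar path is an assumed fallback elsewhere.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical scalar model: pick one of 16 source lanes, write 8 lanes. */
static void dup_laneq_s8_model(const int8_t vec[16], int lane, int8_t result[8])
{
    assert(lane >= 0 && lane <= 15);
    const int8_t element = vec[lane];
    for (int e = 0; e < 8; ++e)
        result[e] = element;
}

/* Hypothetical wrapper: broadcast lane 12 of a 16-lane vector to 8 lanes. */
static void splat_laneq12_s8(const int8_t vec[16], int8_t out[8])
{
#if defined(__aarch64__) && defined(__ARM_NEON)
    vst1_s8(out, vdup_laneq_s8(vld1q_s8(vec), 12));   /* constant lane */
#else
    dup_laneq_s8_model(vec, 12, out);
#endif
}
```

Lanes 8 through 15 are reachable here, which is not possible with the vdup_lane_s8 form and its 64-bit source.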

int8x16_t vdupq_laneq_s8 (int8x16_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.16B,Vn.B[lane]

Argument Preparation

vec → Vn.16B 

0 <= lane <= 15

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

int16x4_t vdup_laneq_s16 (int16x8_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4H,Vn.H[lane]

Argument Preparation

vec → Vn.8H 

0 <= lane <= 7

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

int16x8_t vdupq_laneq_s16 (int16x8_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8H,Vn.H[lane]

Argument Preparation

vec → Vn.8H 

0 <= lane <= 7

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

int32x2_t vdup_laneq_s32 (int32x4_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2S,Vn.S[lane]

Argument Preparation

vec → Vn.4S 

0 <= lane <= 3

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

int32x4_t vdupq_laneq_s32 (int32x4_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4S,Vn.S[lane]

Argument Preparation

vec → Vn.4S 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

int64x1_t vdup_laneq_s64 (int64x2_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Dd,Vn.D[lane]

Argument Preparation

vec → Vn.2D 

0 <= lane <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

int64x2_t vdupq_laneq_s64 (int64x2_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2D,Vn.D[lane]

Argument Preparation

vec → Vn.2D 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

uint8x8_t vdup_laneq_u8 (uint8x16_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8B,Vn.B[lane]

Argument Preparation

vec → Vn.16B 

0 <= lane <= 15

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

uint8x16_t vdupq_laneq_u8 (uint8x16_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.16B,Vn.B[lane]

Argument Preparation

vec → Vn.16B 

0 <= lane <= 15

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

uint16x4_t vdup_laneq_u16 (uint16x8_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4H,Vn.H[lane]

Argument Preparation

vec → Vn.8H 

0 <= lane <= 7

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

uint16x8_t vdupq_laneq_u16 (uint16x8_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8H,Vn.H[lane]

Argument Preparation

vec → Vn.8H 

0 <= lane <= 7

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

uint32x2_t vdup_laneq_u32 (uint32x4_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2S,Vn.S[lane]

Argument Preparation

vec → Vn.4S 

0 <= lane <= 3

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

uint32x4_t vdupq_laneq_u32 (uint32x4_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4S,Vn.S[lane]

Argument Preparation

vec → Vn.4S 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

uint64x1_t vdup_laneq_u64 (uint64x2_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Dd,Vn.D[lane]

Argument Preparation

vec → Vn.2D 

0 <= lane <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

uint64x2_t vdupq_laneq_u64 (uint64x2_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2D,Vn.D[lane]

Argument Preparation

vec → Vn.2D 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

poly64x1_t vdup_laneq_p64 (poly64x2_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Dd,Vn.D[lane]

Argument Preparation

vec → Vn.2D 

0 <= lane <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A32/A64

poly64x2_t vdupq_laneq_p64 (poly64x2_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2D,Vn.D[lane]

Argument Preparation

vec → Vn.2D 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A32/A64

float32x2_t vdup_laneq_f32 (float32x4_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2S,Vn.S[lane]

Argument Preparation

vec → Vn.4S 

0 <= lane <= 3

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

float32x4_t vdupq_laneq_f32 (float32x4_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4S,Vn.S[lane]

Argument Preparation

vec → Vn.4S 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64
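
One common use of vdupq_laneq_f32 is scaling a vector by a single coefficient held in another vector. The following is a minimal sketch under that assumption. The function name scale_by_lane1_f32 and the choice of lane 1 are illustrative only, and the non-AArch64 branch is a scalar stand-in rather than anything this reference prescribes.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* out[i] = x[i] * coeffs[1] for n elements; n must be a multiple of 4. */
void scale_by_lane1_f32(const float *x, const float coeffs[4], float *out, size_t n)
{
    assert(n % 4 == 0);
#if defined(__aarch64__)
    /* Broadcast lane 1 of the coefficient vector into all four lanes once. */
    float32x4_t k = vdupq_laneq_f32(vld1q_f32(coeffs), 1);
    for (size_t i = 0; i < n; i += 4)
        vst1q_f32(out + i, vmulq_f32(vld1q_f32(x + i), k));
#else
    for (size_t i = 0; i < n; ++i)
        out[i] = x[i] * coeffs[1];
#endif
}
```

Hoisting the broadcast out of the loop keeps the per-iteration work to one load, one multiply, and one store.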

poly8x8_t vdup_laneq_p8 (poly8x16_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8B,Vn.B[lane]

Argument Preparation

vec → Vn.16B 

0 <= lane <= 15

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

poly8x16_t vdupq_laneq_p8 (poly8x16_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.16B,Vn.B[lane]

Argument Preparation

vec → Vn.16B 

0 <= lane <= 15

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

poly16x4_t vdup_laneq_p16 (poly16x8_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.4H,Vn.H[lane]

Argument Preparation

vec → Vn.8H 

0 <= lane <= 7

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

poly16x8_t vdupq_laneq_p16 (poly16x8_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.8H,Vn.H[lane]

Argument Preparation

vec → Vn.8H 

0 <= lane <= 7

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

float64x1_t vdup_laneq_f64 (float64x2_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Dd,Vn.D[lane]

Argument Preparation

vec → Vn.2D 

0 <= lane <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

float64x2_t vdupq_laneq_f64 (float64x2_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.2D,Vn.D[lane]

Argument Preparation

vec → Vn.2D 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

int8x16_t vcombine_s8 (int8x8_t low, int8x8_t high)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

DUP Vd.1D,Vn.D[0]
INS Vd.D[1],Vm.D[0]

Argument Preparation

low → Vn.8B 

high → Vm.8B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64
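
The argument and result mapping above (low to the bottom 64 bits, high to the top 64 bits) can be sketched as follows. The wrapper name combine_s8 is hypothetical, and the memcpy fallback for targets without NEON is an assumed scalar equivalent, not part of this reference.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Join two 8-byte halves into one 16-byte vector: out = [low | high]. */
void combine_s8(const int8_t low[8], const int8_t high[8], int8_t out[16])
{
    assert(low != NULL && high != NULL && out != NULL);
#if defined(__ARM_NEON)
    int8x16_t r = vcombine_s8(vld1_s8(low), vld1_s8(high));
    vst1q_s8(out, r);
#else
    memcpy(out, low, 8);        /* result lanes 0..7  */
    memcpy(out + 8, high, 8);   /* result lanes 8..15 */
#endif
}
```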

int16x8_t vcombine_s16 (int16x4_t low, int16x4_t high)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

DUP Vd.1D,Vn.D[0]
INS Vd.D[1],Vm.D[0]

Argument Preparation

low → Vn.4H 

high → Vm.4H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vcombine_s32 (int32x2_t low, int32x2_t high)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

DUP Vd.1D,Vn.D[0]
INS Vd.D[1],Vm.D[0]

Argument Preparation

low → Vn.2S 

high → Vm.2S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vcombine_s64 (int64x1_t low, int64x1_t high)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

DUP Vd.1D,Vn.D[0]
INS Vd.D[1],Vm.D[0]

Argument Preparation

low → Vn.1D 

high → Vm.1D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vcombine_u8 (uint8x8_t low, uint8x8_t high)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

DUP Vd.1D,Vn.D[0]
INS Vd.D[1],Vm.D[0]

Argument Preparation

low → Vn.8B 

high → Vm.8B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vcombine_u16 (uint16x4_t low, uint16x4_t high)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

DUP Vd.1D,Vn.D[0]
INS Vd.D[1],Vm.D[0]

Argument Preparation

low → Vn.4H 

high → Vm.4H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vcombine_u32 (uint32x2_t low, uint32x2_t high)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

DUP Vd.1D,Vn.D[0]
INS Vd.D[1],Vm.D[0]

Argument Preparation

low → Vn.2S 

high → Vm.2S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vcombine_u64 (uint64x1_t low, uint64x1_t high)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

DUP Vd.1D,Vn.D[0]
INS Vd.D[1],Vm.D[0]

Argument Preparation

low → Vn.1D 

high → Vm.1D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly64x2_t vcombine_p64 (poly64x1_t low, poly64x1_t high)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

DUP Vd.1D,Vn.D[0]
INS Vd.D[1],Vm.D[0]

Argument Preparation

low → Vn.1D 

high → Vm.1D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A32/A64

float16x8_t vcombine_f16 (float16x4_t low, float16x4_t high)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

DUP Vd.1D,Vn.D[0]
INS Vd.D[1],Vm.D[0]

Argument Preparation

low → Vn.4H 

high → Vm.4H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

float32x4_t vcombine_f32 (float32x2_t low, float32x2_t high)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

DUP Vd.1D,Vn.D[0]
INS Vd.D[1],Vm.D[0]

Argument Preparation

low → Vn.2S 

high → Vm.2S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly8x16_t vcombine_p8 (poly8x8_t low, poly8x8_t high)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

DUP Vd.1D,Vn.D[0]
INS Vd.D[1],Vm.D[0]

Argument Preparation

low → Vn.8B 

high → Vm.8B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly16x8_t vcombine_p16 (poly16x4_t low, poly16x4_t high)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

DUP Vd.1D,Vn.D[0]
INS Vd.D[1],Vm.D[0]

Argument Preparation

low → Vn.4H 

high → Vm.4H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

float64x2_t vcombine_f64 (float64x1_t low, float64x1_t high)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

DUP Vd.1D,Vn.D[0]
INS Vd.D[1],Vm.D[0]

Argument Preparation

low → Vn.1D 

high → Vm.1D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

int8x8_t vget_high_s8 (int8x16_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[1]

Argument Preparation

a → Vn.16B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vget_high_s16 (int16x8_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[1]

Argument Preparation

a → Vn.8H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64
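
A typical reason to take the high half of a 128-bit vector is to widen it. The sketch below assumes that use case: it extracts lanes 4..7 with vget_high_s16 and sign-extends them with vmovl_s16. The name widen_high_s16 is illustrative, and the scalar branch is an assumed equivalent for targets without NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Sign-extend the upper four 16-bit lanes of src to 32 bits. */
void widen_high_s16(const int16_t src[8], int32_t out[4])
{
    assert(src != NULL && out != NULL);
#if defined(__ARM_NEON)
    int16x8_t v  = vld1q_s16(src);
    int16x4_t hi = vget_high_s16(v);   /* lanes 4..7 of v */
    vst1q_s32(out, vmovl_s16(hi));     /* widen each lane to int32_t */
#else
    for (int e = 0; e < 4; ++e)
        out[e] = (int32_t)src[4 + e];
#endif
}
```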

int32x2_t vget_high_s32 (int32x4_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[1]

Argument Preparation

a → Vn.4S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vget_high_s64 (int64x2_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[1]

Argument Preparation

a → Vn.2D

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vget_high_u8 (uint8x16_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[1]

Argument Preparation

a → Vn.16B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vget_high_u16 (uint16x8_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[1]

Argument Preparation

a → Vn.8H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vget_high_u32 (uint32x4_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[1]

Argument Preparation

a → Vn.4S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vget_high_u64 (uint64x2_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[1]

Argument Preparation

a → Vn.2D

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly64x1_t vget_high_p64 (poly64x2_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[1]

Argument Preparation

a → Vn.2D

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A32/A64

float16x4_t vget_high_f16 (float16x8_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[1]

Argument Preparation

a → Vn.8H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

float32x2_t vget_high_f32 (float32x4_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[1]

Argument Preparation

a → Vn.4S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly8x8_t vget_high_p8 (poly8x16_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[1]

Argument Preparation

a → Vn.16B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly16x4_t vget_high_p16 (poly16x8_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[1]

Argument Preparation

a → Vn.8H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

float64x1_t vget_high_f64 (float64x2_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[1]

Argument Preparation

a → Vn.2D

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

int8x8_t vget_low_s8 (int8x16_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[0]

Argument Preparation

a → Vn.16B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64
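
To show how the low and high halves relate, this sketch splits one 128-bit vector into its two 64-bit halves using vget_low_s8 together with vget_high_s8. The helper name split_s8 is hypothetical, and the memcpy branch is an assumed scalar equivalent for targets without NEON.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* lo receives lanes 0..7 of src, hi receives lanes 8..15. */
void split_s8(const int8_t src[16], int8_t lo[8], int8_t hi[8])
{
    assert(src != NULL && lo != NULL && hi != NULL);
#if defined(__ARM_NEON)
    int8x16_t v = vld1q_s8(src);
    vst1_s8(lo, vget_low_s8(v));
    vst1_s8(hi, vget_high_s8(v));
#else
    memcpy(lo, src, 8);
    memcpy(hi, src + 8, 8);
#endif
}
```

Passing the two halves back through vcombine_s8 in the order (lo, hi) would reproduce the original vector.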

int16x4_t vget_low_s16 (int16x8_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[0]

Argument Preparation

a → Vn.8H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vget_low_s32 (int32x4_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[0]

Argument Preparation

a → Vn.4S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vget_low_s64 (int64x2_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[0]

Argument Preparation

a → Vn.2D

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vget_low_u8 (uint8x16_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[0]

Argument Preparation

a → Vn.16B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vget_low_u16 (uint16x8_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[0]

Argument Preparation

a → Vn.8H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vget_low_u32 (uint32x4_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[0]

Argument Preparation

a → Vn.4S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vget_low_u64 (uint64x2_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[0]

Argument Preparation

a → Vn.2D

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly64x1_t vget_low_p64 (poly64x2_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[0]

Argument Preparation

a → Vn.2D

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A32/A64

float16x4_t vget_low_f16 (float16x8_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[0]

Argument Preparation

a → Vn.8H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

float32x2_t vget_low_f32 (float32x4_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[0]

Argument Preparation

a → Vn.4S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly8x8_t vget_low_p8 (poly8x16_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[0]

Argument Preparation

a → Vn.16B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly16x4_t vget_low_p16 (poly16x8_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[0]

Argument Preparation

a → Vn.8H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

float64x1_t vget_low_f64 (float64x2_t a)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Vd.1D,Vn.D[0]

Argument Preparation

a → Vn.2D

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

int8_t vdupb_lane_s8 (int8x8_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Bd,Vn.B[lane]

Argument Preparation

vec → Vn.8B 

0 <= lane <= 7

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

int16_t vduph_lane_s16 (int16x4_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Hd,Vn.H[lane]

Argument Preparation

vec → Vn.4H 

0 <= lane <= 3

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

int32_t vdups_lane_s32 (int32x2_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Sd,Vn.S[lane]

Argument Preparation

vec → Vn.2S 

0 <= lane <= 1

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

int64_t vdupd_lane_s64 (int64x1_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Dd,Vn.D[lane]

Argument Preparation

vec → Vn.1D 

0 <= lane <= 0

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

uint8_t vdupb_lane_u8 (uint8x8_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Bd,Vn.B[lane]

Argument Preparation

vec → Vn.8B 

0 <= lane <= 7

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

uint16_t vduph_lane_u16 (uint16x4_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Hd,Vn.H[lane]

Argument Preparation

vec → Vn.4H 

0 <= lane <= 3

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

uint32_t vdups_lane_u32 (uint32x2_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Sd,Vn.S[lane]

Argument Preparation

vec → Vn.2S 

0 <= lane <= 1

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

uint64_t vdupd_lane_u64 (uint64x1_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Dd,Vn.D[lane]

Argument Preparation

vec → Vn.1D 

0 <= lane <= 0

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

float32_t vdups_lane_f32 (float32x2_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Sd,Vn.S[lane]

Argument Preparation

vec → Vn.2S 

0 <= lane <= 1

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64
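
The scalar-result forms such as vdups_lane_f32 return a single lane as an ordinary C value. As a small sketch of that, the function below adds the two lanes of a float32x2_t. The name sum_lanes_f32 is hypothetical, and the fallback branch is an assumed scalar equivalent for non-AArch64 targets.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Return src[0] + src[1], reading each lane out as a scalar. */
float sum_lanes_f32(const float src[2])
{
    assert(src != NULL);
#if defined(__aarch64__)
    float32x2_t v = vld1_f32(src);
    /* Each lane index must be a constant in the range 0..1. */
    return vdups_lane_f32(v, 0) + vdups_lane_f32(v, 1);
#else
    return src[0] + src[1];
#endif
}
```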

float64_t vdupd_lane_f64 (float64x1_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Dd,Vn.D[lane]

Argument Preparation

vec → Vn.1D 

0 <= lane <= 0

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

poly8_t vdupb_lane_p8 (poly8x8_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Bd,Vn.B[lane]

Argument Preparation

vec → Vn.8B 

0 <= lane <= 7

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

poly16_t vduph_lane_p16 (poly16x4_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Hd,Vn.H[lane]

Argument Preparation

vec → Vn.4H 

0 <= lane <= 3

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

int8_t vdupb_laneq_s8 (int8x16_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Bd,Vn.B[lane]

Argument Preparation

vec → Vn.16B 

0 <= lane <= 15

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

int16_t vduph_laneq_s16 (int16x8_t vec, const int lane)Duplicate general-purpose register to vector

Description

A64 Instruction

DUP Hd,Vn.H[lane]

Argument Preparation

vec → Vn.8H 

0 <= lane <= 7

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64
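
The sketch below shows one possible use of vduph_laneq_s16: picking the highest lane (7) out of an eight-lane vector. The helper name last_lane_s16x8 and the scalar fallback are assumptions for illustration only; vld1q_s16 is used just to populate the vector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: returns element 7 of eight int16_t values. */
int16_t last_lane_s16x8(const int16_t src[8])
{
    assert(src != NULL);
#if defined(__aarch64__)
    int16x8_t vec = vld1q_s16(src);     /* eight 16-bit lanes in a Q register */
    return vduph_laneq_s16(vec, 7);     /* valid lanes are 0..7 */
#else
    return src[7];                      /* scalar model of the same result */
#endif
}
```

The lane argument is an immediate, so an index outside 0..7 is expected to be rejected at compile time rather than at run time.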

int32_t vdups_laneq_s32 (int32x4_t vec, const int lane)Duplicate vector element to vector or scalar

Description

A64 Instruction

DUP Sd,Vn.S[lane]

Argument Preparation

vec → Vn.4S 

0 <= lane <= 3

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

int64_t vdupd_laneq_s64 (int64x2_t vec, const int lane)Duplicate vector element to vector or scalar

Description

A64 Instruction

DUP Dd,Vn.D[lane]

Argument Preparation

vec → Vn.2D 

0 <= lane <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

uint8_t vdupb_laneq_u8 (uint8x16_t vec, const int lane)Duplicate vector element to vector or scalar

Description

A64 Instruction

DUP Bd,Vn.B[lane]

Argument Preparation

vec → Vn.16B 

0 <= lane <= 15

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

uint16_t vduph_laneq_u16 (uint16x8_t vec, const int lane)Duplicate vector element to vector or scalar

Description

A64 Instruction

DUP Hd,Vn.H[lane]

Argument Preparation

vec → Vn.8H 

0 <= lane <= 7

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

uint32_t vdups_laneq_u32 (uint32x4_t vec, const int lane)Duplicate vector element to vector or scalar

Description

A64 Instruction

DUP Sd,Vn.S[lane]

Argument Preparation

vec → Vn.4S 

0 <= lane <= 3

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

uint64_t vdupd_laneq_u64 (uint64x2_t vec, const int lane)Duplicate vector element to vector or scalar

Description

A64 Instruction

DUP Dd,Vn.D[lane]

Argument Preparation

vec → Vn.2D 

0 <= lane <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

float32_t vdups_laneq_f32 (float32x4_t vec, const int lane)Duplicate vector element to vector or scalar

Description

A64 Instruction

DUP Sd,Vn.S[lane]

Argument Preparation

vec → Vn.4S 

0 <= lane <= 3

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64
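
Since the lane argument of vdups_laneq_f32 is a constant, a caller that only knows the index at run time needs some form of dispatch. The sketch below shows one way to do that with a switch; the helper name lane_f32x4 and the scalar fallback are assumptions for illustration and do not come from the reference.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: returns src[lane] for lane in 0..3 via a constant-lane extract. */
float lane_f32x4(const float src[4], int lane)
{
    assert(src != NULL && lane >= 0 && lane <= 3);
#if defined(__aarch64__)
    float32x4_t vec = vld1q_f32(src);   /* four 32-bit float lanes */
    switch (lane) {                     /* each call site uses an immediate lane */
    case 0:  return vdups_laneq_f32(vec, 0);
    case 1:  return vdups_laneq_f32(vec, 1);
    case 2:  return vdups_laneq_f32(vec, 2);
    default: return vdups_laneq_f32(vec, 3);
    }
#else
    return src[lane];                   /* scalar model of the same result */
#endif
}
```

When the index is already known at compile time, calling the intrinsic directly avoids the branch.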

float64_t vdupd_laneq_f64 (float64x2_t vec, const int lane)Duplicate vector element to vector or scalar

Description

A64 Instruction

DUP Dd,Vn.D[lane]

Argument Preparation

vec → Vn.2D 

0 <= lane <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

poly8_t vdupb_laneq_p8 (poly8x16_t vec, const int lane)Duplicate vector element to vector or scalar

Description

A64 Instruction

DUP Bd,Vn.B[lane]

Argument Preparation

vec → Vn.16B 

0 <= lane <= 15

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

poly16_t vduph_laneq_p16 (poly16x8_t vec, const int lane)Duplicate vector element to vector or scalar

Description

A64 Instruction

DUP Hd,Vn.H[lane]

Argument Preparation

vec → Vn.8H 

0 <= lane <= 7

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(datasize) result;

for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

int8x8_t vld1_s8 (int8_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.

A64 Instruction

LD1 {Vt.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.8B → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
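
A minimal sketch of how vld1_s8 might be used: load eight consecutive int8_t values, add the vector to itself with vadd_s8, and write the result back with vst1_s8. The helper name double_s8x8 and the scalar fallback are assumptions for illustration; the inputs are assumed small enough that doubling does not leave the int8_t range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: dst[i] = src[i] + src[i] for i in 0..7. */
void double_s8x8(const int8_t src[8], int8_t dst[8])
{
    assert(src != NULL && dst != NULL);
#if defined(__ARM_NEON)
    int8x8_t v = vld1_s8(src);        /* loads 8 contiguous bytes into a 64-bit vector */
    vst1_s8(dst, vadd_s8(v, v));      /* lane-wise add, then store 8 bytes */
#else
    for (int i = 0; i < 8; ++i)       /* scalar model; assumes no overflow */
        dst[i] = (int8_t)(src[i] + src[i]);
#endif
}
```

The pointer passed to vld1_s8 has to reference at least eight readable elements, since the whole vector is loaded in one access.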

int8x16_t vld1q_s8 (int8_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

A64 Instruction

LD1 {Vt.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.16B → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int16x4_t vld1_s16 (int16_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

A64 Instruction

LD1 {Vt.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.4H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int16x8_t vld1q_s16 (int16_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

A64 Instruction

LD1 {Vt.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.8H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int32x2_t vld1_s32 (int32_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

A64 Instruction

LD1 {Vt.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.2S → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int32x4_t vld1q_s32 (int32_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

A64 Instruction

LD1 {Vt.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.4S → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
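
One common pattern for vld1q_s32 is walking an array four elements at a time. The sketch below sums an int32_t array that way, using vaddq_s32 to accumulate and the A64-only vaddvq_s32 to reduce across lanes. The helper name sum_s32 and the scalar tail and fallback are assumptions for illustration, and the total is assumed to fit in int32_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: returns the sum of n int32_t values. */
int32_t sum_s32(const int32_t *src, size_t n)
{
    int32_t total = 0;
    size_t i = 0;
    assert(src != NULL || n == 0);
#if defined(__aarch64__)
    int32x4_t acc = vdupq_n_s32(0);
    for (; i + 4 <= n; i += 4)                    /* full 128-bit blocks */
        acc = vaddq_s32(acc, vld1q_s32(src + i)); /* load 4 lanes, add lane-wise */
    total = vaddvq_s32(acc);                      /* add the 4 lanes together */
#endif
    for (; i < n; ++i)                            /* remaining elements */
        total += src[i];
    return total;
}
```

The scalar loop handles the tail when n is not a multiple of four, and handles the whole array on targets without AArch64 NEON.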

int64x1_t vld1_s64 (int64_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

A64 Instruction

LD1 {Vt.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.1D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int64x2_t vld1q_s64 (int64_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

A64 Instruction

LD1 {Vt.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.2D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint8x8_t vld1_u8 (uint8_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

A64 Instruction

LD1 {Vt.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.8B → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint8x16_t vld1q_u8 (uint8_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

A64 Instruction

LD1 {Vt.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.16B → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
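
As a small usage sketch for vld1q_u8, the function below loads sixteen bytes and returns the largest one using the A64-only across-lanes reduction vmaxvq_u8. The helper name max_u8x16 and the scalar fallback are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: returns the maximum of sixteen uint8_t values. */
uint8_t max_u8x16(const uint8_t src[16])
{
    assert(src != NULL);
#if defined(__aarch64__)
    uint8x16_t v = vld1q_u8(src);   /* sixteen 8-bit lanes in one Q register */
    return vmaxvq_u8(v);            /* unsigned maximum across all lanes */
#else
    uint8_t m = src[0];             /* scalar model of the same result */
    for (int i = 1; i < 16; ++i)
        if (src[i] > m)
            m = src[i];
    return m;
#endif
}
```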

uint16x4_t vld1_u16 (uint16_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

A64 Instruction

LD1 {Vt.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.4H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint16x8_t vld1q_u16 (uint16_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

A64 Instruction

LD1 {Vt.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.8H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint32x2_t vld1_u32 (uint32_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

A64 Instruction

LD1 {Vt.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.2S → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint32x4_t vld1q_u32 (uint32_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

A64 Instruction

LD1 {Vt.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.4S → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint64x1_t vld1_u64 (uint64_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

A64 Instruction

LD1 {Vt.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.1D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint64x2_t vld1q_u64 (uint64_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

A64 Instruction

LD1 {Vt.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.2D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly64x1_t vld1_p64 (poly64_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

A64 Instruction

LD1 {Vt.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.1D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

poly64x2_t vld1q_p64 (poly64_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

A64 Instruction

LD1 {Vt.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.2D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

float16x4_t vld1_f16 (float16_t const * ptr)

Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.4H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float16x8_t vld1q_f16 (float16_t const * ptr)

Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.8H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float32x2_t vld1_f32 (float32_t const * ptr)

Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.2S → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float32x4_t vld1q_f32 (float32_t const * ptr)

Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.4S → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
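
As a usage illustration for vld1q_f32, the sketch below sums an array four floats at a time. The function name sum_f32 is hypothetical, and the scalar fallback taken when __ARM_NEON is not defined is an assumption added so the sketch also builds on non-Arm hosts; it is not part of the reference.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: returns the sum of n floats starting at data. */
float sum_f32(const float *data, size_t n)
{
    assert(data != NULL || n == 0);

    float total = 0.0f;
    size_t i = 0;

#if defined(__ARM_NEON)
    /* Accumulate four lanes per iteration; each vld1q_f32 reads 16 bytes. */
    float32x4_t acc = vdupq_n_f32(0.0f);
    for (; i + 4 <= n; i += 4) {
        acc = vaddq_f32(acc, vld1q_f32(data + i));
    }
    /* Spill the accumulator and add its lanes (works on v7 and A64). */
    float lanes[4];
    vst1q_f32(lanes, acc);
    total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif

    /* Scalar tail, and the whole loop when NEON is unavailable. */
    for (; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```

Because the vector path adds elements in a different order than a plain loop, results for general floating-point data may differ in the last bits between the two paths.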

poly8x8_t vld1_p8 (poly8_t const * ptr)

Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.8B → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly8x16_t vld1q_p8 (poly8_t const * ptr)

Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.16B → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly16x4_t vld1_p16 (poly16_t const * ptr)

Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.4H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly16x8_t vld1q_p16 (poly16_t const * ptr)

Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.8H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float64x1_t vld1_f64 (float64_t const * ptr)

Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.1D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

float64x2_t vld1q_f64 (float64_t const * ptr)

Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.2D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
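
Since vld1q_f64 is listed for A64 only, code that uses it usually needs a guard. The following is a minimal sketch of one way to do that; the function name sum_f64 and the macro HAVE_A64_NEON are hypothetical, and the scalar fallback is an assumption added for portability.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define HAVE_A64_NEON 1   /* hypothetical guard macro for this sketch */
#endif

/* Hypothetical helper: returns the sum of n doubles starting at data. */
double sum_f64(const double *data, size_t n)
{
    assert(data != NULL || n == 0);

    double total = 0.0;
    size_t i = 0;

#ifdef HAVE_A64_NEON
    /* Two 64-bit lanes per load: LD1 {Vt.2D},[Xn]. */
    float64x2_t acc = vdupq_n_f64(0.0);
    for (; i + 2 <= n; i += 2) {
        acc = vaddq_f64(acc, vld1q_f64(data + i));
    }
    total = vaddvq_f64(acc);   /* add across both lanes (A64 only) */
#endif

    /* Scalar tail, and the whole loop on targets without A64 NEON. */
    for (; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```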

int8x8_t vld1_lane_s8 (int8_t const * ptr, int8x8_t src, const int lane)

Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.B}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.8B 

0 <= lane <= 7

Results

Vt.8B → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
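
To illustrate the lane form, the sketch below replaces lane 3 of an eight-byte vector with a byte read from memory and leaves the other lanes unchanged. The function name load_byte_into_lane3 is hypothetical, the lane is fixed because the intrinsic requires a compile-time constant, and the memcpy fallback for non-NEON builds is an assumption added for portability.

```c
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out = src with lane 3 replaced by *ptr. */
void load_byte_into_lane3(const int8_t *ptr, const int8_t src[8], int8_t out[8])
{
#if defined(__ARM_NEON)
    int8x8_t v = vld1_s8(src);        /* load all eight lanes of src */
    v = vld1_lane_s8(ptr, v, 3);      /* lane must be a constant in 0..7 */
    vst1_s8(out, v);
#else
    /* Scalar model of the same effect. */
    memcpy(out, src, 8);
    out[3] = *ptr;
#endif
}
```

The intrinsic returns a new vector value; src itself is not modified.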

int8x16_t vld1q_lane_s8 (int8_t const * ptr, int8x16_t src, const int lane)

Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.B}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.16B 

0 <= lane <= 15

Results

Vt.16B → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int16x4_t vld1_lane_s16 (int16_t const * ptr, int16x4_t src, const int lane)

Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.H}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.4H 

0 <= lane <= 3

Results

Vt.4H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int16x8_t vld1q_lane_s16 (int16_t const * ptr, int16x8_t src, const int lane)

Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.H}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.8H 

0 <= lane <= 7

Results

Vt.8H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int32x2_t vld1_lane_s32 (int32_t const * ptr, int32x2_t src, const int lane)

Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.S}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.2S 

0 <= lane <= 1

Results

Vt.2S → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int32x4_t vld1q_lane_s32 (int32_t const * ptr, int32x4_t src, const int lane)

Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.S}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.4S 

0 <= lane <= 3

Results

Vt.4S → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
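
One common use of the lane loads is gathering elements that are not contiguous in memory. The sketch below fills the four lanes of a vector from a strided array with four vld1q_lane_s32 calls. The function name gather4_s32 and the stride parameter are hypothetical, and the scalar loop used when __ARM_NEON is not defined is an assumption added for portability.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] = base[i * stride] for i = 0..3. */
void gather4_s32(const int32_t *base, size_t stride, int32_t out[4])
{
    assert(base != NULL && out != NULL);

#if defined(__ARM_NEON)
    int32x4_t v = vdupq_n_s32(0);
    /* Each call inserts one 32-bit element; the lane index is a constant. */
    v = vld1q_lane_s32(base + 0 * stride, v, 0);
    v = vld1q_lane_s32(base + 1 * stride, v, 1);
    v = vld1q_lane_s32(base + 2 * stride, v, 2);
    v = vld1q_lane_s32(base + 3 * stride, v, 3);
    vst1q_s32(out, v);
#else
    /* Scalar model of the same gather. */
    for (size_t i = 0; i < 4; ++i) {
        out[i] = base[i * stride];
    }
#endif
}
```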

int64x1_t vld1_lane_s64 (int64_t const * ptr, int64x1_t src, const int lane)

Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.D}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.1D 

0 <= lane <= 0

Results

Vt.1D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int64x2_t vld1q_lane_s64 (int64_t const * ptr, int64x2_t src, const int lane)

Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.D}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.2D 

0 <= lane <= 1

Results

Vt.2D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
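
The load branch of the Operation pseudocode, Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC], can be modelled in portable C. This is only an illustrative sketch: the function name model_ld1_lane is hypothetical, and it assumes the 128-bit register is represented as a 16-byte little-endian image, so that lane index occupies bytes index*ebytes onward. Alignment checks, MTE tagging and writeback are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical scalar model of the single-lane load:
 * copies ebytes bytes from mem into lane `index` of reg,
 * leaving every other byte of the register image unchanged.
 */
void model_ld1_lane(uint8_t reg[16], const void *mem,
                    unsigned index, unsigned ebytes)
{
    /* esize is 8, 16, 32 or 64 bits, so ebytes is 1, 2, 4 or 8. */
    assert(ebytes == 1 || ebytes == 2 || ebytes == 4 || ebytes == 8);
    /* The lane must lie inside the 128-bit register. */
    assert(index < 16 / ebytes);

    memcpy(reg + (size_t)index * ebytes, mem, ebytes);
}
```

For a 64-bit vector type such as int64x1_t only the low eight bytes of the image are meaningful, which is why the only valid lane for vld1_lane_s64 is 0.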

uint8x8_t vld1_lane_u8 (uint8_t const * ptr, uint8x8_t src, const int lane)

Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.B}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.8B 

0 <= lane <= 7

Results

Vt.8B → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint8x16_t vld1q_lane_u8 (uint8_t const * ptr, uint8x16_t src, const int lane)

Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.B}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.16B 

0 <= lane <= 15

Results

Vt.16B → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
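As a rough illustration of the MemOp_LOAD branch above, the following sketch models the lane insert in portable C and, when NEON is available, compares it against `vld1q_lane_u8`. The names `ld1_lane_u8_model` and `ld1_lane_u8_demo` are hypothetical helpers introduced here, not part of the NEON API, and lane 3 is an arbitrary choice.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical scalar model of the load branch for esize = 8:
 * copy the source register, then overwrite exactly one lane. */
void ld1_lane_u8_model(const uint8_t *ptr, const uint8_t src[16],
                       int lane, uint8_t out[16])
{
    assert(lane >= 0 && lane <= 15);  /* 0 <= lane <= 15 */
    memcpy(out, src, 16);             /* rval = V[t]                  */
    out[lane] = *ptr;                 /* Elem[rval, index, esize] = Mem[...] */
}

/* Returns 1 if lane 3 was replaced and all other lanes were preserved. */
int ld1_lane_u8_demo(void)
{
    uint8_t src[16], out[16];
    const uint8_t value = 0xAB;

    for (int i = 0; i < 16; i++)
        src[i] = (uint8_t)i;

    ld1_lane_u8_model(&value, src, 3, out);

#if defined(__ARM_NEON)
    /* The lane argument must be a compile-time constant. */
    uint8_t hw[16];
    uint8x16_t v = vld1q_u8(src);
    v = vld1q_lane_u8(&value, v, 3);
    vst1q_u8(hw, v);
    if (memcmp(hw, out, 16) != 0)
        return 0;
#endif

    for (int i = 0; i < 16; i++) {
        uint8_t expected = (i == 3) ? value : (uint8_t)i;
        if (out[i] != expected)
            return 0;
    }
    return 1;
}
```

On targets without NEON only the scalar model runs, so the sketch still compiles and can be used to check expectations about lane numbering.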

uint16x4_t vld1_lane_u16 (uint16_t const * ptr, uint16x4_t src, const int lane)Load one single-element structure to one lane of one register

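Description

Load one single-element structure to one lane of one register. This instruction loads a single-element structure from memory and writes the result to the specified lane of the SIMD&FP register without affecting the other bits of the register.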

A64 Instruction

LD1 {Vt.H}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.4H 

0 <= lane <= 3

Results

Vt.4H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint16x8_t vld1q_lane_u16 (uint16_t const * ptr, uint16x8_t src, const int lane)Load one single-element structure to one lane of one register

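Description

Load one single-element structure to one lane of one register. This instruction loads a single-element structure from memory and writes the result to the specified lane of the SIMD&FP register without affecting the other bits of the register.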

A64 Instruction

LD1 {Vt.H}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.8H 

0 <= lane <= 7

Results

Vt.8H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
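The write-back step at the end of the pseudocode can be read as a small address computation. The sketch below models it in plain C under the assumption that `selem` and `ebytes` are passed in as ordinary integers; `ld1_writeback_base` is a hypothetical helper name. The A64 form listed for this intrinsic, `LD1 {Vt.H}[lane],[Xn]`, has no post-index operand, so this branch appears not to apply to the intrinsic itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the "if wback" block:
 *  - after the loop, offs holds selem * ebytes (bytes transferred);
 *  - if a register offset is given (m != 31), it replaces that value;
 *  - the new base is address + offs. */
uint64_t ld1_writeback_base(uint64_t address, unsigned selem, unsigned ebytes,
                            unsigned m, uint64_t xm)
{
    assert(m <= 31);

    uint64_t offs = (uint64_t)selem * ebytes;  /* accumulated in the loop */
    if (m != 31)
        offs = xm;                             /* offs = X[m] */
    return address + offs;                     /* written to X[n] or SP */
}
```

For a single halfword element (`selem = 1`, `ebytes = 2`) with `m == 31`, the base advances by 2 bytes; with a register offset it advances by whatever that register holds.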

uint32x2_t vld1_lane_u32 (uint32_t const * ptr, uint32x2_t src, const int lane)Load one single-element structure to one lane of one register

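Description

Load one single-element structure to one lane of one register. This instruction loads a single-element structure from memory and writes the result to the specified lane of the SIMD&FP register without affecting the other bits of the register.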

A64 Instruction

LD1 {Vt.S}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.2S 

0 <= lane <= 1

Results

Vt.2S → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint32x4_t vld1q_lane_u32 (uint32_t const * ptr, uint32x4_t src, const int lane)Load one single-element structure to one lane of one register

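Description

Load one single-element structure to one lane of one register. This instruction loads a single-element structure from memory and writes the result to the specified lane of the SIMD&FP register without affecting the other bits of the register.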

A64 Instruction

LD1 {Vt.S}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.4S 

0 <= lane <= 3

Results

Vt.4S → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
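One common use of a lane load is assembling a vector from non-contiguous memory. The sketch below fills the four lanes of a `uint32x4_t` from strided addresses with `vld1q_lane_u32`, and falls back to a scalar loop when NEON is not available. The function name `gather4_u32` and the stride parameter are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] = base[i * stride] for i = 0..3. */
void gather4_u32(const uint32_t *base, size_t stride, uint32_t out[4])
{
    assert(base != NULL && out != NULL);

#if defined(__ARM_NEON)
    /* Each call replaces one lane and leaves the others untouched.
     * The lane index must be a compile-time constant, hence no loop. */
    uint32x4_t v = vdupq_n_u32(0);
    v = vld1q_lane_u32(base + 0 * stride, v, 0);
    v = vld1q_lane_u32(base + 1 * stride, v, 1);
    v = vld1q_lane_u32(base + 2 * stride, v, 2);
    v = vld1q_lane_u32(base + 3 * stride, v, 3);
    vst1q_u32(out, v);
#else
    /* Scalar fallback with the same observable result. */
    for (int lane = 0; lane < 4; lane++)
        out[lane] = base[(size_t)lane * stride];
#endif
}
```

Each lane load depends on the previous value of the register, so long chains of them serialize; whether this beats scalar loads followed by a single vector construction depends on the target.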

uint64x1_t vld1_lane_u64 (uint64_t const * ptr, uint64x1_t src, const int lane)Load one single-element structure to one lane of one register

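Description

Load one single-element structure to one lane of one register. This instruction loads a single-element structure from memory and writes the result to the specified lane of the SIMD&FP register without affecting the other bits of the register.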

A64 Instruction

LD1 {Vt.D}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.1D 

0 <= lane <= 0

Results

Vt.1D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint64x2_t vld1q_lane_u64 (uint64_t const * ptr, uint64x2_t src, const int lane)Load one single-element structure to one lane of one register

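Description

Load one single-element structure to one lane of one register. This instruction loads a single-element structure from memory and writes the result to the specified lane of the SIMD&FP register without affecting the other bits of the register.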

A64 Instruction

LD1 {Vt.D}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.2D 

0 <= lane <= 1

Results

Vt.2D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly64x1_t vld1_lane_p64 (poly64_t const * ptr, poly64x1_t src, const int lane)Load one single-element structure to one lane of one register

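Description

Load one single-element structure to one lane of one register. This instruction loads a single-element structure from memory and writes the result to the specified lane of the SIMD&FP register without affecting the other bits of the register.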

A64 Instruction

LD1 {Vt.D}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.1D 

0 <= lane <= 0

Results

Vt.1D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

poly64x2_t vld1q_lane_p64 (poly64_t const * ptr, poly64x2_t src, const int lane)Load one single-element structure to one lane of one register

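Description

Load one single-element structure to one lane of one register. This instruction loads a single-element structure from memory and writes the result to the specified lane of the SIMD&FP register without affecting the other bits of the register.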

A64 Instruction

LD1 {Vt.D}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.2D 

0 <= lane <= 1

Results

Vt.2D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

float16x4_t vld1_lane_f16 (float16_t const * ptr, float16x4_t src, const int lane)Load one single-element structure to one lane of one register

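Description

Load one single-element structure to one lane of one register. This instruction loads a single-element structure from memory and writes the result to the specified lane of the SIMD&FP register without affecting the other bits of the register.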

A64 Instruction

LD1 {Vt.H}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.4H 

0 <= lane <= 3

Results

Vt.4H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float16x8_t vld1q_lane_f16 (float16_t const * ptr, float16x8_t src, const int lane)Load one single-element structure to one lane of one register

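Description

Load one single-element structure to one lane of one register. This instruction loads a single-element structure from memory and writes the result to the specified lane of the SIMD&FP register without affecting the other bits of the register.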

A64 Instruction

LD1 {Vt.H}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.8H 

0 <= lane <= 7

Results

Vt.8H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float32x2_t vld1_lane_f32 (float32_t const * ptr, float32x2_t src, const int lane)Load one single-element structure to one lane of one register

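Description

Load one single-element structure to one lane of one register. This instruction loads a single-element structure from memory and writes the result to the specified lane of the SIMD&FP register without affecting the other bits of the register.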

A64 Instruction

LD1 {Vt.S}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.2S 

0 <= lane <= 1

Results

Vt.2S → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float32x4_t vld1q_lane_f32 (float32_t const * ptr, float32x4_t src, const int lane)Load one single-element structure to one lane of one register

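Description

Load one single-element structure to one lane of one register. This instruction loads a single-element structure from memory and writes the result to the specified lane of the SIMD&FP register without affecting the other bits of the register.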

A64 Instruction

LD1 {Vt.S}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.4S 

0 <= lane <= 3

Results

Vt.4S → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly8x8_t vld1_lane_p8 (poly8_t const * ptr, poly8x8_t src, const int lane)Load one single-element structure to one lane of one register

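Description

Load one single-element structure to one lane of one register. This instruction loads a single-element structure from memory and writes the result to the specified lane of the SIMD&FP register without affecting the other bits of the register.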

A64 Instruction

LD1 {Vt.B}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.8B 

0 <= lane <= 7

Results

Vt.8B → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly8x16_t vld1q_lane_p8 (poly8_t const * ptr, poly8x16_t src, const int lane)Load one single-element structure to one lane of one register

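Description

Load one single-element structure to one lane of one register. This instruction loads a single-element structure from memory and writes the result to the specified lane of the SIMD&FP register without affecting the other bits of the register.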

A64 Instruction

LD1 {Vt.B}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.16B 

0 <= lane <= 15

Results

Vt.16B → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly16x4_t vld1_lane_p16 (poly16_t const * ptr, poly16x4_t src, const int lane)Load one single-element structure to one lane of one register

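Description

Load one single-element structure to one lane of one register. This instruction loads a single-element structure from memory and writes the result to the specified lane of the SIMD&FP register without affecting the other bits of the register.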

A64 Instruction

LD1 {Vt.H}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.4H 

0 <= lane <= 3

Results

Vt.4H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly16x8_t vld1q_lane_p16 (poly16_t const * ptr, poly16x8_t src, const int lane)Load one single-element structure to one lane of one register

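Description

Load one single-element structure to one lane of one register. This instruction loads a single-element structure from memory and writes the result to the specified lane of the SIMD&FP register without affecting the other bits of the register.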

A64 Instruction

LD1 {Vt.H}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.8H 

0 <= lane <= 7

Results

Vt.8H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float64x1_t vld1_lane_f64 (float64_t const * ptr, float64x1_t src, const int lane)Load one single-element structure to one lane of one register

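Description

Load one single-element structure to one lane of one register. This instruction loads a single-element structure from memory and writes the result to the specified lane of the SIMD&FP register without affecting the other bits of the register.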

A64 Instruction

LD1 {Vt.D}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.1D 

0 <= lane <= 0

Results

Vt.1D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

float64x2_t vld1q_lane_f64 (float64_t const * ptr, float64x2_t src, const int lane)Load one single-element structure to one lane of one register

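Description

Load one single-element structure to one lane of one register. This instruction loads a single-element structure from memory and writes the result to the specified lane of the SIMD&FP register without affecting the other bits of the register.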

A64 Instruction

LD1 {Vt.D}[lane],[Xn]

Argument Preparation

ptr → Xn 

src → Vt.2D 

0 <= lane <= 1

Results

Vt.2D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
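Because the `f64` lane loads are listed as A64 only, portable code usually needs a guard around them. The sketch below assumes the compiler defines `__aarch64__` on A64 targets and uses a scalar fallback elsewhere; `ld1q_lane1_f64` is a hypothetical wrapper that always targets lane 1.

```c
#include <assert.h>
#include <string.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical wrapper: copy src[0..1] to out and replace lane 1 with *ptr. */
void ld1q_lane1_f64(const double *ptr, const double src[2], double out[2])
{
    assert(ptr != NULL);

#if defined(__aarch64__)
    /* A64 path: float64x2_t and vld1q_lane_f64 are available. */
    float64x2_t v = vld1q_f64(src);
    v = vld1q_lane_f64(ptr, v, 1);   /* 0 <= lane <= 1 */
    vst1q_f64(out, v);
#else
    /* Fallback for targets where the f64 vector types are unavailable. */
    memcpy(out, src, 2 * sizeof(double));
    out[1] = *ptr;
#endif
}
```

The same guard pattern would apply to the `p64` variants, although those are listed as A32/A64 and may need a different feature test.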

int8x8_t vld1_dup_s8 (int8_t const * ptr)Load one single-element structure and replicate to all lanes (of one register)

Description

Load one single-element structure and Replicate to all lanes (of one register). This instruction loads a single-element structure from memory and replicates the structure to all the lanes of the SIMD&FP register.

A64 Instruction

LD1R {Vt.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.8B → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int8x16_t vld1q_dup_s8 (int8_t const * ptr)Load one single-element structure and replicate to all lanes (of one register)

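Description

Load one single-element structure and Replicate to all lanes (of one register). This instruction loads a single-element structure from memory and replicates the structure to all the lanes of the SIMD&FP register.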

A64 Instruction

LD1R {Vt.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.16B → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
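
As a minimal sketch of how vld1q_dup_s8 might be used, the helper below broadcasts one byte from memory and stores the 16 lanes back so they can be inspected. The name broadcast_s8_x16 is hypothetical, and the scalar branch is only an assumed stand-in for builds without NEON; it models the replicate path of the Operation pseudocode rather than any official fallback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: copy the byte at *src into all 16 bytes of dst. */
static void broadcast_s8_x16(const int8_t *src, int8_t dst[16])
{
    assert(src != NULL && dst != NULL);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    int8x16_t v = vld1q_dup_s8(src); /* one load, replicated to 16 lanes */
    vst1q_s8(dst, v);                /* write the lanes out for inspection */
#else
    /* Scalar model of the replicate path, used when NEON is unavailable. */
    for (size_t lane = 0; lane < 16; ++lane)
        dst[lane] = *src;
#endif
}
```

Calling broadcast_s8_x16(&value, lanes) with value set to -7 should leave every entry of lanes equal to -7.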

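int16x4_t vld1_dup_s16 (int16_t const * ptr)

Load one single-element structure and replicate to all lanes (of one register)

Description

Loads one signed 16-bit element from the address in ptr and replicates it to all 4 lanes of the 64-bit result vector.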

A64 Instruction

LD1R {Vt.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.4H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

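int16x8_t vld1q_dup_s16 (int16_t const * ptr)

Load one single-element structure and replicate to all lanes (of one register)

Description

Loads one signed 16-bit element from the address in ptr and replicates it to all 8 lanes of the 128-bit result vector.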

A64 Instruction

LD1R {Vt.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.8H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

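int32x2_t vld1_dup_s32 (int32_t const * ptr)

Load one single-element structure and replicate to all lanes (of one register)

Description

Loads one signed 32-bit element from the address in ptr and replicates it to both lanes of the 64-bit result vector.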

A64 Instruction

LD1R {Vt.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.2S → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

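int32x4_t vld1q_dup_s32 (int32_t const * ptr)

Load one single-element structure and replicate to all lanes (of one register)

Description

Loads one signed 32-bit element from the address in ptr and replicates it to all 4 lanes of the 128-bit result vector.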

A64 Instruction

LD1R {Vt.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.4S → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
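
A common reason to use vld1q_dup_s32 is to turn a scalar that lives in memory into a vector operand. The sketch below multiplies an array by such a scalar. The function name scale_s32 is hypothetical, the tail loop also serves as an assumed scalar fallback when NEON is not available, and the caller is assumed to keep the products within the int32_t range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: data[i] *= *factor for i in [0, n). */
static void scale_s32(int32_t *data, size_t n, const int32_t *factor)
{
    assert(data != NULL && factor != NULL);
    size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    int32x4_t f = vld1q_dup_s32(factor);      /* factor in all 4 lanes */
    for (; i + 4 <= n; i += 4) {
        int32x4_t v = vld1q_s32(data + i);    /* load 4 elements */
        vst1q_s32(data + i, vmulq_s32(v, f)); /* lane-wise multiply, store */
    }
#endif
    /* Remaining elements (or all of them without NEON). */
    for (; i < n; ++i)
        data[i] *= *factor;
}
```

The factor is loaded once outside the loop, so the replicated vector is reused for every block of four elements.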

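int64x1_t vld1_dup_s64 (int64_t const * ptr)

Load one single-element structure to one lane of one register

Description

Loads one signed 64-bit element from the address in ptr into the single lane of the 64-bit result vector. Because the result has only one lane, no replication takes place, and the instruction shown is LD1 rather than LD1R.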

A64 Instruction

LD1 {Vt.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.1D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
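
Since int64x1_t holds a single lane, vld1_dup_s64 simply yields the loaded value in lane 0. The sketch below reads that lane back with vget_lane_s64. The helper name load_dup_s64_lane0 is hypothetical, and the plain dereference is an assumed equivalent for builds without NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: load *src through vld1_dup_s64 and return lane 0. */
static int64_t load_dup_s64_lane0(const int64_t *src)
{
    assert(src != NULL);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    int64x1_t v = vld1_dup_s64(src); /* one element, one lane */
    return vget_lane_s64(v, 0);      /* the only valid lane index is 0 */
#else
    return *src;                     /* scalar stand-in without NEON */
#endif
}
```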

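int64x2_t vld1q_dup_s64 (int64_t const * ptr)

Load one single-element structure and replicate to all lanes (of one register)

Description

Loads one signed 64-bit element from the address in ptr and replicates it to both lanes of the 128-bit result vector.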

A64 Instruction

LD1R {Vt.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.2D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

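uint8x8_t vld1_dup_u8 (uint8_t const * ptr)

Load one single-element structure and replicate to all lanes (of one register)

Description

Loads one unsigned 8-bit element from the address in ptr and replicates it to all 8 lanes of the 64-bit result vector.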

A64 Instruction

LD1R {Vt.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.8B → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

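uint8x16_t vld1q_dup_u8 (uint8_t const * ptr)

Load one single-element structure and replicate to all lanes (of one register)

Description

Loads one unsigned 8-bit element from the address in ptr and replicates it to all 16 lanes of the 128-bit result vector.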

A64 Instruction

LD1R {Vt.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.16B → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
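
One possible use of vld1q_dup_u8 is applying a per-image brightness bias to 8-bit pixels. The sketch below pairs it with the saturating add vqaddq_u8 so results clamp at 255. The function name add_bias_u8 is hypothetical, and the tail loop is an assumed scalar equivalent that also covers builds without NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: pixels[i] = min(pixels[i] + *bias, 255). */
static void add_bias_u8(uint8_t *pixels, size_t n, const uint8_t *bias)
{
    assert(pixels != NULL && bias != NULL);
    size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint8x16_t b = vld1q_dup_u8(bias);           /* bias in all 16 lanes */
    for (; i + 16 <= n; i += 16) {
        uint8x16_t p = vld1q_u8(pixels + i);     /* load 16 pixels */
        vst1q_u8(pixels + i, vqaddq_u8(p, b));   /* saturating add, store */
    }
#endif
    /* Remaining pixels (or all of them without NEON). */
    for (; i < n; ++i) {
        unsigned sum = (unsigned)pixels[i] + *bias;
        pixels[i] = (uint8_t)(sum > 255u ? 255u : sum);
    }
}
```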

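uint16x4_t vld1_dup_u16 (uint16_t const * ptr)

Load one single-element structure and replicate to all lanes (of one register)

Description

Loads one unsigned 16-bit element from the address in ptr and replicates it to all 4 lanes of the 64-bit result vector.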

A64 Instruction

LD1R {Vt.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.4H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

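uint16x8_t vld1q_dup_u16 (uint16_t const * ptr)

Load one single-element structure and replicate to all lanes (of one register)

Description

Loads one unsigned 16-bit element from the address in ptr and replicates it to all 8 lanes of the 128-bit result vector.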

A64 Instruction

LD1R {Vt.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.8H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

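uint32x2_t vld1_dup_u32 (uint32_t const * ptr)

Load one single-element structure and replicate to all lanes (of one register)

Description

Loads one unsigned 32-bit element from the address in ptr and replicates it to both lanes of the 64-bit result vector.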

A64 Instruction

LD1R {Vt.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.2S → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

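uint32x4_t vld1q_dup_u32 (uint32_t const * ptr)

Load one single-element structure and replicate to all lanes (of one register)

Description

Loads one unsigned 32-bit element from the address in ptr and replicates it to all 4 lanes of the 128-bit result vector.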

A64 Instruction

LD1R {Vt.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.4S → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
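
The replicated vector from vld1q_dup_u32 can also serve as a bit mask applied to every lane. The sketch below combines it with vandq_u32. The function name mask_u32 is hypothetical, and the tail loop is an assumed scalar equivalent that also handles builds without NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: words[i] &= *mask for i in [0, n). */
static void mask_u32(uint32_t *words, size_t n, const uint32_t *mask)
{
    assert(words != NULL && mask != NULL);
    size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint32x4_t m = vld1q_dup_u32(mask);         /* mask in all 4 lanes */
    for (; i + 4 <= n; i += 4) {
        uint32x4_t w = vld1q_u32(words + i);    /* load 4 words */
        vst1q_u32(words + i, vandq_u32(w, m));  /* lane-wise AND, store */
    }
#endif
    /* Remaining words (or all of them without NEON). */
    for (; i < n; ++i)
        words[i] &= *mask;
}
```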

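uint64x1_t vld1_dup_u64 (uint64_t const * ptr)

Load one single-element structure to one lane of one register

Description

Loads one unsigned 64-bit element from the address in ptr into the single lane of the 64-bit result vector. Because the result has only one lane, no replication takes place, and the instruction shown is LD1 rather than LD1R.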

A64 Instruction

LD1 {Vt.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.1D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

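uint64x2_t vld1q_dup_u64 (uint64_t const * ptr)

Load one single-element structure and replicate to all lanes (of one register)

Description

Loads one unsigned 64-bit element from the address in ptr and replicates it to both lanes of the 128-bit result vector.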

A64 Instruction

LD1R {Vt.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.2D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

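poly64x1_t vld1_dup_p64 (poly64_t const * ptr)

Load one single-element structure to one lane of one register

Description

Loads one 64-bit polynomial element from the address in ptr into the single lane of the 64-bit result vector. Because the result has only one lane, no replication takes place, and the instruction shown is LD1 rather than LD1R.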

A64 Instruction

LD1 {Vt.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.1D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

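poly64x2_t vld1q_dup_p64 (poly64_t const * ptr)

Load one single-element structure and replicate to all lanes (of one register)

Description

Loads one 64-bit polynomial element from the address in ptr and replicates it to both lanes of the 128-bit result vector.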

A64 Instruction

LD1R {Vt.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.2D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

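float16x4_t vld1_dup_f16 (float16_t const * ptr)

Load one single-element structure and replicate to all lanes (of one register)

Description

Loads one half-precision (16-bit) floating-point element from the address in ptr and replicates it to all 4 lanes of the 64-bit result vector.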

A64 Instruction

LD1R {Vt.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.4H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

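float16x8_t vld1q_dup_f16 (float16_t const * ptr)

Load one single-element structure and replicate to all lanes (of one register)

Description

Loads one half-precision (16-bit) floating-point element from the address in ptr and replicates it to all 8 lanes of the 128-bit result vector.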

A64 Instruction

LD1R {Vt.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.8H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

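float32x2_t vld1_dup_f32 (float32_t const * ptr)

Load one single-element structure and replicate to all lanes (of one register)

Description

Loads one single-precision (32-bit) floating-point element from the address in ptr and replicates it to both lanes of the 64-bit result vector.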

A64 Instruction

LD1R {Vt.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.2S → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

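float32x4_t vld1q_dup_f32 (float32_t const * ptr)

Load one single-element structure and replicate to all lanes (of one register)

Description

Loads one single-precision (32-bit) floating-point element from the address in ptr and replicates it to all 4 lanes of the 128-bit result vector.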

A64 Instruction

LD1R {Vt.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.4S → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
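
A typical use of vld1q_dup_f32 is to broadcast one scalar from memory and apply it to a whole array. The sketch below assumes a hypothetical helper name, scale_f32, and an in-place multiply; neither comes from this reference. The intrinsic path is compiled only when __ARM_NEON is defined, and a scalar loop handles the tail as well as targets without NEON.

```c
#include <stddef.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: multiply n floats in place by the value at *gain. */
static void scale_f32(float *data, size_t n, const float *gain)
{
    size_t i = 0;
#if defined(__ARM_NEON)
    /* One memory read; the value is copied into all four lanes. */
    const float32x4_t g = vld1q_dup_f32(gain);
    for (; i + 4 <= n; i += 4) {
        const float32x4_t v = vld1q_f32(data + i);
        vst1q_f32(data + i, vmulq_f32(v, g));
    }
#endif
    /* Scalar tail; this is the whole loop on targets without NEON. */
    for (; i < n; ++i)
        data[i] *= *gain;
}
```

Calling scale_f32(d, 6, &g) with g equal to 2.0f doubles all six elements: the first four go through the vector path where it is available, the last two through the scalar tail.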

poly8x8_t vld1_dup_p8 (poly8_t const * ptr)Load one single-element structure and replicate to all lanes (of one register)

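Description

Load one single-element structure and replicate to all lanes (of one register). This instruction loads a single-element structure from memory and replicates the structure to all the lanes of the SIMD&FP register. Here the 8-bit element at ptr fills all eight lanes of the result.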

A64 Instruction

LD1R {Vt.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.8B → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly8x16_t vld1q_dup_p8 (poly8_t const * ptr)Load one single-element structure and replicate to all lanes (of one register)

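Description

Load one single-element structure and replicate to all lanes (of one register). This instruction loads a single-element structure from memory and replicates the structure to all the lanes of the SIMD&FP register. Here the 8-bit element at ptr fills all sixteen lanes of the result.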

A64 Instruction

LD1R {Vt.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.16B → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly16x4_t vld1_dup_p16 (poly16_t const * ptr)Load one single-element structure and replicate to all lanes (of one register)

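Description

Load one single-element structure and replicate to all lanes (of one register). This instruction loads a single-element structure from memory and replicates the structure to all the lanes of the SIMD&FP register. Here the 16-bit element at ptr fills all four lanes of the result.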

A64 Instruction

LD1R {Vt.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.4H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
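
The replicate branch of the Operation pseudocode can be mirrored in portable C to make the roles of esize, ebytes and datasize concrete. This is only a scalar model under stated assumptions: ld1r_model and its parameters are hypothetical names, the register is represented as a plain byte array, and the zeroing of the upper half of the 128-bit register for 64-bit results is not modelled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of the "replicate" branch of the pseudocode.
   reg is a byte image of the destination (datasize_bits / 8 bytes long). */
static void ld1r_model(uint8_t *reg, size_t datasize_bits, size_t esize_bits,
                       const void *address)
{
    const size_t ebytes = esize_bits / 8;               /* ebytes = esize DIV 8 */
    const size_t elements = datasize_bits / esize_bits; /* datasize DIV esize   */

    /* Replicate(element, elements): copy the same element into every lane. */
    for (size_t e = 0; e < elements; ++e)
        memcpy(reg + e * ebytes, address, ebytes);
}
```

For vld1_dup_p16 the model would be called with datasize_bits = 64 and esize_bits = 16, which writes four copies of the 16-bit element into the first 8 bytes of reg.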

poly16x8_t vld1q_dup_p16 (poly16_t const * ptr)Load one single-element structure and replicate to all lanes (of one register)

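Description

Load one single-element structure and replicate to all lanes (of one register). This instruction loads a single-element structure from memory and replicates the structure to all the lanes of the SIMD&FP register. Here the 16-bit element at ptr fills all eight lanes of the result.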

A64 Instruction

LD1R {Vt.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.8H → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float64x1_t vld1_dup_f64 (float64_t const * ptr)Load one single-element structure to one lane of one register

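Description

Load one 64-bit element from memory into the single lane of a 64-bit register. Because float64x1_t has only one lane, replicating to all lanes is the same as a plain one-element load, which is why this intrinsic maps to LD1 rather than LD1R.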

A64 Instruction

LD1 {Vt.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.1D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

float64x2_t vld1q_dup_f64 (float64_t const * ptr)Load one single-element structure and replicate to all lanes (of one register)

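Description

Load one single-element structure and replicate to all lanes (of one register). This instruction loads a single-element structure from memory and replicates the structure to all the lanes of the SIMD&FP register. Here the 64-bit element at ptr fills both lanes of the result.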

A64 Instruction

LD1R {Vt.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt.2D → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
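
Because the float64 forms are listed as A64 only, portable code usually guards them separately from the 32-bit NEON path. The sketch below shows one way to do that; add_bias_f64 is a hypothetical helper name, and the check on __aarch64__ together with __ARM_NEON is an assumption about the toolchain's predefined macros (it holds for GCC and Clang).

```c
#include <stddef.h>
#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define HAVE_NEON_F64 1   /* hypothetical feature macro for this sketch */
#endif

/* Hypothetical helper: add the value at *bias to n doubles in place. */
static void add_bias_f64(double *data, size_t n, const double *bias)
{
    size_t i = 0;
#if defined(HAVE_NEON_F64)
    /* Broadcast the 64-bit element at bias into both lanes. */
    const float64x2_t b = vld1q_dup_f64(bias);
    for (; i + 2 <= n; i += 2)
        vst1q_f64(data + i, vaddq_f64(vld1q_f64(data + i), b));
#endif
    /* Scalar tail, and the full loop on v7/A32 or non-Arm targets. */
    for (; i < n; ++i)
        data[i] += *bias;
}
```

On a v7 or A32 build the guarded block is skipped and the function falls back to the scalar loop, so the same source compiles on every target.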

void vst1_s8 (int8_t * ptr, int8x8_t val)Store a single-element structure from one lane of one register

Description

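Store a single-element structure from one register. In the form used by this intrinsic, the instruction stores every element of the SIMD&FP register to memory: the eight 8-bit lanes of val are written to consecutive addresses starting at ptr, with lane 0 at the lowest address.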

A64 Instruction

ST1 {Vt.8B},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.8B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
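
As an illustration of vst1_s8 in context, the sketch below adds two groups of eight bytes and stores the result. The helper name add8_s8 is hypothetical. The scalar fallback is an assumption added so the code builds without NEON; it matches vadd_s8 only while the sums stay within the int8_t range.

```c
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: dst[i] = a[i] + b[i] for i in 0..7. */
static void add8_s8(int8_t dst[8], const int8_t a[8], const int8_t b[8])
{
#if defined(__ARM_NEON)
    /* vst1_s8 writes all eight lanes; lane i lands at dst[i]. */
    vst1_s8(dst, vadd_s8(vld1_s8(a), vld1_s8(b)));
#else
    /* Scalar fallback; assumes the sums fit in int8_t. */
    for (int i = 0; i < 8; ++i)
        dst[i] = (int8_t)(a[i] + b[i]);
#endif
}
```

The destination must have room for 8 bytes, because the store always writes the full 64-bit vector.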

void vst1q_s8 (int8_t * ptr, int8x16_t val)Store a single-element structure from one lane of one register

Description

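Store a single-element structure from one register. In the form used by this intrinsic, the instruction stores every element of the SIMD&FP register to memory: the sixteen 8-bit lanes of val are written to consecutive addresses starting at ptr, with lane 0 at the lowest address.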

A64 Instruction

ST1 {Vt.16B},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.16B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_s16 (int16_t * ptr, int16x4_t val)Store a single-element structure from one lane of one register

Description

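Store a single-element structure from one register. In the form used by this intrinsic, the instruction stores every element of the SIMD&FP register to memory: the four 16-bit lanes of val are written to consecutive addresses starting at ptr, with lane 0 at the lowest address.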

A64 Instruction

ST1 {Vt.4H},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_s16 (int16_t * ptr, int16x8_t val)Store a single-element structure from one lane of one register

Description

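Store a single-element structure from one register. In the form used by this intrinsic, the instruction stores every element of the SIMD&FP register to memory: the eight 16-bit lanes of val are written to consecutive addresses starting at ptr, with lane 0 at the lowest address.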

A64 Instruction

ST1 {Vt.8H},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
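
vst1q_s16 is often the last step after a widening operation, where a 64-bit input vector becomes a 128-bit result. A minimal sketch, assuming a hypothetical helper named widen8_s8_to_s16 and a scalar fallback for builds without NEON:

```c
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: sign-extend eight int8_t values to int16_t. */
static void widen8_s8_to_s16(int16_t dst[8], const int8_t src[8])
{
#if defined(__ARM_NEON)
    /* vmovl_s8 widens 8 x 8-bit to 8 x 16-bit; vst1q_s16 stores 16 bytes. */
    vst1q_s16(dst, vmovl_s8(vld1_s8(src)));
#else
    for (int i = 0; i < 8; ++i)
        dst[i] = src[i];   /* implicit sign extension */
#endif
}
```

The input occupies 8 bytes and the output 16 bytes, so dst needs twice the storage of src.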

void vst1_s32 (int32_t * ptr, int32x2_t val)Store a single-element structure from one lane of one register

Description

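Store a single-element structure from one register. In the form used by this intrinsic, the instruction stores every element of the SIMD&FP register to memory: the two 32-bit lanes of val are written to consecutive addresses starting at ptr, with lane 0 at the lowest address.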

A64 Instruction

ST1 {Vt.2S},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.2S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_s32 (int32_t * ptr, int32x4_t val)Store a single-element structure from one lane of one register

Description

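Store a single-element structure from one register. In the form used by this intrinsic, the instruction stores every element of the SIMD&FP register to memory: the four 32-bit lanes of val are written to consecutive addresses starting at ptr, with lane 0 at the lowest address.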

A64 Instruction

ST1 {Vt.4S},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.4S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_s64 (int64_t * ptr, int64x1_t val)Store a single-element structure from one lane of one register

Description

Store a single-element structure from one register. In the form used by this intrinsic, the instruction stores the whole 64-bit SIMD&FP register to memory: the single 64-bit lane of val is written to the address in ptr.

A64 Instruction

ST1 {Vt.1D},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_s64 (int64_t * ptr, int64x2_t val)Store a single-element structure from one lane of one register

Description

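Store a single-element structure from one register. In the form used by this intrinsic, the instruction stores every element of the SIMD&FP register to memory: the two 64-bit lanes of val are written to consecutive addresses starting at ptr, with lane 0 at the lowest address.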

A64 Instruction

ST1 {Vt.2D},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_u8 (uint8_t * ptr, uint8x8_t val)Store a single-element structure from one lane of one register

Description

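Store a single-element structure from one register. In the form used by this intrinsic, the instruction stores every element of the SIMD&FP register to memory: the eight 8-bit lanes of val are written to consecutive addresses starting at ptr, with lane 0 at the lowest address.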

A64 Instruction

ST1 {Vt.8B},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.8B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_u8 (uint8_t * ptr, uint8x16_t val)Store a single-element structure from one lane of one register

Description

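Store a single-element structure from one register. In the form used by this intrinsic, the instruction stores every element of the SIMD&FP register to memory: the sixteen 8-bit lanes of val are written to consecutive addresses starting at ptr, with lane 0 at the lowest address.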

A64 Instruction

ST1 {Vt.16B},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.16B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
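
Since vst1q_u8 always writes a full 16 bytes, a loop built on it needs a separate tail so that it never writes past the end of the buffer. The sketch below shows that pattern with a hypothetical helper named fill_u8; the scalar tail also serves as the fallback on targets without NEON.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: set dst[0..n-1] to value without overrunning dst. */
static void fill_u8(uint8_t *dst, size_t n, uint8_t value)
{
    size_t i = 0;
#if defined(__ARM_NEON)
    const uint8x16_t v = vdupq_n_u8(value);
    /* Only store while a whole 16-byte vector still fits. */
    for (; i + 16 <= n; i += 16)
        vst1q_u8(dst + i, v);
#endif
    /* Remaining 0..15 bytes (or all n bytes without NEON). */
    for (; i < n; ++i)
        dst[i] = value;
}
```

With n = 19, one vector store covers bytes 0 to 15 and the scalar loop writes bytes 16 to 18, leaving anything after that untouched.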

void vst1_u16 (uint16_t * ptr, uint16x4_t val)Store a single-element structure from one lane of one register

Description

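Store a single-element structure from one register. In the form used by this intrinsic, the instruction stores every element of the SIMD&FP register to memory: the four 16-bit lanes of val are written to consecutive addresses starting at ptr, with lane 0 at the lowest address.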

A64 Instruction

ST1 {Vt.4H},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_u16 (uint16_t * ptr, uint16x8_t val)Store a single-element structure from one lane of one register

Description

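Store a single-element structure from one register. In the form used by this intrinsic, the instruction stores every element of the SIMD&FP register to memory: the eight 16-bit lanes of val are written to consecutive addresses starting at ptr, with lane 0 at the lowest address.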

A64 Instruction

ST1 {Vt.8H},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_u32 (uint32_t * ptr, uint32x2_t val)Store a single-element structure from one lane of one register

Description

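Store a single-element structure from one register. In the form used by this intrinsic, the instruction stores every element of the SIMD&FP register to memory: the two 32-bit lanes of val are written to consecutive addresses starting at ptr, with lane 0 at the lowest address.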

A64 Instruction

ST1 {Vt.2S},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.2S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_u32 (uint32_t * ptr, uint32x4_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.4S},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.4S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
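
A common use of vst1q_u32 is writing an array in blocks of four elements. The following is a minimal sketch under assumptions: fill_u32 and EXAMPLE_HAVE_NEON are hypothetical names, and the scalar loop both handles the tail and serves as the fallback when NEON is unavailable.

```c
#include <stddef.h>
#include <stdint.h>

/* EXAMPLE_HAVE_NEON is a hypothetical macro used only in this sketch. */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define EXAMPLE_HAVE_NEON 1
#else
#define EXAMPLE_HAVE_NEON 0
#endif

/* Hypothetical helper: sets dst[0..n-1] to value. */
static void fill_u32(uint32_t *dst, size_t n, uint32_t value)
{
    size_t i = 0;
#if EXAMPLE_HAVE_NEON
    /* Broadcast the value to all four lanes once, outside the loop. */
    const uint32x4_t v = vdupq_n_u32(value);
    /* Each vst1q_u32 call writes four consecutive 32-bit elements. */
    for (; i + 4 <= n; i += 4)
        vst1q_u32(dst + i, v);
#endif
    /* Scalar tail, and the whole job when NEON is not available. */
    for (; i < n; ++i)
        dst[i] = value;
}
```

The loop bound i + 4 <= n keeps the vector store from writing past the end of the buffer when n is not a multiple of four.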

void vst1_u64 (uint64_t * ptr, uint64x1_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_u64 (uint64_t * ptr, uint64x2_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.2D},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_p64 (poly64_t * ptr, poly64x1_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

void vst1q_p64 (poly64_t * ptr, poly64x2_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.2D},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

void vst1_f16 (float16_t * ptr, float16x4_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.4H},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_f16 (float16_t * ptr, float16x8_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.8H},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_f32 (float32_t * ptr, float32x2_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.2S},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.2S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_f32 (float32_t * ptr, float32x4_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.4S},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.4S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
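
To show vst1q_f32 as the last step of a load, compute, store sequence, here is a small sketch that scales four floats. The function name scale4_f32 and the EXAMPLE_HAVE_NEON macro are hypothetical, and the scalar branch is only an assumed stand-in for targets without NEON.

```c
#include <stddef.h>

/* EXAMPLE_HAVE_NEON is a hypothetical macro used only in this sketch. */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define EXAMPLE_HAVE_NEON 1
#else
#define EXAMPLE_HAVE_NEON 0
#endif

/* Hypothetical helper: dst[i] = src[i] * k for i in 0..3. */
static void scale4_f32(float *dst, const float *src, float k)
{
#if EXAMPLE_HAVE_NEON
    /* Load four floats, multiply every lane by the scalar k. */
    float32x4_t v = vld1q_f32(src);
    v = vmulq_n_f32(v, k);
    /* Store all four lanes back to memory in lane order. */
    vst1q_f32(dst, v);
#else
    /* Scalar model of the same computation for builds without NEON. */
    for (size_t i = 0; i < 4; ++i)
        dst[i] = src[i] * k;
#endif
}
```

Both pointers must refer to at least four float elements; dst and src may be the same buffer because the load completes before the store.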

void vst1_p8 (poly8_t * ptr, poly8x8_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.8B},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.8B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_p8 (poly8_t * ptr, poly8x16_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.16B},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.16B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_p16 (poly16_t * ptr, poly16x4_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.4H},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_p16 (poly16_t * ptr, poly16x8_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.8H},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_f64 (float64_t * ptr, float64x1_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst1q_f64 (float64_t * ptr, float64x2_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.2D},[Xn]

Argument Preparation

ptr → Xn 

val → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
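
Because vst1q_f64 is listed for A64 only, portable code usually guards it. The sketch below is one possible way to do that; store_pair_f64 and EXAMPLE_HAVE_A64_NEON are hypothetical names, and the scalar branch is an assumed fallback for every other target.

```c
#include <assert.h>
#include <stddef.h>

/* EXAMPLE_HAVE_A64_NEON is a hypothetical macro used only in this sketch. */
#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define EXAMPLE_HAVE_A64_NEON 1
#else
#define EXAMPLE_HAVE_A64_NEON 0
#endif

/* Hypothetical helper: writes lo to dst[0] and hi to dst[1]. */
static void store_pair_f64(double *dst, double lo, double hi)
{
    assert(dst != NULL);
#if EXAMPLE_HAVE_A64_NEON
    /* Build a 128-bit vector with lane 0 = lo and lane 1 = hi. */
    float64x2_t v = vdupq_n_f64(lo);
    v = vsetq_lane_f64(hi, v, 1);
    /* vst1q_f64 writes both 64-bit lanes to consecutive memory. */
    vst1q_f64(dst, v);
#else
    /* Scalar model for targets where the intrinsic is not available. */
    dst[0] = lo;
    dst[1] = hi;
#endif
}
```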

void vst1_lane_s8 (int8_t * ptr, int8x8_t val, const int lane)Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.8B 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
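
Unlike the whole-register stores, vst1_lane_s8 writes exactly one element. The following sketch, under stated assumptions, copies lane 3 of an 8-byte vector to a single destination byte; store_lane3_s8 and EXAMPLE_HAVE_NEON are hypothetical names, and the scalar branch only models the behaviour when NEON is unavailable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* EXAMPLE_HAVE_NEON is a hypothetical macro used only in this sketch. */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define EXAMPLE_HAVE_NEON 1
#else
#define EXAMPLE_HAVE_NEON 0
#endif

/* Hypothetical helper: copies element 3 of an 8-byte block to *dst. */
static void store_lane3_s8(int8_t *dst, const int8_t src[8])
{
    assert(dst != NULL && src != NULL);
#if EXAMPLE_HAVE_NEON
    int8x8_t v = vld1_s8(src);
    /* The lane argument must be an integer constant in the range 0..7. */
    vst1_lane_s8(dst, v, 3);
#else
    /* Scalar model: only one byte is written. */
    *dst = src[3];
#endif
}
```

Only one int8_t is written, so the bytes next to dst are left unchanged.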

void vst1q_lane_s8 (int8_t * ptr, int8x16_t val, const int lane)Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.16B 

0 <= lane <= 15

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_lane_s16 (int16_t * ptr, int16x4_t val, const int lane)Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.4H 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_lane_s16 (int16_t * ptr, int16x8_t val, const int lane)Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.8H 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
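
As a usage illustration for vst1q_lane_s16, the sketch below stores three individual lanes of an int16x8_t into an output array. The helper name store_three_lanes_s16 and the test data are assumptions made for this example. The #else branch is a plain-C stand-in so the snippet also builds on hosts without NEON; it is not part of the intrinsic's definition.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: writes lanes 0, 5 and 7 of a test vector to out[0..2]. */
void store_three_lanes_s16(int16_t *out)
{
    const int16_t src[8] = {0, 10, 20, 30, 40, 50, 60, 70};
    assert(out != 0);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    int16x8_t v = vld1q_s16(src);   /* load all eight lanes */
    /* The lane argument must be an integer constant expression in 0..7. */
    vst1q_lane_s16(&out[0], v, 0);
    vst1q_lane_s16(&out[1], v, 5);
    vst1q_lane_s16(&out[2], v, 7);
#else
    /* Scalar stand-in with the same observable result. */
    out[0] = src[0];
    out[1] = src[5];
    out[2] = src[7];
#endif
}
```

Each call writes exactly one 16-bit element; the remaining lanes of the vector are not stored.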

void vst1_lane_s32 (int32_t * ptr, int32x2_t val, const int lane)

Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.2S 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
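
For a single register (selem = 1) and no writeback, the store branch of the Operation pseudocode reduces to copying ebytes bytes of lane index to memory. The following is a rough plain-C model of that one path only. The names vreg128 and model_st1_lane are invented for this sketch, the register is modelled as a little-endian byte image, data accesses are assumed little-endian, and MTE, alignment and access-type handling are ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 128-bit SIMD&FP register as a little-endian byte image. */
typedef struct { uint8_t bytes[16]; } vreg128;

/* Models Mem[address, ebytes] = Elem[rval, index, esize], where esize = 8 * ebytes. */
void model_st1_lane(void *address, const vreg128 *rval, unsigned index, unsigned ebytes)
{
    assert(ebytes == 1 || ebytes == 2 || ebytes == 4 || ebytes == 8);
    assert((index + 1) * ebytes <= sizeof rval->bytes);   /* lane must lie inside the register */
    memcpy(address, &rval->bytes[index * ebytes], ebytes);
}
```

With ebytes = 4 this corresponds to the ST1 {Vt.s}[lane],[Xn] form used by the 32-bit lane stores.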

void vst1q_lane_s32 (int32_t * ptr, int32x4_t val, const int lane)

Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.4S 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_lane_s64 (int64_t * ptr, int64x1_t val, const int lane)

Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.1D 

0 <= lane <= 0

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_lane_s64 (int64_t * ptr, int64x2_t val, const int lane)

Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.2D 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_lane_u8 (uint8_t * ptr, uint8x8_t val, const int lane)

Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.8B 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_lane_u8 (uint8_t * ptr, uint8x16_t val, const int lane)

Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.16B 

0 <= lane <= 15

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
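
Because only the selected element is written, a byte lane store can update one position in a buffer without touching its neighbours. The sketch below illustrates this with vst1q_lane_u8 and the highest valid lane, 15. The helper name poke_lane15_u8 and the data are assumptions for the example, and the #else branch is a scalar stand-in for hosts without NEON.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: stores lane 15 of the vector {100, 101, ..., 115} at buf[1]. */
void poke_lane15_u8(uint8_t *buf)
{
    const uint8_t src[16] = {100, 101, 102, 103, 104, 105, 106, 107,
                             108, 109, 110, 111, 112, 113, 114, 115};
    assert(buf != 0);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint8x16_t v = vld1q_u8(src);
    vst1q_lane_u8(buf + 1, v, 15);   /* writes one byte; buf[0] and buf[2] are untouched */
#else
    buf[1] = src[15];                /* scalar stand-in with the same observable result */
#endif
}
```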

void vst1_lane_u16 (uint16_t * ptr, uint16x4_t val, const int lane)

Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.4H 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_lane_u16 (uint16_t * ptr, uint16x8_t val, const int lane)

Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.8H 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_lane_u32 (uint32_t * ptr, uint32x2_t val, const int lane)

Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.2S 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_lane_u32 (uint32_t * ptr, uint32x4_t val, const int lane)

Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.4S 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_lane_u64 (uint64_t * ptr, uint64x1_t val, const int lane)

Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.1D 

0 <= lane <= 0

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_lane_u64 (uint64_t * ptr, uint64x2_t val, const int lane)

Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.2D 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
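
The 64-bit element forms differ only in the lane range: a uint64x1_t has a single lane, so vst1_lane_u64 accepts only lane 0, while vst1q_lane_u64 accepts 0 or 1. The sketch below stores the upper element of a uint64x2_t both ways. The helper name store_high_u64 and the constants are assumptions for the example, and the #else branch is a scalar stand-in for hosts without NEON.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: writes the upper 64-bit element of a test vector to out[0] and out[1]. */
void store_high_u64(uint64_t *out)
{
    const uint64_t src[2] = {0x1111111111111111ULL, 0x2222222222222222ULL};
    assert(out != 0);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint64x2_t v = vld1q_u64(src);
    vst1_lane_u64(&out[0], vget_high_u64(v), 0);   /* 64-bit vector: only lane 0 exists */
    vst1q_lane_u64(&out[1], v, 1);                 /* 128-bit vector: lane 1 is the upper element */
#else
    out[0] = src[1];                               /* scalar stand-in with the same observable result */
    out[1] = src[1];
#endif
}
```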

void vst1_lane_p64 (poly64_t * ptr, poly64x1_t val, const int lane)

Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.1D 

0 <= lane <= 0

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

void vst1q_lane_p64 (poly64_t * ptr, poly64x2_t val, const int lane)

Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.2D 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

void vst1_lane_f16 (float16_t * ptr, float16x4_t val, const int lane)

Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.4H 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_lane_f16 (float16_t * ptr, float16x8_t val, const int lane)

Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.8H 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_lane_f32 (float32_t * ptr, float32x2_t val, const int lane)

Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.2S 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
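
One place a single-lane store such as vst1_lane_f32 may be useful is writing out a scalar result after a vector reduction. The sketch below sums four floats and stores the total from lane 0. The helper name sum4_store is an assumption for this example, the reduction sequence is just one possible choice, and the #else branch is a scalar stand-in for hosts without NEON that adds in the same order.

```c
#include <assert.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: writes src[0] + src[1] + src[2] + src[3] to *dst. */
void sum4_store(float *dst, const float *src)
{
    assert(dst != 0 && src != 0);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    float32x4_t v = vld1q_f32(src);
    /* Fold the upper half onto the lower half: {s0 + s2, s1 + s3}. */
    float32x2_t s = vadd_f32(vget_low_f32(v), vget_high_f32(v));
    /* Pairwise add leaves the full sum in both lanes. */
    s = vpadd_f32(s, s);
    vst1_lane_f32(dst, s, 0);   /* lane must be a constant, 0 or 1 */
#else
    /* Scalar stand-in using the same association order. */
    *dst = (src[0] + src[2]) + (src[1] + src[3]);
#endif
}
```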

void vst1q_lane_f32 (float32_t * ptr, float32x4_t val, const int lane)

Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.4S 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
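
As an illustration of the entry above, the following is a minimal sketch of storing one lane of a float32x4_t to memory. The helper name store_lane2_f32 is hypothetical, and the scalar branch is only a stand-in so that the sketch also builds on targets without NEON.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: copy lane 2 of a four-float vector to *dst. */
static void store_lane2_f32(float *dst, const float src[4])
{
    assert(dst != NULL && src != NULL);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    float32x4_t v = vld1q_f32(src);  /* load lanes 0..3 from src */
    vst1q_lane_f32(dst, v, 2);       /* lane is a constant, 0 <= lane <= 3 */
#else
    *dst = src[2];                   /* scalar model of the same store */
#endif
}
```

Only one float is written; the three other lanes of the vector never reach memory.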

void vst1_lane_p8 (poly8_t * ptr, poly8x8_t val, const int lane)Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.8B 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
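
Because lane is declared const int, it has to be a compile-time constant. One possible way to store a run-time number of leading bytes is therefore a switch with one lane store per case. The sketch below assumes a hypothetical helper store_first_n_p8; the typedef in the non-NEON branch is a stand-in, not part of arm_neon.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#else
typedef uint8_t poly8_t;  /* stand-in so the fallback builds without NEON */
#endif

/* Hypothetical helper: store only the first n (0..8) bytes of an 8-byte vector. */
static void store_first_n_p8(poly8_t *dst, const poly8_t src[8], size_t n)
{
    assert(dst != NULL && src != NULL && n <= 8);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    poly8x8_t v = vld1_p8(src);
    /* Each case names its own constant lane and falls into the next one. */
    switch (n) {
    case 8: vst1_lane_p8(dst + 7, v, 7); /* fall through */
    case 7: vst1_lane_p8(dst + 6, v, 6); /* fall through */
    case 6: vst1_lane_p8(dst + 5, v, 5); /* fall through */
    case 5: vst1_lane_p8(dst + 4, v, 4); /* fall through */
    case 4: vst1_lane_p8(dst + 3, v, 3); /* fall through */
    case 3: vst1_lane_p8(dst + 2, v, 2); /* fall through */
    case 2: vst1_lane_p8(dst + 1, v, 1); /* fall through */
    case 1: vst1_lane_p8(dst + 0, v, 0); /* fall through */
    default: break;
    }
#else
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i];             /* scalar model of the lane stores */
#endif
}
```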

void vst1q_lane_p8 (poly8_t * ptr, poly8x16_t val, const int lane)Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.16B 

0 <= lane <= 15

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_lane_p16 (poly16_t * ptr, poly16x4_t val, const int lane)Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.4H 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
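
For the no-offset form used by this intrinsic (a single register, no write-back), the MemOp_STORE branch of the Operation pseudocode above amounts to copying the ebytes bytes of one element to memory. The following scalar model is a sketch of that step only. It assumes little-endian data and represents the 128-bit rval as 16 bytes with byte 0 least significant; the function name st1_lane_model is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of one pass through the store branch.
 * Assumptions: little-endian data, reg[0] is the least significant byte. */
static size_t st1_lane_model(uint8_t *mem, const uint8_t reg[16],
                             unsigned index, unsigned esize)
{
    unsigned ebytes = esize / 8;  /* constant integer ebytes = esize DIV 8 */

    assert(mem != NULL && reg != NULL);
    assert(esize == 8 || esize == 16 || esize == 32 || esize == 64);
    assert(index < 16 / ebytes);  /* element must lie inside the 128-bit rval */

    /* Mem[address+offs, ebytes] = Elem[rval, index, esize], with offs == 0 */
    memcpy(mem, reg + (size_t)index * ebytes, ebytes);

    return ebytes;                /* value of offs after offs = offs + ebytes */
}
```

With esize = 16 and index = 3, the model copies register bytes 6 and 7, which corresponds to storing lane 3 of a poly16x4_t.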

void vst1q_lane_p16 (poly16_t * ptr, poly16x8_t val, const int lane)Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.8H 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_lane_f64 (float64_t * ptr, float64x1_t val, const int lane)Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.1D 

0 <= lane <= 0

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst1q_lane_f64 (float64_t * ptr, float64x2_t val, const int lane)Store a single-element structure from one lane of one register

Description

Store a single-element structure from one lane of one register. This instruction stores the specified element of a SIMD&FP register to memory.

A64 Instruction

ST1 {Vt.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val → Vt.2D 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
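
The float64 lane stores are listed as A64 only, so a portable caller may want to guard them. The sketch below assumes a hypothetical helper store_high_f64 and uses a plain scalar store on every other target.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define HAVE_A64_NEON 1
#else
#define HAVE_A64_NEON 0
#endif

/* Hypothetical helper: write the upper double of a two-double vector to *dst. */
static void store_high_f64(double *dst, const double src[2])
{
    assert(dst != NULL && src != NULL);
#if HAVE_A64_NEON
    float64x2_t v = vld1q_f64(src);  /* load lanes 0 and 1 */
    vst1q_lane_f64(dst, v, 1);       /* lane is a constant, 0 <= lane <= 1 */
#else
    *dst = src[1];                   /* scalar model for non-A64 targets */
#endif
}
```

For vst1_lane_f64 the vector has a single lane, so the only valid lane argument is 0.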

int8x8x2_t vld2_s8 (int8_t const * ptr)Load multiple 2-element structures to two registers

Description

Load multiple 2-element structures to two registers. This instruction loads multiple 2-element structures from memory and writes the result to the two SIMD&FP registers, with de-interleaving.

A64 Instruction

LD2 {Vt.8B - Vt2.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
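
As the LD2 {Vt.8B - Vt2.8B},[Xn] form and the result.val[0] / result.val[1] mapping indicate, vld2_s8 reads 16 consecutive bytes as eight 2-element structures and de-interleaves them into two vectors. The following is a minimal sketch of that behaviour; the helper name split_pairs_s8 is hypothetical, and the scalar branch models the same result on targets without NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: split 8 interleaved (a, b) byte pairs into two arrays. */
static void split_pairs_s8(const int8_t src[16], int8_t a[8], int8_t b[8])
{
    assert(src != NULL && a != NULL && b != NULL);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    int8x8x2_t p = vld2_s8(src);  /* val[0] = src[0,2,..,14], val[1] = src[1,3,..,15] */
    vst1_s8(a, p.val[0]);
    vst1_s8(b, p.val[1]);
#else
    for (size_t i = 0; i < 8; ++i) {
        a[i] = src[2 * i];        /* first member of structure i */
        b[i] = src[2 * i + 1];    /* second member of structure i */
    }
#endif
}
```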

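int8x16x2_t vld2q_s8 (int8_t const * ptr)Load multiple 2-element structures to two registers

Description

Load multiple 2-element structures to two registers. This instruction loads multiple 2-element structures from memory and writes the result to the two SIMD&FP registers, with de-interleaving.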

A64 Instruction

LD2 {Vt.16B - Vt2.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

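int16x4x2_t vld2_s16 (int16_t const * ptr)Load multiple 2-element structures to two registers

Description

Load multiple 2-element structures to two registers. This instruction loads multiple 2-element structures from memory and writes the result to the two SIMD&FP registers, with de-interleaving.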

A64 Instruction

LD2 {Vt.4H - Vt2.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

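int16x8x2_t vld2q_s16 (int16_t const * ptr)Load multiple 2-element structures to two registers

Description

Load multiple 2-element structures to two registers. This instruction loads multiple 2-element structures from memory and writes the result to the two SIMD&FP registers, with de-interleaving.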

A64 Instruction

LD2 {Vt.8H - Vt2.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
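
A typical use of vld2q_s16 is splitting interleaved 16-bit stereo samples into separate channels, eight frames per load. The sketch below assumes a hypothetical helper deinterleave_stereo_s16 and an L, R, L, R sample layout; its scalar loop handles the tail and doubles as the whole implementation on targets without NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: lr holds 2 * frames samples laid out L, R, L, R, ... */
static void deinterleave_stereo_s16(const int16_t *lr, int16_t *left,
                                    int16_t *right, size_t frames)
{
    size_t i = 0;

    assert(lr != NULL && left != NULL && right != NULL);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    for (; i + 8 <= frames; i += 8) {
        int16x8x2_t p = vld2q_s16(lr + 2 * i);  /* 8 frames, de-interleaved */
        vst1q_s16(left + i, p.val[0]);          /* first member of each frame */
        vst1q_s16(right + i, p.val[1]);         /* second member of each frame */
    }
#endif
    for (; i < frames; ++i) {                   /* tail, or all frames without NEON */
        left[i] = lr[2 * i];
        right[i] = lr[2 * i + 1];
    }
}
```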

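int32x2x2_t vld2_s32 (int32_t const * ptr)Load multiple 2-element structures to two registers

Description

Load multiple 2-element structures to two registers. This instruction loads multiple 2-element structures from memory and writes the result to the two SIMD&FP registers, with de-interleaving.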

A64 Instruction

LD2 {Vt.2S - Vt2.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

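int32x4x2_t vld2q_s32 (int32_t const * ptr)Load multiple 2-element structures to two registers

Description

Load multiple 2-element structures to two registers. This instruction loads multiple 2-element structures from memory and writes the result to the two SIMD&FP registers, with de-interleaving.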

A64 Instruction

LD2 {Vt.4S - Vt2.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

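uint8x8x2_t vld2_u8 (uint8_t const * ptr)Load multiple 2-element structures to two registers

Description

Load multiple 2-element structures to two registers. This instruction loads multiple 2-element structures from memory and writes the result to the two SIMD&FP registers, with de-interleaving.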

A64 Instruction

LD2 {Vt.8B - Vt2.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

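uint8x16x2_t vld2q_u8 (uint8_t const * ptr)Load multiple 2-element structures to two registers

Description

Load multiple 2-element structures to two registers. This instruction loads multiple 2-element structures from memory and writes the result to the two SIMD&FP registers, with de-interleaving.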

A64 Instruction

LD2 {Vt.16B - Vt2.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

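uint16x4x2_t vld2_u16 (uint16_t const * ptr)Load multiple 2-element structures to two registers

Description

Load multiple 2-element structures to two registers. This instruction loads multiple 2-element structures from memory and writes the result to the two SIMD&FP registers, with de-interleaving.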

A64 Instruction

LD2 {Vt.4H - Vt2.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

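uint16x8x2_t vld2q_u16 (uint16_t const * ptr)Load multiple 2-element structures to two registers

Description

Load multiple 2-element structures to two registers. This instruction loads multiple 2-element structures from memory and writes the result to the two SIMD&FP registers, with de-interleaving.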

A64 Instruction

LD2 {Vt.8H - Vt2.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

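uint32x2x2_t vld2_u32 (uint32_t const * ptr)Load multiple 2-element structures to two registers

Description

Load multiple 2-element structures to two registers. This instruction loads multiple 2-element structures from memory and writes the result to the two SIMD&FP registers, with de-interleaving.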

A64 Instruction

LD2 {Vt.2S - Vt2.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

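uint32x4x2_t vld2q_u32 (uint32_t const * ptr)Load multiple 2-element structures to two registers

Description

Load multiple 2-element structures to two registers. This instruction loads multiple 2-element structures from memory and writes the result to the two SIMD&FP registers, with de-interleaving.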

A64 Instruction

LD2 {Vt.4S - Vt2.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
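
Since vld2q_u32 returns the first and second structure members in separate vectors, per-member reductions need no shuffling. The following sketch sums the x and y members of an array of (x, y) pairs. The helper name sum_pairs_u32 is hypothetical, sums wrap modulo 2^32, and the scalar loop covers both the tail and targets without NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: xy holds 2 * pairs values laid out x, y, x, y, ... */
static void sum_pairs_u32(const uint32_t *xy, size_t pairs,
                          uint32_t *sum_x, uint32_t *sum_y)
{
    uint32_t sx = 0, sy = 0;
    size_t i = 0;

    assert(xy != NULL && sum_x != NULL && sum_y != NULL);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint32x4_t ax = vdupq_n_u32(0);
    uint32x4_t ay = vdupq_n_u32(0);
    for (; i + 4 <= pairs; i += 4) {
        uint32x4x2_t p = vld2q_u32(xy + 2 * i);  /* 4 pairs, de-interleaved */
        ax = vaddq_u32(ax, p.val[0]);            /* accumulate x members */
        ay = vaddq_u32(ay, p.val[1]);            /* accumulate y members */
    }
    /* Horizontal sum by lane extraction, usable on v7 as well as A64. */
    sx = vgetq_lane_u32(ax, 0) + vgetq_lane_u32(ax, 1)
       + vgetq_lane_u32(ax, 2) + vgetq_lane_u32(ax, 3);
    sy = vgetq_lane_u32(ay, 0) + vgetq_lane_u32(ay, 1)
       + vgetq_lane_u32(ay, 2) + vgetq_lane_u32(ay, 3);
#endif
    for (; i < pairs; ++i) {                     /* tail, or all pairs without NEON */
        sx += xy[2 * i];
        sy += xy[2 * i + 1];
    }
    *sum_x = sx;
    *sum_y = sy;
}
```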

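float16x4x2_t vld2_f16 (float16_t const * ptr)Load multiple 2-element structures to two registers

Description

Load multiple 2-element structures to two registers. This instruction loads multiple 2-element structures from memory and writes the result to the two SIMD&FP registers, with de-interleaving.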

A64 Instruction

LD2 {Vt.4H - Vt2.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

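float16x8x2_t vld2q_f16 (float16_t const * ptr)Load multiple 2-element structures to two registers

Description

Load multiple 2-element structures to two registers. This instruction loads multiple 2-element structures from memory and writes the result to the two SIMD&FP registers, with de-interleaving.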

A64 Instruction

LD2 {Vt.8H - Vt2.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float32x2x2_t vld2_f32 (float32_t const * ptr)Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.2S - Vt2.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
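
As a usage illustration for `vld2_f32`: the A64 form listed above, `LD2 {Vt.2S - Vt2.2S},[Xn]`, names two whole 64-bit registers, and as defined by ACLE the intrinsic reads four consecutive floats and de-interleaves them, so `val[0]` receives elements 0 and 2 and `val[1]` receives elements 1 and 3. The sketch below is a minimal example under that reading. `split_xy_pairs` is a hypothetical helper name, and the scalar branch is an assumed fallback so the snippet also builds on targets without NEON.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: split {x0, y0, x1, y1} into xs = {x0, x1} and ys = {y0, y1}. */
static void split_xy_pairs(const float *src, float xs[2], float ys[2])
{
    assert(src != NULL && xs != NULL && ys != NULL);
#if defined(__ARM_NEON)
    /* One de-interleaving load: val[0] holds the even-indexed floats,
       val[1] the odd-indexed ones. */
    float32x2x2_t v = vld2_f32(src);
    vst1_f32(xs, v.val[0]);
    vst1_f32(ys, v.val[1]);
#else
    /* Scalar fallback with the same observable result. */
    for (size_t i = 0; i < 2; ++i) {
        xs[i] = src[2 * i];
        ys[i] = src[2 * i + 1];
    }
#endif
}
```

The caller must make all four floats at `src` readable, since the load always consumes two complete pairs.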

float32x4x2_t vld2q_f32 (float32_t const * ptr)Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.4S - Vt2.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly8x8x2_t vld2_p8 (poly8_t const * ptr)Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.8B - Vt2.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly8x16x2_t vld2q_p8 (poly8_t const * ptr)Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.16B - Vt2.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly16x4x2_t vld2_p16 (poly16_t const * ptr)Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.4H - Vt2.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly16x8x2_t vld2q_p16 (poly16_t const * ptr)Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.8H - Vt2.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int64x1x2_t vld2_s64 (int64_t const * ptr)Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.1D - Vt2.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
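
The entry above maps `vld2_s64` to `LD1` rather than `LD2`. A plausible reading is that each result register holds a single 64-bit element, so de-interleaving one 2-element structure is the same as loading two consecutive values: `val[0]` gets `ptr[0]` and `val[1]` gets `ptr[1]`. The sketch below illustrates this. `load_pair_s64` is a hypothetical helper name, and the scalar branch is an assumed fallback for targets without NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: read two adjacent int64_t values through vld2_s64. */
static void load_pair_s64(const int64_t *src, int64_t *first, int64_t *second)
{
    assert(src != NULL && first != NULL && second != NULL);
#if defined(__ARM_NEON)
    int64x1x2_t v = vld2_s64(src);
    *first  = vget_lane_s64(v.val[0], 0);   /* src[0] */
    *second = vget_lane_s64(v.val[1], 0);   /* src[1] */
#else
    /* Scalar fallback with the same observable result. */
    *first  = src[0];
    *second = src[1];
#endif
}
```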

uint64x1x2_t vld2_u64 (uint64_t const * ptr)Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.1D - Vt2.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly64x1x2_t vld2_p64 (poly64_t const * ptr)Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.1D - Vt2.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

int64x2x2_t vld2q_s64 (int64_t const * ptr)Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.2D - Vt2.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

uint64x2x2_t vld2q_u64 (uint64_t const * ptr)Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.2D - Vt2.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

poly64x2x2_t vld2q_p64 (poly64_t const * ptr)Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.2D - Vt2.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

float64x1x2_t vld2_f64 (float64_t const * ptr)Load one single-element structure to one lane of one register

Description

A64 Instruction

LD1 {Vt.1D - Vt2.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

float64x2x2_t vld2q_f64 (float64_t const * ptr)Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.2D - Vt2.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
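
A common use of `vld2q_f64` is splitting interleaved complex numbers into separate real and imaginary vectors. The sketch below assumes the ACLE behaviour in which `val[0]` receives elements 0 and 2 and `val[1]` receives elements 1 and 3. Because this entry lists A64 only, the intrinsic path is guarded by `__aarch64__`. `split_complex_f64` and `HAVE_A64_NEON` are hypothetical names, and the scalar branch is an assumed fallback for other targets.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define HAVE_A64_NEON 1
#else
#define HAVE_A64_NEON 0
#endif

/* Hypothetical helper: split {re0, im0, re1, im1} into re[] and im[]. */
static void split_complex_f64(const double *src, double re[2], double im[2])
{
    assert(src != NULL && re != NULL && im != NULL);
#if HAVE_A64_NEON
    float64x2x2_t v = vld2q_f64(src);
    vst1q_f64(re, v.val[0]);   /* even-indexed doubles */
    vst1q_f64(im, v.val[1]);   /* odd-indexed doubles */
#else
    /* Scalar fallback with the same observable result. */
    for (size_t i = 0; i < 2; ++i) {
        re[i] = src[2 * i];
        im[i] = src[2 * i + 1];
    }
#endif
}
```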

int8x8x3_t vld3_s8 (int8_t const * ptr)Load single 3-element structure to one lane of three registers

Description

Load single 3-element structure to one lane of three registers. This instruction loads a 3-element structure from memory and writes the result to the corresponding elements of the three SIMD&FP registers without affecting the other bits of the registers.

A64 Instruction

LD3 {Vt.8B - Vt3.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
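
As a usage illustration for `vld3_s8`: the A64 form listed above, `LD3 {Vt.8B - Vt3.8B},[Xn]`, fills three whole 64-bit registers, and as defined by ACLE the intrinsic reads 24 consecutive bytes and de-interleaves them with a stride of three. The sketch below splits packed (x, y, z) triples into three planes under that reading. `split_xyz_s8` is a hypothetical helper name, and the scalar branch is an assumed fallback for targets without NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: split 8 packed (x, y, z) byte triples into three arrays. */
static void split_xyz_s8(const int8_t *src, int8_t x[8], int8_t y[8], int8_t z[8])
{
    assert(src != NULL && x != NULL && y != NULL && z != NULL);
#if defined(__ARM_NEON)
    int8x8x3_t v = vld3_s8(src);
    vst1_s8(x, v.val[0]);   /* src[0], src[3], src[6], ... */
    vst1_s8(y, v.val[1]);   /* src[1], src[4], src[7], ... */
    vst1_s8(z, v.val[2]);   /* src[2], src[5], src[8], ... */
#else
    /* Scalar fallback with the same observable result. */
    for (size_t i = 0; i < 8; ++i) {
        x[i] = src[3 * i];
        y[i] = src[3 * i + 1];
        z[i] = src[3 * i + 2];
    }
#endif
}
```

All 24 bytes at `src` must be readable, because the load always consumes eight complete triples.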

int8x16x3_t vld3q_s8 (int8_t const * ptr)Load single 3-element structure to one lane of three registers

Description

A64 Instruction

LD3 {Vt.16B - Vt3.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int16x4x3_t vld3_s16 (int16_t const * ptr)Load single 3-element structure to one lane of three registers

Description

A64 Instruction

LD3 {Vt.4H - Vt3.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int16x8x3_t vld3q_s16 (int16_t const * ptr)Load single 3-element structure to one lane of three registers

Description

A64 Instruction

LD3 {Vt.8H - Vt3.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int32x2x3_t vld3_s32 (int32_t const * ptr)Load single 3-element structure to one lane of three registers

Description

A64 Instruction

LD3 {Vt.2S - Vt3.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int32x4x3_t vld3q_s32 (int32_t const * ptr)Load single 3-element structure to one lane of three registers

Description

A64 Instruction

LD3 {Vt.4S - Vt3.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

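uint8x8x3_t vld3_u8 (uint8_t const * ptr)

Load multiple 3-element structures to three registers

Description

Load multiple 3-element structures to three registers. This instruction loads multiple 3-element structures from memory and writes the result to the three SIMD&FP registers, with de-interleaving.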

A64 Instruction

LD3 {Vt.8B - Vt3.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

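uint8x16x3_t vld3q_u8 (uint8_t const * ptr)

Load multiple 3-element structures to three registers

Description

Load multiple 3-element structures to three registers. This instruction loads multiple 3-element structures from memory and writes the result to the three SIMD&FP registers, with de-interleaving.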

A64 Instruction

LD3 {Vt.16B - Vt3.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]
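
A common use of this intrinsic is splitting packed RGB bytes into separate planes. The sketch below assumes a 48-byte input holding 16 RGB pixels; split_rgb16 is a hypothetical helper name, and the non-NEON branch is only a portable fallback that produces the same result.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Splits 16 packed RGB pixels (48 bytes) into three 16-byte planes.
   split_rgb16 is a hypothetical helper, not part of the reference. */
static void split_rgb16(const uint8_t *rgb, uint8_t *r, uint8_t *g, uint8_t *b)
{
    assert(rgb && r && g && b);
#if defined(__ARM_NEON)
    uint8x16x3_t px = vld3q_u8(rgb);   /* one de-interleaving load */
    vst1q_u8(r, px.val[0]);            /* bytes 0, 3, 6, ... */
    vst1q_u8(g, px.val[1]);            /* bytes 1, 4, 7, ... */
    vst1q_u8(b, px.val[2]);            /* bytes 2, 5, 8, ... */
#else
    /* Portable fallback with the same result, for non-NEON builds. */
    for (int i = 0; i < 16; ++i) {
        r[i] = rgb[3 * i + 0];
        g[i] = rgb[3 * i + 1];
        b[i] = rgb[3 * i + 2];
    }
#endif
}
```

The load always reads 48 bytes starting at ptr, so the caller is responsible for making sure that many bytes are readable.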

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

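uint16x4x3_t vld3_u16 (uint16_t const * ptr)

Load multiple 3-element structures to three registers

Description

Load multiple 3-element structures to three registers. This instruction loads multiple 3-element structures from memory and writes the result to the three SIMD&FP registers, with de-interleaving.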

A64 Instruction

LD3 {Vt.4H - Vt3.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

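uint16x8x3_t vld3q_u16 (uint16_t const * ptr)

Load multiple 3-element structures to three registers

Description

Load multiple 3-element structures to three registers. This instruction loads multiple 3-element structures from memory and writes the result to the three SIMD&FP registers, with de-interleaving.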

A64 Instruction

LD3 {Vt.8H - Vt3.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]
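
Because each call consumes exactly eight 3-element structures (24 uint16_t values), buffers whose length is not a multiple of eight need a scalar tail. The following is a minimal sketch of that pattern, assuming src holds 3*n values; channel_sums_u16 is a hypothetical helper name, and on builds without NEON the scalar loop handles everything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Sums each of the three interleaved channels of n structures.
   channel_sums_u16 is a hypothetical helper, not part of the reference. */
static void channel_sums_u16(const uint16_t *src, size_t n, uint32_t sums[3])
{
    size_t i = 0;
    assert(src && sums);
    sums[0] = sums[1] = sums[2] = 0;
#if defined(__ARM_NEON)
    /* Vector body: eight structures per iteration. */
    for (; i + 8 <= n; i += 8) {
        uint16x8x3_t v = vld3q_u16(src + 3 * i);
        for (int c = 0; c < 3; ++c) {
            /* Pairwise widening add: eight u16 lanes become four u32 lanes. */
            uint32x4_t w = vpaddlq_u16(v.val[c]);
            sums[c] += vgetq_lane_u32(w, 0) + vgetq_lane_u32(w, 1)
                     + vgetq_lane_u32(w, 2) + vgetq_lane_u32(w, 3);
        }
    }
#endif
    /* Scalar tail (or the whole buffer when NEON is unavailable). */
    for (; i < n; ++i)
        for (int c = 0; c < 3; ++c)
            sums[c] += src[3 * i + c];
}
```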

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

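uint32x2x3_t vld3_u32 (uint32_t const * ptr)

Load multiple 3-element structures to three registers

Description

Load multiple 3-element structures to three registers. This instruction loads multiple 3-element structures from memory and writes the result to the three SIMD&FP registers, with de-interleaving.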

A64 Instruction

LD3 {Vt.2S - Vt3.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

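uint32x4x3_t vld3q_u32 (uint32_t const * ptr)

Load multiple 3-element structures to three registers

Description

Load multiple 3-element structures to three registers. This instruction loads multiple 3-element structures from memory and writes the result to the three SIMD&FP registers, with de-interleaving.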

A64 Instruction

LD3 {Vt.4S - Vt3.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

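float16x4x3_t vld3_f16 (float16_t const * ptr)

Load multiple 3-element structures to three registers

Description

Load multiple 3-element structures to three registers. This instruction loads multiple 3-element structures from memory and writes the result to the three SIMD&FP registers, with de-interleaving.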

A64 Instruction

LD3 {Vt.4H - Vt3.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

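float16x8x3_t vld3q_f16 (float16_t const * ptr)

Load multiple 3-element structures to three registers

Description

Load multiple 3-element structures to three registers. This instruction loads multiple 3-element structures from memory and writes the result to the three SIMD&FP registers, with de-interleaving.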

A64 Instruction

LD3 {Vt.8H - Vt3.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

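float32x2x3_t vld3_f32 (float32_t const * ptr)

Load multiple 3-element structures to three registers

Description

Load multiple 3-element structures to three registers. This instruction loads multiple 3-element structures from memory and writes the result to the three SIMD&FP registers, with de-interleaving.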

A64 Instruction

LD3 {Vt.2S - Vt3.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

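float32x4x3_t vld3q_f32 (float32_t const * ptr)

Load multiple 3-element structures to three registers

Description

Load multiple 3-element structures to three registers. This instruction loads multiple 3-element structures from memory and writes the result to the three SIMD&FP registers, with de-interleaving.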

A64 Instruction

LD3 {Vt.4S - Vt3.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]
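
For array-of-structures data such as packed (x, y, z) points, the three result vectors hold all x, all y and all z values respectively, so per-point arithmetic needs no shuffles. The sketch below computes squared lengths for four points under the assumption that xyz holds 12 floats; squared_norms4 is a hypothetical helper name, and the non-NEON branch is a portable fallback.

```c
#include <assert.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Writes x*x + y*y + z*z for four packed (x, y, z) points.
   squared_norms4 is a hypothetical helper, not part of the reference. */
static void squared_norms4(const float *xyz, float *out)
{
    assert(xyz && out);
#if defined(__ARM_NEON)
    float32x4x3_t p = vld3q_f32(xyz);               /* val[0]=x, val[1]=y, val[2]=z */
    float32x4_t acc = vmulq_f32(p.val[0], p.val[0]);
    acc = vaddq_f32(acc, vmulq_f32(p.val[1], p.val[1]));
    acc = vaddq_f32(acc, vmulq_f32(p.val[2], p.val[2]));
    vst1q_f32(out, acc);
#else
    /* Portable fallback for non-NEON builds. */
    for (int i = 0; i < 4; ++i) {
        float x = xyz[3 * i], y = xyz[3 * i + 1], z = xyz[3 * i + 2];
        out[i] = x * x + y * y + z * z;
    }
#endif
}
```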

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

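poly8x8x3_t vld3_p8 (poly8_t const * ptr)

Load multiple 3-element structures to three registers

Description

Load multiple 3-element structures to three registers. This instruction loads multiple 3-element structures from memory and writes the result to the three SIMD&FP registers, with de-interleaving.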

A64 Instruction

LD3 {Vt.8B - Vt3.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

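poly8x16x3_t vld3q_p8 (poly8_t const * ptr)

Load multiple 3-element structures to three registers

Description

Load multiple 3-element structures to three registers. This instruction loads multiple 3-element structures from memory and writes the result to the three SIMD&FP registers, with de-interleaving.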

A64 Instruction

LD3 {Vt.16B - Vt3.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

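poly16x4x3_t vld3_p16 (poly16_t const * ptr)

Load multiple 3-element structures to three registers

Description

Load multiple 3-element structures to three registers. This instruction loads multiple 3-element structures from memory and writes the result to the three SIMD&FP registers, with de-interleaving.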

A64 Instruction

LD3 {Vt.4H - Vt3.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

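poly16x8x3_t vld3q_p16 (poly16_t const * ptr)

Load multiple 3-element structures to three registers

Description

Load multiple 3-element structures to three registers. This instruction loads multiple 3-element structures from memory and writes the result to the three SIMD&FP registers, with de-interleaving.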

A64 Instruction

LD3 {Vt.8H - Vt3.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

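int64x1x3_t vld3_s64 (int64_t const * ptr)

Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.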

A64 Instruction

LD1 {Vt.1D - Vt3.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]
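
This entry maps to LD1 rather than LD3 because each result vector has a single 64-bit lane: with one structure in total, the interleaved layout and three consecutive elements are the same thing. A scalar model of that degenerate case is sketched below; model_int64x1x3_t and model_vld3_s64 are hypothetical names standing in for int64x1x3_t and vld3_s64, and ptr is assumed to address at least three readable int64_t values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for int64x1x3_t: three vectors of one lane each. */
typedef struct {
    int64_t val[3][1];
} model_int64x1x3_t;

/* With a single lane per register there is nothing to de-interleave:
   val[reg] simply receives the reg-th consecutive element. */
static model_int64x1x3_t model_vld3_s64(const int64_t *ptr)
{
    model_int64x1x3_t r;
    assert(ptr != 0);
    for (int reg = 0; reg < 3; ++reg)
        r.val[reg][0] = ptr[reg];
    return r;
}
```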

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

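uint64x1x3_t vld3_u64 (uint64_t const * ptr)

Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.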

A64 Instruction

LD1 {Vt.1D - Vt3.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

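poly64x1x3_t vld3_p64 (poly64_t const * ptr)

Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.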

A64 Instruction

LD1 {Vt.1D - Vt3.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

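int64x2x3_t vld3q_s64 (int64_t const * ptr)

Load multiple 3-element structures to three registers

Description

Load multiple 3-element structures to three registers. This instruction loads multiple 3-element structures from memory and writes the result to the three SIMD&FP registers, with de-interleaving.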

A64 Instruction

LD3 {Vt.2D - Vt3.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

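uint64x2x3_t vld3q_u64 (uint64_t const * ptr)

Load multiple 3-element structures to three registers

Description

Load multiple 3-element structures to three registers. This instruction loads multiple 3-element structures from memory and writes the result to the three SIMD&FP registers, with de-interleaving.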

A64 Instruction

LD3 {Vt.2D - Vt3.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

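poly64x2x3_t vld3q_p64 (poly64_t const * ptr)

Load multiple 3-element structures to three registers

Description

Load multiple 3-element structures to three registers. This instruction loads multiple 3-element structures from memory and writes the result to the three SIMD&FP registers, with de-interleaving.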

A64 Instruction

LD3 {Vt.2D - Vt3.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

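float64x1x3_t vld3_f64 (float64_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.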

A64 Instruction

LD1 {Vt.1D - Vt3.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
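
The entry above lists LD1 rather than LD3 because each destination register holds a single 64-bit lane, so de-interleaving reduces to three consecutive loads. The scalar C sketch below illustrates that under stated assumptions: model_f64x1x3 and model_vld3_f64 are hypothetical names for this sketch, not arm_neon.h declarations, and float64_t is modeled as a plain double.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for float64x1x3_t: three one-lane "registers". */
typedef struct {
    double val[3][1];
} model_f64x1x3;

/* Scalar model: with one lane per register, element k of the single
   3-element structure simply lands in register k. */
model_f64x1x3 model_vld3_f64(const double *ptr)
{
    model_f64x1x3 r;
    assert(ptr != NULL);
    for (size_t reg = 0; reg < 3; ++reg) {
        r.val[reg][0] = ptr[reg];
    }
    return r;
}
```

Compare this with the two-lane vld3q_f64 case, where consecutive memory elements alternate between registers.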

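float64x2x3_t vld3q_f64 (float64_t const * ptr) Load multiple 3-element structures to three registers

Description

Load multiple 3-element structures to three registers. This instruction loads multiple 3-element structures from memory and writes the result to the three SIMD&FP registers, with de-interleaving.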

A64 Instruction

LD3 {Vt.2D - Vt3.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

int8x8x4_t vld4_s8 (int8_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.

A64 Instruction

LD4 {Vt.8B - Vt4.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.8B → result.val[3]
Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
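
To make the result layout of the LD4 {Vt.8B - Vt4.8B} form concrete, the following scalar C sketch models a 4-way de-interleaving load of eight packed 4-byte records. The names model_s8x8x4 and model_vld4_s8 are hypothetical and exist only for this sketch; the real intrinsic returns int8x8x4_t and is declared in arm_neon.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for int8x8x4_t: four "registers" of eight lanes each. */
typedef struct {
    int8_t val[4][8];
} model_s8x8x4;

/* Scalar model of a de-interleaving 4-element structure load.
   Memory holds eight records {a, b, c, d}; field k of record i
   ends up in lane i of val[k]. Reads 32 bytes starting at ptr. */
model_s8x8x4 model_vld4_s8(const int8_t *ptr)
{
    model_s8x8x4 r;
    assert(ptr != NULL);
    for (size_t lane = 0; lane < 8; ++lane) {
        for (size_t reg = 0; reg < 4; ++reg) {
            r.val[reg][lane] = ptr[lane * 4 + reg];
        }
    }
    return r;
}
```

A typical use of this layout is splitting packed four-channel data (for example four 8-bit channels per pixel) into one register per channel.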

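int8x16x4_t vld4q_s8 (int8_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.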

A64 Instruction

LD4 {Vt.16B - Vt4.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.16B → result.val[3]
Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

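int16x4x4_t vld4_s16 (int16_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.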

A64 Instruction

LD4 {Vt.4H - Vt4.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.4H → result.val[3]
Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

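int16x8x4_t vld4q_s16 (int16_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.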

A64 Instruction

LD4 {Vt.8H - Vt4.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.8H → result.val[3]
Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

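int32x2x4_t vld4_s32 (int32_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.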

A64 Instruction

LD4 {Vt.2S - Vt4.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.2S → result.val[3]
Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

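int32x4x4_t vld4q_s32 (int32_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.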

A64 Instruction

LD4 {Vt.4S - Vt4.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.4S → result.val[3]
Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

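uint8x8x4_t vld4_u8 (uint8_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.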

A64 Instruction

LD4 {Vt.8B - Vt4.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.8B → result.val[3]
Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

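uint8x16x4_t vld4q_u8 (uint8_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.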

A64 Instruction

LD4 {Vt.16B - Vt4.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.16B → result.val[3]
Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

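uint16x4x4_t vld4_u16 (uint16_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.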

A64 Instruction

LD4 {Vt.4H - Vt4.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.4H → result.val[3]
Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

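uint16x8x4_t vld4q_u16 (uint16_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.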

A64 Instruction

LD4 {Vt.8H - Vt4.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.8H → result.val[3]
Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

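uint32x2x4_t vld4_u32 (uint32_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.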

A64 Instruction

LD4 {Vt.2S - Vt4.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.2S → result.val[3]
Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

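uint32x4x4_t vld4q_u32 (uint32_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.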

A64 Instruction

LD4 {Vt.4S - Vt4.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.4S → result.val[3]
Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

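float16x4x4_t vld4_f16 (float16_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.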

A64 Instruction

LD4 {Vt.4H - Vt4.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.4H → result.val[3]
Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

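float16x8x4_t vld4q_f16 (float16_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.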

A64 Instruction

LD4 {Vt.8H - Vt4.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.8H → result.val[3]
Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

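float32x2x4_t vld4_f32 (float32_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.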

A64 Instruction

LD4 {Vt.2S - Vt4.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.2S → result.val[3]
Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

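float32x4x4_t vld4q_f32 (float32_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.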

A64 Instruction

LD4 {Vt.4S - Vt4.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.4S → result.val[3]
Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
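
A possible use of vld4q_f32 is sketched below: summing each field of four packed {x, y, z, w} records. The function name sum_fields_xyzw is hypothetical. The NEON path is compiled only when the compiler defines __ARM_NEON; otherwise a scalar fallback computes the same sums, which is an assumption added here so the sketch builds on other targets.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Sum each of the four fields over four packed {x, y, z, w} records.
   ptr must address 16 floats; out[k] receives the sum of field k. */
void sum_fields_xyzw(const float *ptr, float out[4])
{
    assert(ptr != NULL && out != NULL);
#if defined(__ARM_NEON)
    /* De-interleaving load: v.val[k] holds field k of all four records. */
    float32x4x4_t v = vld4q_f32(ptr);
    for (int k = 0; k < 4; ++k) {
        float lanes[4];
        vst1q_f32(lanes, v.val[k]);   /* spill one register to inspect lanes */
        out[k] = lanes[0] + lanes[1] + lanes[2] + lanes[3];
    }
#else
    /* Scalar fallback with the same summation order. */
    for (int k = 0; k < 4; ++k) {
        out[k] = ptr[0 * 4 + k] + ptr[1 * 4 + k]
               + ptr[2 * 4 + k] + ptr[3 * 4 + k];
    }
#endif
}
```

Spilling each register with vst1q_f32 keeps the sketch simple; production code would more likely keep the data in registers and use a horizontal add.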

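poly8x8x4_t vld4_p8 (poly8_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.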

A64 Instruction

LD4 {Vt.8B - Vt4.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.8B → result.val[3]
Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
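
As an illustration of the de-interleaving performed by LD4 with the register list {Vt.8B - Vt4.8B}, the following minimal sketch splits eight interleaved 4-byte records into four 8-byte planes. The helper name deinterleave4_p8, the stand-in poly8_t typedef, and the scalar fallback are assumptions added so the sketch also builds on non-Arm hosts; they are not part of the reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#else
typedef uint8_t poly8_t; /* stand-in type so the sketch builds off-target */
#endif

/* Hypothetical helper: split 8 interleaved 4-byte records (32 bytes at src)
 * into four 8-byte planes, so that plane[k][i] == src[4 * i + k]. */
void deinterleave4_p8(const poly8_t *src, poly8_t plane[4][8])
{
    assert(src != NULL);
#if defined(__ARM_NEON)
    poly8x8x4_t v = vld4_p8(src);      /* one de-interleaving load */
    for (int k = 0; k < 4; ++k)
        vst1_p8(plane[k], v.val[k]);   /* val[k] holds field k of every record */
#else
    for (int i = 0; i < 8; ++i)        /* scalar model of the same layout */
        for (int k = 0; k < 4; ++k)
            plane[k][i] = src[4 * i + k];
#endif
}
```

With src holding the bytes 0 to 31, plane[0] receives 0, 4, 8, ... and plane[3] receives 3, 7, 11, ..., which is the layout typically wanted when splitting packed 4-channel data.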

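poly8x16x4_t vld4q_p8 (poly8_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.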

A64 Instruction

LD4 {Vt.16B - Vt4.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.16B → result.val[3]
Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

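poly16x4x4_t vld4_p16 (poly16_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.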

A64 Instruction

LD4 {Vt.4H - Vt4.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.4H → result.val[3]
Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

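poly16x8x4_t vld4q_p16 (poly16_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.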

A64 Instruction

LD4 {Vt.8H - Vt4.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.8H → result.val[3]
Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

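int64x1x4_t vld4_s64 (int64_t const * ptr) Load multiple single-element structures to four registers

Description

Load multiple single-element structures to four registers. This instruction loads multiple single-element structures from memory and writes the result to four SIMD&FP registers. Each 1D register holds one element, so consecutive elements at ptr land in val[0] through val[3] with no de-interleaving.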

A64 Instruction

LD1 {Vt.1D - Vt4.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.1D → result.val[3]
Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
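
Because each 64-bit vector here holds a single lane, vld4_s64 behaves like four consecutive element loads, which is why it maps to LD1 rather than LD4. The sketch below shows that correspondence. The helper name load4_s64 and the scalar fallback for non-Arm hosts are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: read four consecutive int64_t values from src.
 * On Arm targets this uses vld4_s64; out[k] is lane 0 of result.val[k]. */
void load4_s64(const int64_t *src, int64_t out[4])
{
    assert(src != NULL);
#if defined(__ARM_NEON)
    int64x1x4_t v = vld4_s64(src);
    for (int k = 0; k < 4; ++k)
        out[k] = vget_lane_s64(v.val[k], 0);  /* each vector has one lane */
#else
    for (int k = 0; k < 4; ++k)               /* scalar model: same order */
        out[k] = src[k];
#endif
}
```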

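uint64x1x4_t vld4_u64 (uint64_t const * ptr) Load multiple single-element structures to four registers

Description

Load multiple single-element structures to four registers. This instruction loads multiple single-element structures from memory and writes the result to four SIMD&FP registers. Each 1D register holds one element, so consecutive elements at ptr land in val[0] through val[3] with no de-interleaving.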

A64 Instruction

LD1 {Vt.1D - Vt4.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.1D → result.val[3]
Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

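poly64x1x4_t vld4_p64 (poly64_t const * ptr) Load multiple single-element structures to four registers

Description

Load multiple single-element structures to four registers. This instruction loads multiple single-element structures from memory and writes the result to four SIMD&FP registers. Each 1D register holds one element, so consecutive elements at ptr land in val[0] through val[3] with no de-interleaving.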

A64 Instruction

LD1 {Vt.1D - Vt4.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.1D → result.val[3]
Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

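int64x2x4_t vld4q_s64 (int64_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.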

A64 Instruction

LD4 {Vt.2D - Vt4.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.2D → result.val[3]
Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

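uint64x2x4_t vld4q_u64 (uint64_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.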

A64 Instruction

LD4 {Vt.2D - Vt4.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.2D → result.val[3]
Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

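poly64x2x4_t vld4q_p64 (poly64_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.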

A64 Instruction

LD4 {Vt.2D - Vt4.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.2D → result.val[3]
Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

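float64x1x4_t vld4_f64 (float64_t const * ptr) Load multiple single-element structures to four registers

Description

Load multiple single-element structures to four registers. This instruction loads multiple single-element structures from memory and writes the result to four SIMD&FP registers. Each 1D register holds one element, so consecutive elements at ptr land in val[0] through val[3] with no de-interleaving.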

A64 Instruction

LD1 {Vt.1D - Vt4.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.1D → result.val[3]
Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

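float64x2x4_t vld4q_f64 (float64_t const * ptr) Load multiple 4-element structures to four registers

Description

Load multiple 4-element structures to four registers. This instruction loads multiple 4-element structures from memory and writes the result to the four SIMD&FP registers, with de-interleaving.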

A64 Instruction

LD4 {Vt.2D - Vt4.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.2D → result.val[3]
Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
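
A typical use of vld4q_f64 is converting two array-of-structures records with four double fields each into one vector per field. The following sketch assumes a hypothetical helper name, split_xyzw_f64, and a hypothetical HAVE_A64_NEON macro. Since the intrinsic is A64-only, every other target takes a scalar fallback that models the same layout.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define HAVE_A64_NEON 1   /* hypothetical feature macro for this sketch */
#else
#define HAVE_A64_NEON 0
#endif

/* Hypothetical helper: src holds two records {x, y, z, w} of doubles.
 * After the call, out[k][i] == src[4 * i + k] (field k of record i). */
void split_xyzw_f64(const double *src, double out[4][2])
{
    assert(src != NULL);
#if HAVE_A64_NEON
    float64x2x4_t v = vld4q_f64(src);   /* de-interleaving load, 64 bytes */
    for (int k = 0; k < 4; ++k)
        vst1q_f64(out[k], v.val[k]);
#else
    for (int i = 0; i < 2; ++i)         /* scalar model of the same layout */
        for (int k = 0; k < 4; ++k)
            out[k][i] = src[4 * i + k];
#endif
}
```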

int8x8x2_t vld2_dup_s8 (int8_t const * ptr) Load single 2-element structure and replicate to all lanes of two registers

Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.

A64 Instruction

LD2R {Vt.8B - Vt2.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
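
The replicate behaviour described above can be checked with a short sketch: the two bytes at ptr are read once, and each fills every lane of its own vector. The helper name broadcast_pair_s8 and the scalar fallback for non-Arm hosts are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: broadcast src[0] into all 8 lanes of `first`
 * and src[1] into all 8 lanes of `second`. */
void broadcast_pair_s8(const int8_t *src, int8_t first[8], int8_t second[8])
{
    assert(src != NULL);
#if defined(__ARM_NEON)
    int8x8x2_t v = vld2_dup_s8(src);   /* reads exactly two elements */
    vst1_s8(first, v.val[0]);
    vst1_s8(second, v.val[1]);
#else
    for (int i = 0; i < 8; ++i) {      /* scalar model of the replication */
        first[i] = src[0];
        second[i] = src[1];
    }
#endif
}
```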

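int8x16x2_t vld2q_dup_s8 (int8_t const * ptr) Load single 2-element structure and replicate to all lanes of two registers

Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.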

A64 Instruction

LD2R {Vt.16B - Vt2.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
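
The replicate branch of the Operation pseudocode above can be modelled in plain C. This is a rough scalar sketch, not Arm's reference model. The function name model_load_replicate and the byte-array representation of registers are assumptions, and register contents are kept in memory byte order so that endianness does not enter the picture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { REG_BYTES_MAX = 16 };  /* one 128-bit SIMD&FP register */

/* Scalar model of the replicate branch. Parameter names mirror the
 * pseudocode: ebytes is the element size in bytes, selem the number of
 * registers written, datasize_bytes is 8 or 16 (datasize in bytes). */
void model_load_replicate(const uint8_t *mem, size_t ebytes, size_t selem,
                          size_t datasize_bytes,
                          uint8_t regs[][REG_BYTES_MAX])
{
    assert(mem != NULL);
    assert(ebytes > 0 && datasize_bytes <= REG_BYTES_MAX);
    size_t offs = 0;
    for (size_t s = 0; s < selem; ++s) {
        /* a 64-bit register write leaves the upper half zero */
        memset(regs[s], 0, REG_BYTES_MAX);
        size_t lanes = datasize_bytes / ebytes;  /* datasize DIV esize */
        for (size_t lane = 0; lane < lanes; ++lane)
            memcpy(&regs[s][lane * ebytes], mem + offs, ebytes);
        offs += ebytes;                          /* next structure member */
    }
}
```

Calling it with ebytes = 1, selem = 2 and datasize_bytes = 16 corresponds to the LD2R {Vt.16B - Vt2.16B} form used by vld2q_dup_s8.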

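int16x4x2_t vld2_dup_s16 (int16_t const * ptr) Load single 2-element structure and replicate to all lanes of two registers

Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.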

A64 Instruction

LD2R {Vt.4H - Vt2.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

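int16x8x2_t vld2q_dup_s16 (int16_t const * ptr) Load single 2-element structure and replicate to all lanes of two registers

Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.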

A64 Instruction

LD2R {Vt.8H - Vt2.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

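int32x2x2_t vld2_dup_s32 (int32_t const * ptr) Load single 2-element structure and replicate to all lanes of two registers

Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.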

A64 Instruction

LD2R {Vt.2S - Vt2.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

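int32x4x2_t vld2q_dup_s32 (int32_t const * ptr) Load single 2-element structure and replicate to all lanes of two registers

Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.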

A64 Instruction

LD2R {Vt.4S - Vt2.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
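
One plausible use of vld2q_dup_s32 is loading a {scale, offset} coefficient pair once and applying it across four lanes. The sketch below assumes a hypothetical helper name, scale_offset_s32, and inputs small enough that no product or sum overflows int32_t. The scalar fallback exists only so the sketch builds on non-Arm hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: y[i] = coeff[1] + x[i] * coeff[0] for i in 0..3.
 * Assumes the arithmetic stays within int32_t range. */
void scale_offset_s32(const int32_t coeff[2], const int32_t x[4], int32_t y[4])
{
    assert(coeff != NULL && x != NULL && y != NULL);
#if defined(__ARM_NEON)
    int32x4x2_t c = vld2q_dup_s32(coeff);      /* val[0] = scale, val[1] = offset */
    int32x4_t r = vmlaq_s32(c.val[1], vld1q_s32(x), c.val[0]); /* offset + x * scale */
    vst1q_s32(y, r);
#else
    for (int i = 0; i < 4; ++i)                /* scalar model */
        y[i] = coeff[1] + x[i] * coeff[0];
#endif
}
```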

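uint8x8x2_t vld2_dup_u8 (uint8_t const * ptr) Load single 2-element structure and replicate to all lanes of two registers

Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.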

A64 Instruction

LD2R {Vt.8B - Vt2.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
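
As a further illustration, vld2_dup_u8 can broadcast a {low, high} bound pair that is then used to clamp eight bytes at once. The helper name clamp_u8x8 is hypothetical, and the scalar fallback is included only so the sketch builds on non-Arm hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: clamp each of 8 bytes to [bounds[0], bounds[1]]. */
void clamp_u8x8(const uint8_t bounds[2], const uint8_t in[8], uint8_t out[8])
{
    assert(bounds != NULL && in != NULL && out != NULL);
#if defined(__ARM_NEON)
    uint8x8x2_t b = vld2_dup_u8(bounds);  /* val[0] = low, val[1] = high */
    uint8x8_t v = vld1_u8(in);
    v = vmax_u8(v, b.val[0]);             /* raise values below low */
    v = vmin_u8(v, b.val[1]);             /* lower values above high */
    vst1_u8(out, v);
#else
    for (int i = 0; i < 8; ++i) {         /* scalar model */
        uint8_t v = in[i];
        if (v < bounds[0]) v = bounds[0];
        if (v > bounds[1]) v = bounds[1];
        out[i] = v;
    }
#endif
}
```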

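uint8x16x2_t vld2q_dup_u8 (uint8_t const * ptr) Load single 2-element structure and replicate to all lanes of two registers

Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.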

A64 Instruction

LD2R {Vt.16B - Vt2.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

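uint16x4x2_t vld2_dup_u16 (uint16_t const * ptr) Load single 2-element structure and replicate to all lanes of two registers

Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.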

A64 Instruction

LD2R {Vt.4H - Vt2.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint16x8x2_t vld2q_dup_u16 (uint16_t const * ptr)Load single 2-element structure and replicate to all lanes of two registers

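Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.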

A64 Instruction

LD2R {Vt.8H - Vt2.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint32x2x2_t vld2_dup_u32 (uint32_t const * ptr)Load single 2-element structure and replicate to all lanes of two registers

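Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.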

A64 Instruction

LD2R {Vt.2S - Vt2.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint32x4x2_t vld2q_dup_u32 (uint32_t const * ptr)Load single 2-element structure and replicate to all lanes of two registers

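Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.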

A64 Instruction

LD2R {Vt.4S - Vt2.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float16x4x2_t vld2_dup_f16 (float16_t const * ptr)Load single 2-element structure and replicate to all lanes of two registers

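Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.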

A64 Instruction

LD2R {Vt.4H - Vt2.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float16x8x2_t vld2q_dup_f16 (float16_t const * ptr)Load single 2-element structure and replicate to all lanes of two registers

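Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.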

A64 Instruction

LD2R {Vt.8H - Vt2.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float32x2x2_t vld2_dup_f32 (float32_t const * ptr)Load single 2-element structure and replicate to all lanes of two registers

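Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.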

A64 Instruction

LD2R {Vt.2S - Vt2.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float32x4x2_t vld2q_dup_f32 (float32_t const * ptr)Load single 2-element structure and replicate to all lanes of two registers

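Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.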

A64 Instruction

LD2R {Vt.4S - Vt2.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
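
A typical use of this intrinsic is broadcasting a pair of coefficients stored next to each other in memory. The sketch below is illustrative only: `scale_bias4` is a hypothetical helper, and the scalar fallback is an assumption added so the code also builds on targets without NEON. On NEON targets it uses `vld2q_dup_f32`, `vld1q_f32`, `vmlaq_f32` (which computes a + b * c per lane) and `vst1q_f32` from `arm_neon.h`.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i] = in[i] * pair[0] + pair[1] for four floats.
   pair[0] is treated as a scale and pair[1] as a bias. */
static void scale_bias4(const float pair[2], const float in[4], float out[4])
{
    assert(pair != NULL && in != NULL && out != NULL);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* val[0] = {scale x4}, val[1] = {bias x4} */
    float32x4x2_t sb = vld2q_dup_f32(pair);
    float32x4_t x = vld1q_f32(in);
    /* bias + x * scale, lane by lane */
    vst1q_f32(out, vmlaq_f32(sb.val[1], x, sb.val[0]));
#else
    /* Portable fallback (an assumption for non-NEON builds). */
    for (size_t i = 0; i < 4; ++i)
        out[i] = in[i] * pair[0] + pair[1];
#endif
}
```

Loading both coefficients with one replicating load avoids two separate scalar loads followed by two broadcasts; whether that matters in practice depends on the compiler and the core.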

poly8x8x2_t vld2_dup_p8 (poly8_t const * ptr)Load single 2-element structure and replicate to all lanes of two registers

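Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.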

A64 Instruction

LD2R {Vt.8B - Vt2.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly8x16x2_t vld2q_dup_p8 (poly8_t const * ptr)Load single 2-element structure and replicate to all lanes of two registers

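Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.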

A64 Instruction

LD2R {Vt.16B - Vt2.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly16x4x2_t vld2_dup_p16 (poly16_t const * ptr)Load single 2-element structure and replicate to all lanes of two registers

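Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.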

A64 Instruction

LD2R {Vt.4H - Vt2.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly16x8x2_t vld2q_dup_p16 (poly16_t const * ptr)Load single 2-element structure and replicate to all lanes of two registers

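Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.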

A64 Instruction

LD2R {Vt.8H - Vt2.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int64x1x2_t vld2_dup_s64 (int64_t const * ptr)Load single 2-element structure and replicate to all lanes of two registers

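Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.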

A64 Instruction

LD2R {Vt.1D - Vt2.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
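
For the .1D arrangement each register holds a single 64-bit lane, so `datasize DIV esize` is 1 and the replicate step degenerates to loading two consecutive elements. The following scalar model illustrates that reading of the pseudocode; `model_s64x1x2` and `model_vld2_dup_s64` are hypothetical names, not part of `arm_neon.h`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for int64x1x2_t: two vectors of one 64-bit lane. */
typedef struct {
    int64_t val[2][1];
} model_s64x1x2;

/* Scalar model of the replicate branch with selem = 2, esize = 64,
   datasize = 64. Illustration only. */
static model_s64x1x2 model_vld2_dup_s64(const int64_t *ptr)
{
    model_s64x1x2 r;
    const size_t lanes = 64 / 64;   /* datasize DIV esize = 1 */
    assert(ptr != NULL);
    for (size_t s = 0; s < 2; ++s)
        for (size_t lane = 0; lane < lanes; ++lane)
            r.val[s][lane] = ptr[s];
    return r;
}
```

With only one lane per register there is nothing to duplicate, so in this model the result is simply `ptr[0]` in `val[0]` and `ptr[1]` in `val[1]`.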

uint64x1x2_t vld2_dup_u64 (uint64_t const * ptr)Load single 2-element structure and replicate to all lanes of two registers

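Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.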

A64 Instruction

LD2R {Vt.1D - Vt2.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly64x1x2_t vld2_dup_p64 (poly64_t const * ptr)Load single 2-element structure and replicate to all lanes of two registers

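Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.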

A64 Instruction

LD2R {Vt.1D - Vt2.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

int64x2x2_t vld2q_dup_s64 (int64_t const * ptr)Load single 2-element structure and replicate to all lanes of two registers

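Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.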

A64 Instruction

LD2R {Vt.2D - Vt2.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

uint64x2x2_t vld2q_dup_u64 (uint64_t const * ptr)Load single 2-element structure and replicate to all lanes of two registers

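Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.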

A64 Instruction

LD2R {Vt.2D - Vt2.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

poly64x2x2_t vld2q_dup_p64 (poly64_t const * ptr)Load single 2-element structure and replicate to all lanes of two registers

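Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.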

A64 Instruction

LD2R {Vt.2D - Vt2.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

float64x1x2_t vld2_dup_f64 (float64_t const * ptr)Load single 2-element structure and replicate to all lanes of two registers

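Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.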

A64 Instruction

LD2R {Vt.1D - Vt2.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

float64x2x2_t vld2q_dup_f64 (float64_t const * ptr)Load single 2-element structure and replicate to all lanes of two registers

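Description

Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.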

A64 Instruction

LD2R {Vt.2D - Vt2.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

int8x8x3_t vld3_dup_s8 (int8_t const * ptr)Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.

A64 Instruction

LD3R {Vt.8B - Vt3.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
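
As an illustration of the result layout, the sketch below pairs a scalar model of vld3_dup_s8 with an optional check against the real intrinsic. The type model_int8x8x3_t and the functions model_vld3_dup_s8 and model_matches_intrinsic are hypothetical names introduced here; only vld3_dup_s8 and vst1_s8 come from arm_neon.h, and they are used only when the compiler defines __ARM_NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical stand-in for int8x8x3_t: three vectors of eight int8_t lanes. */
typedef struct {
    int8_t val[3][8];
} model_int8x8x3_t;

/* Scalar model: ptr[0], ptr[1], ptr[2] each fill every lane of one vector. */
static model_int8x8x3_t model_vld3_dup_s8(const int8_t *ptr)
{
    model_int8x8x3_t r;
    assert(ptr != NULL);
    for (size_t s = 0; s < 3; ++s) {            /* selem = 3 */
        for (size_t lane = 0; lane < 8; ++lane) {
            r.val[s][lane] = ptr[s];            /* ebytes = 1 per member */
        }
    }
    return r;
}

/* Returns 1 if the model agrees with the intrinsic. On targets without
 * NEON there is nothing to compare against, so it returns 1 as well. */
static int model_matches_intrinsic(const int8_t *ptr)
{
#if defined(__ARM_NEON)
    model_int8x8x3_t m = model_vld3_dup_s8(ptr);
    int8x8x3_t v = vld3_dup_s8(ptr);
    int8_t lanes[8];
    for (int s = 0; s < 3; ++s) {
        vst1_s8(lanes, v.val[s]);               /* spill the vector to memory */
        for (int lane = 0; lane < 8; ++lane) {
            if (lanes[lane] != m.val[s][lane]) {
                return 0;
            }
        }
    }
#else
    (void)ptr;
#endif
    return 1;
}
```

Given a three-byte structure such as {10, -20, 30}, result.val[0] holds eight copies of 10, result.val[1] eight copies of -20, and result.val[2] eight copies of 30.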

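int8x16x3_t vld3q_dup_s8 (int8_t const * ptr)

Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.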

A64 Instruction

LD3R {Vt.16B - Vt3.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

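int16x4x3_t vld3_dup_s16 (int16_t const * ptr)

Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.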

A64 Instruction

LD3R {Vt.4H - Vt3.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

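int16x8x3_t vld3q_dup_s16 (int16_t const * ptr)

Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.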

A64 Instruction

LD3R {Vt.8H - Vt3.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

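int32x2x3_t vld3_dup_s32 (int32_t const * ptr)

Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.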

A64 Instruction

LD3R {Vt.2S - Vt3.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

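int32x4x3_t vld3q_dup_s32 (int32_t const * ptr)

Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.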

A64 Instruction

LD3R {Vt.4S - Vt3.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

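uint8x8x3_t vld3_dup_u8 (uint8_t const * ptr)

Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.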

A64 Instruction

LD3R {Vt.8B - Vt3.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

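uint8x16x3_t vld3q_dup_u8 (uint8_t const * ptr)

Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.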

A64 Instruction

LD3R {Vt.16B - Vt3.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

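uint16x4x3_t vld3_dup_u16 (uint16_t const * ptr)

Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.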

A64 Instruction

LD3R {Vt.4H - Vt3.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

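uint16x8x3_t vld3q_dup_u16 (uint16_t const * ptr)

Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.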

A64 Instruction

LD3R {Vt.8H - Vt3.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

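uint32x2x3_t vld3_dup_u32 (uint32_t const * ptr)

Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.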

A64 Instruction

LD3R {Vt.2S - Vt3.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

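uint32x4x3_t vld3q_dup_u32 (uint32_t const * ptr)

Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.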

A64 Instruction

LD3R {Vt.4S - Vt3.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

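float16x4x3_t vld3_dup_f16 (float16_t const * ptr)

Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.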

A64 Instruction

LD3R {Vt.4H - Vt3.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

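float16x8x3_t vld3q_dup_f16 (float16_t const * ptr)

Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.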

A64 Instruction

LD3R {Vt.8H - Vt3.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

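float32x2x3_t vld3_dup_f32 (float32_t const * ptr)

Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.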

A64 Instruction

LD3R {Vt.2S - Vt3.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

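float32x4x3_t vld3q_dup_f32 (float32_t const * ptr)

Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.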

A64 Instruction

LD3R {Vt.4S - Vt3.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
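
One possible use of the replicated form is broadcasting a small set of per-channel coefficients before a lane-wise multiply. The sketch below is illustrative only: scale_planes4 and MODEL_HAVE_LD3R_Q are hypothetical names, the planar layout is an assumption, and the intrinsic path is enabled only on AArch64 as a conservative choice, with a scalar loop that computes the same result elsewhere.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define MODEL_HAVE_LD3R_Q 1
#else
#define MODEL_HAVE_LD3R_Q 0
#endif

/* Hypothetical helper: multiply four samples of each of three planes
 * by gains[0], gains[1] and gains[2] respectively, in place. */
static void scale_planes4(const float gains[3],
                          float r[4], float g[4], float b[4])
{
    assert(gains != NULL && r != NULL && g != NULL && b != NULL);
#if MODEL_HAVE_LD3R_Q
    /* k.val[i] holds gains[i] in all four lanes. */
    float32x4x3_t k = vld3q_dup_f32(gains);
    vst1q_f32(r, vmulq_f32(vld1q_f32(r), k.val[0]));
    vst1q_f32(g, vmulq_f32(vld1q_f32(g), k.val[1]));
    vst1q_f32(b, vmulq_f32(vld1q_f32(b), k.val[2]));
#else
    /* Scalar fallback with the same lane-wise result. */
    for (size_t i = 0; i < 4; ++i) {
        r[i] *= gains[0];
        g[i] *= gains[1];
        b[i] *= gains[2];
    }
#endif
}
```

A single vld3q_dup_f32 call here takes the place of three separate scalar broadcasts, one per coefficient.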

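poly8x8x3_t vld3_dup_p8 (poly8_t const * ptr)

Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.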

A64 Instruction

LD3R {Vt.8B - Vt3.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

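poly8x16x3_t vld3q_dup_p8 (poly8_t const * ptr)

Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.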

A64 Instruction

LD3R {Vt.16B - Vt3.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

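poly16x4x3_t vld3_dup_p16 (poly16_t const * ptr)

Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.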

A64 Instruction

LD3R {Vt.4H - Vt3.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

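poly16x8x3_t vld3q_dup_p16 (poly16_t const * ptr)

Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.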

A64 Instruction

LD3R {Vt.8H - Vt3.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
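
As a rough illustration of the `replicate` branch in the Operation pseudocode above, the following is a minimal scalar sketch in C for the 8H case (selem = 3, esize = 16, datasize = 128). It is a model only, not the intrinsic: `model_u16x8x3` and `model_ld3r_u16x8` are hypothetical names introduced here, and `uint16_t` stands in for `poly16_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for poly16x8x3_t: three registers of eight 16-bit lanes. */
typedef struct {
    uint16_t val[3][8];
} model_u16x8x3;

/* Scalar model of the replicate branch with selem = 3, esize = 16, datasize = 128. */
static model_u16x8x3 model_ld3r_u16x8(const uint16_t *ptr)
{
    model_u16x8x3 r;
    assert(ptr != NULL);
    for (size_t s = 0; s < 3; ++s) {              /* for s = 0 to selem-1 */
        uint16_t element = ptr[s];                /* Mem[address+offs, ebytes, ...] */
        for (size_t lane = 0; lane < 8; ++lane) { /* datasize DIV esize = 8 copies */
            r.val[s][lane] = element;
        }
    }
    return r;
}
```

Each of the three source elements ends up in every lane of its own register, which mirrors the `result.val[0]` to `result.val[2]` mapping listed under Results.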

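int64x1x3_t vld3_dup_s64 (int64_t const * ptr)
Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.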

A64 Instruction

LD3R {Vt.1D - Vt3.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
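
With the 1D arrangement each destination register holds a single 64-bit lane (datasize DIV esize = 1), so replication leaves exactly one copy of each element per register. The sketch below assumes a compiler that defines `__ARM_NEON` and provides `<arm_neon.h>`; the helper name `load3_dup_s64` and the scalar fallback are illustrative additions, not part of this reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: loads ptr[0..2] and writes one value per register to out. */
static void load3_dup_s64(const int64_t *ptr, int64_t out[3])
{
    assert(ptr != NULL && out != NULL);
#if defined(__ARM_NEON)
    int64x1x3_t v = vld3_dup_s64(ptr);
    out[0] = vget_lane_s64(v.val[0], 0);   /* only lane 0 exists in a 1D vector */
    out[1] = vget_lane_s64(v.val[1], 0);
    out[2] = vget_lane_s64(v.val[2], 0);
#else
    /* Scalar fallback with the same observable result. */
    for (size_t s = 0; s < 3; ++s) {
        out[s] = ptr[s];
    }
#endif
}
```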

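uint64x1x3_t vld3_dup_u64 (uint64_t const * ptr)
Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.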

A64 Instruction

LD3R {Vt.1D - Vt3.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

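poly64x1x3_t vld3_dup_p64 (poly64_t const * ptr)
Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.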

A64 Instruction

LD3R {Vt.1D - Vt3.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

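int64x2x3_t vld3q_dup_s64 (int64_t const * ptr)
Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.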

A64 Instruction

LD3R {Vt.2D - Vt3.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
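
Because this intrinsic is listed for A64 only, portable code may want to guard its use. The following sketch assumes the compiler predefines `__aarch64__` on A64 targets (GCC and Clang do) and falls back to scalar code elsewhere; `splat3q_s64` is a hypothetical helper name, not part of this reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i][lane] receives ptr[i] for both lanes of register i. */
static void splat3q_s64(const int64_t *ptr, int64_t out[3][2])
{
    assert(ptr != NULL && out != NULL);
#if defined(__aarch64__)
    /* Per the entry above, this is expected to map to LD3R with the 2D arrangement. */
    int64x2x3_t v = vld3q_dup_s64(ptr);
    vst1q_s64(out[0], v.val[0]);
    vst1q_s64(out[1], v.val[1]);
    vst1q_s64(out[2], v.val[2]);
#else
    /* Scalar fallback for targets where the intrinsic is unavailable. */
    for (size_t i = 0; i < 3; ++i) {
        for (size_t lane = 0; lane < 2; ++lane) {
            out[i][lane] = ptr[i];
        }
    }
#endif
}
```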

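uint64x2x3_t vld3q_dup_u64 (uint64_t const * ptr)
Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.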

A64 Instruction

LD3R {Vt.2D - Vt3.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

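poly64x2x3_t vld3q_dup_p64 (poly64_t const * ptr)
Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.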

A64 Instruction

LD3R {Vt.2D - Vt3.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

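float64x1x3_t vld3_dup_f64 (float64_t const * ptr)
Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.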

A64 Instruction

LD3R {Vt.1D - Vt3.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

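float64x2x3_t vld3q_dup_f64 (float64_t const * ptr)
Load single 3-element structure and replicate to all lanes of three registers

Description

Load single 3-element structure and Replicate to all lanes of three registers. This instruction loads a 3-element structure from memory and replicates the structure to all the lanes of the three SIMD&FP registers.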

A64 Instruction

LD3R {Vt.2D - Vt3.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

int8x8x4_t vld4_dup_s8 (int8_t const * ptr)
Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and Replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.

A64 Instruction

LD4R {Vt.8B - Vt4.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.8B → result.val[3]
Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
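
One possible use is broadcasting a single 4-byte structure so that each of its four members fills its own 8-lane register. The sketch below assumes a compiler that defines `__ARM_NEON` and provides `<arm_neon.h>`; the helper name `splat4_s8` and the scalar fallback are illustrative additions, not part of this reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: out[i][lane] receives ptr[i] for all eight lanes of register i. */
static void splat4_s8(const int8_t *ptr, int8_t out[4][8])
{
    assert(ptr != NULL && out != NULL);
#if defined(__ARM_NEON)
    int8x8x4_t v = vld4_dup_s8(ptr);
    for (size_t i = 0; i < 4; ++i) {
        vst1_s8(out[i], v.val[i]);   /* store the eight replicated lanes */
    }
#else
    /* Scalar fallback: selem = 4 elements, datasize DIV esize = 8 copies each. */
    for (size_t i = 0; i < 4; ++i) {
        for (size_t lane = 0; lane < 8; ++lane) {
            out[i][lane] = ptr[i];
        }
    }
#endif
}
```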

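int8x16x4_t vld4q_dup_s8 (int8_t const * ptr)
Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and Replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.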

A64 Instruction

LD4R {Vt.16B - Vt4.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.16B → result.val[3]
Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

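int16x4x4_t vld4_dup_s16 (int16_t const * ptr)
Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and Replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.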

A64 Instruction

LD4R {Vt.4H - Vt4.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.4H → result.val[3]
Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

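int16x8x4_t vld4q_dup_s16 (int16_t const * ptr)
Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and Replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.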

A64 Instruction

LD4R {Vt.8H - Vt4.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.8H → result.val[3]
Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

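int32x2x4_t vld4_dup_s32 (int32_t const * ptr)
Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and Replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.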

A64 Instruction

LD4R {Vt.2S - Vt4.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.2S → result.val[3]
Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

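int32x4x4_t vld4q_dup_s32 (int32_t const * ptr)
Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and Replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.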

A64 Instruction

LD4R {Vt.4S - Vt4.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.4S → result.val[3]
Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

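uint8x8x4_t vld4_dup_u8 (uint8_t const * ptr)
Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and Replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.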

A64 Instruction

LD4R {Vt.8B - Vt4.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.8B → result.val[3]
Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

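uint8x16x4_t vld4q_dup_u8 (uint8_t const * ptr)
Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and Replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.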

A64 Instruction

LD4R {Vt.16B - Vt4.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.16B → result.val[3]
Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

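uint16x4x4_t vld4_dup_u16 (uint16_t const * ptr)
Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and Replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.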

A64 Instruction

LD4R {Vt.4H - Vt4.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.4H → result.val[3]
Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

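uint16x8x4_t vld4q_dup_u16 (uint16_t const * ptr)
Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and Replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.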

A64 Instruction

LD4R {Vt.8H - Vt4.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.8H → result.val[3]
Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

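uint32x2x4_t vld4_dup_u32 (uint32_t const * ptr)
Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and Replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.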

A64 Instruction

LD4R {Vt.2S - Vt4.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.2S → result.val[3]
Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
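The replicate branch of the pseudocode above reads `selem` consecutive elements and fills one whole register with each. The following is a minimal sketch of that behaviour for `vld4_dup_u32`, assuming a compiler that defines `__ARM_NEON` when `arm_neon.h` is usable. The helper name `broadcast4_u32` and the scalar fallback are illustrative assumptions, not part of the reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: broadcast the four words src[0..3] into four rows of
 * two lanes each, so that out[r][lane] == src[r] for every lane. */
void broadcast4_u32(const uint32_t *src, uint32_t out[4][2])
{
    assert(src != NULL);
#if defined(__ARM_NEON)
    /* val[r] holds src[r] in both 32-bit lanes (LD4R on A64). */
    uint32x2x4_t v = vld4_dup_u32(src);
    vst1_u32(out[0], v.val[0]);
    vst1_u32(out[1], v.val[1]);
    vst1_u32(out[2], v.val[2]);
    vst1_u32(out[3], v.val[3]);
#else
    /* Scalar model of the "replicate" branch of the pseudocode. */
    for (int r = 0; r < 4; ++r)              /* for s = 0 to selem-1 */
        for (int lane = 0; lane < 2; ++lane) /* Replicate(element, 2) */
            out[r][lane] = src[r];
#endif
}
```

Only four elements are read from `ptr`, regardless of how many lanes each result register has.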

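uint32x4x4_t vld4q_dup_u32 (uint32_t const * ptr)

Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.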

A64 Instruction

LD4R {Vt.4S - Vt4.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.4S → result.val[3]
Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

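float16x4x4_t vld4_dup_f16 (float16_t const * ptr)

Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.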

A64 Instruction

LD4R {Vt.4H - Vt4.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.4H → result.val[3]
Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

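float16x8x4_t vld4q_dup_f16 (float16_t const * ptr)

Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.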

A64 Instruction

LD4R {Vt.8H - Vt4.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.8H → result.val[3]
Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

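float32x2x4_t vld4_dup_f32 (float32_t const * ptr)

Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.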

A64 Instruction

LD4R {Vt.2S - Vt4.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.2S → result.val[3]
Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

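float32x4x4_t vld4q_dup_f32 (float32_t const * ptr)

Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.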

A64 Instruction

LD4R {Vt.4S - Vt4.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.4S → result.val[3]
Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
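One plausible use of `vld4q_dup_f32` is to broadcast four scalar coefficients so each can be combined with a full vector of data. The sketch below evaluates a cubic polynomial for four inputs at once using Horner's rule. The helper name `cubic4_f32`, the coefficient order, and the scalar fallback are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: y[i] = c[0] + c[1]*x[i] + c[2]*x[i]^2 + c[3]*x[i]^3
 * for i = 0..3. */
void cubic4_f32(const float *c, const float *x, float *y)
{
    assert(c != NULL && x != NULL && y != NULL);
#if defined(__ARM_NEON)
    /* k.val[j] holds c[j] replicated across all four lanes. */
    float32x4x4_t k = vld4q_dup_f32(c);
    float32x4_t xv = vld1q_f32(x);

    /* Horner's rule, one multiply and one add per coefficient. */
    float32x4_t acc = k.val[3];
    acc = vaddq_f32(vmulq_f32(acc, xv), k.val[2]);
    acc = vaddq_f32(vmulq_f32(acc, xv), k.val[1]);
    acc = vaddq_f32(vmulq_f32(acc, xv), k.val[0]);
    vst1q_f32(y, acc);
#else
    /* Scalar fallback with the same evaluation order. */
    for (int i = 0; i < 4; ++i)
        y[i] = ((c[3] * x[i] + c[2]) * x[i] + c[1]) * x[i] + c[0];
#endif
}
```

A single structure load replaces four separate scalar-broadcast loads here. Whether that is faster depends on the target core and is not something this sketch measures.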

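poly8x8x4_t vld4_dup_p8 (poly8_t const * ptr)

Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.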

A64 Instruction

LD4R {Vt.8B - Vt4.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.8B → result.val[3]
Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

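poly8x16x4_t vld4q_dup_p8 (poly8_t const * ptr)

Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.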

A64 Instruction

LD4R {Vt.16B - Vt4.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.16B → result.val[3]
Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

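poly16x4x4_t vld4_dup_p16 (poly16_t const * ptr)

Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.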

A64 Instruction

LD4R {Vt.4H - Vt4.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.4H → result.val[3]
Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

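poly16x8x4_t vld4q_dup_p16 (poly16_t const * ptr)

Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.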

A64 Instruction

LD4R {Vt.8H - Vt4.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.8H → result.val[3]
Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

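int64x1x4_t vld4_dup_s64 (int64_t const * ptr)

Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.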

A64 Instruction

LD4R {Vt.1D - Vt4.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.1D → result.val[3]
Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

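uint64x1x4_t vld4_dup_u64 (uint64_t const * ptr)

Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.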

A64 Instruction

LD4R {Vt.1D - Vt4.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.1D → result.val[3]
Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

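poly64x1x4_t vld4_dup_p64 (poly64_t const * ptr)

Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.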

A64 Instruction

LD4R {Vt.1D - Vt4.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.1D → result.val[3]
Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

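int64x2x4_t vld4q_dup_s64 (int64_t const * ptr)

Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.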

A64 Instruction

LD4R {Vt.2D - Vt4.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.2D → result.val[3]
Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
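Because this entry lists only A64 under "Supported architectures", portable code may want to guard the call on the target. A minimal sketch, assuming the usual `__aarch64__` and `__ARM_NEON` predefined macros. The helper name `broadcast4_s64` and the scalar fallback are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define HAVE_VLD4Q_DUP_S64 1
#else
#define HAVE_VLD4Q_DUP_S64 0
#endif

/* Hypothetical helper: out[r][lane] = src[r] for r = 0..3, lane = 0..1. */
void broadcast4_s64(const int64_t *src, int64_t out[4][2])
{
    assert(src != NULL);
#if HAVE_VLD4Q_DUP_S64
    /* A64 only: each 128-bit result register holds src[r] twice. */
    int64x2x4_t v = vld4q_dup_s64(src);
    vst1q_s64(out[0], v.val[0]);
    vst1q_s64(out[1], v.val[1]);
    vst1q_s64(out[2], v.val[2]);
    vst1q_s64(out[3], v.val[3]);
#else
    /* Fallback for targets where the intrinsic is not listed as supported. */
    for (int r = 0; r < 4; ++r) {
        out[r][0] = src[r];
        out[r][1] = src[r];
    }
#endif
}
```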

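uint64x2x4_t vld4q_dup_u64 (uint64_t const * ptr)

Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.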

A64 Instruction

LD4R {Vt.2D - Vt4.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.2D → result.val[3]
Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

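poly64x2x4_t vld4q_dup_p64 (poly64_t const * ptr)

Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.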

A64 Instruction

LD4R {Vt.2D - Vt4.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.2D → result.val[3]
Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

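float64x1x4_t vld4_dup_f64 (float64_t const * ptr)

Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.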

A64 Instruction

LD4R {Vt.1D - Vt4.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.1D → result.val[3]
Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

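float64x2x4_t vld4q_dup_f64 (float64_t const * ptr)

Load single 4-element structure and replicate to all lanes of four registers

Description

Load single 4-element structure and replicate to all lanes of four registers. This instruction loads a 4-element structure from memory and replicates the structure to all the lanes of the four SIMD&FP registers.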

A64 Instruction

LD4R {Vt.2D - Vt4.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.2D → result.val[3]
Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst2_s8 (int8_t * ptr, int8x8x2_t val)

Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.8B - Vt2.8B},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
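The `ST2 {Vt.8B - Vt2.8B},[Xn]` form takes no lane index, so `vst2_s8` writes every lane of both registers, alternating between them. A minimal sketch of that interleaving, assuming `__ARM_NEON` is defined when `arm_neon.h` is usable. The helper name `interleave2_s8` and the scalar fallback are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: write a0,b0,a1,b1,...,a7,b7 to dst[0..15]. */
void interleave2_s8(const int8_t *a, const int8_t *b, int8_t *dst)
{
    assert(a != NULL && b != NULL && dst != NULL);
#if defined(__ARM_NEON)
    int8x8x2_t v;
    v.val[0] = vld1_s8(a); /* becomes the even-indexed bytes of dst */
    v.val[1] = vld1_s8(b); /* becomes the odd-indexed bytes of dst  */
    vst2_s8(dst, v);       /* stores 16 bytes, interleaved          */
#else
    /* Scalar model of the interleaved store. */
    for (int i = 0; i < 8; ++i) {
        dst[2 * i]     = a[i];
        dst[2 * i + 1] = b[i];
    }
#endif
}
```

The destination must have room for 16 bytes, twice the width of one 64-bit source register.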

void vst2q_s8 (int8_t * ptr, int8x16x2_t val)

Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.16B - Vt2.16B},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2_s16 (int16_t * ptr, int16x4x2_t val)

Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.4H - Vt2.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2q_s16 (int16_t * ptr, int16x8x2_t val)Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.8H - Vt2.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2_s32 (int32_t * ptr, int32x2x2_t val)Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.2S - Vt2.2S},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2q_s32 (int32_t * ptr, int32x4x2_t val)Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.4S - Vt2.4S},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2_u8 (uint8_t * ptr, uint8x8x2_t val)Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.8B - Vt2.8B},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2q_u8 (uint8_t * ptr, uint8x16x2_t val)Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.16B - Vt2.16B},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2_u16 (uint16_t * ptr, uint16x4x2_t val)Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.4H - Vt2.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2q_u16 (uint16_t * ptr, uint16x8x2_t val)Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.8H - Vt2.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2_u32 (uint32_t * ptr, uint32x2x2_t val)Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.2S - Vt2.2S},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2q_u32 (uint32_t * ptr, uint32x4x2_t val)Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.4S - Vt2.4S},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2_f16 (float16_t * ptr, float16x4x2_t val)Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.4H - Vt2.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2q_f16 (float16_t * ptr, float16x8x2_t val)Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.8H - Vt2.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2_f32 (float32_t * ptr, float32x2x2_t val)Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.2S - Vt2.2S},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2q_f32 (float32_t * ptr, float32x4x2_t val)Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.4S - Vt2.4S},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
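
One common use of a two-register interleaving store is packing separate real and imaginary arrays into an array of (re, im) pairs. The sketch below is only an illustration under that assumption. The helper name interleave_complex is hypothetical, and the NEON path is compiled only where __ARM_NEON is defined, with a scalar loop covering the tail and non-NEON targets.

```c
#include <stddef.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: packs n (re, im) pairs into out[0 .. 2n-1]
 * as re0, im0, re1, im1, ...  out must hold 2 * n floats. */
static inline void interleave_complex(float *out, const float *re,
                                      const float *im, size_t n)
{
    size_t i = 0;
#if defined(__ARM_NEON)
    /* Four pairs per iteration: one vst2q_f32 writes 8 floats. */
    for (; i + 4 <= n; i += 4) {
        float32x4x2_t v;
        v.val[0] = vld1q_f32(re + i);
        v.val[1] = vld1q_f32(im + i);
        vst2q_f32(out + 2 * i, v);
    }
#endif
    /* Scalar tail, and the whole loop on non-NEON targets. */
    for (; i < n; ++i) {
        out[2 * i]     = re[i];
        out[2 * i + 1] = im[i];
    }
}
```

Both paths are meant to produce the same bytes in out, so the scalar loop also serves as a reference when checking the vector path.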

void vst2_p8 (poly8_t * ptr, poly8x8x2_t val)Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.8B - Vt2.8B},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2q_p8 (poly8_t * ptr, poly8x16x2_t val)Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.16B - Vt2.16B},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2_p16 (poly16_t * ptr, poly16x4x2_t val)Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.4H - Vt2.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2q_p16 (poly16_t * ptr, poly16x8x2_t val)Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.8H - Vt2.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
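
The `ST2 {Vt.8H - Vt2.8H},[Xn]` form used by `vst2q_p16` has no lane index, so it writes every element of both registers, alternating between them. The memory layout can be sketched with a portable scalar model. The names `model_u16x8x2`, `model_st2q_u16` and `model_st2q_u16_demo` are hypothetical, and `poly16_t` is modelled as `uint16_t`; this illustrates the ordering only and is not the intrinsic itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for poly16x8x2_t: two registers of 8 lanes. */
typedef struct {
    uint16_t val[2][8];
} model_u16x8x2;

/* Scalar model of the store order: for each lane e, write val[0][e]
   and then val[1][e], so the two registers end up interleaved. */
static void model_st2q_u16(uint16_t *ptr, const model_u16x8x2 *v)
{
    for (size_t e = 0; e < 8; ++e) {
        for (size_t s = 0; s < 2; ++s) {
            ptr[2 * e + s] = v->val[s][e];
        }
    }
}

/* Fills both registers with recognisable values, stores them, and
   returns the number of elements written. */
static size_t model_st2q_u16_demo(uint16_t out[16])
{
    model_u16x8x2 v;
    for (size_t e = 0; e < 8; ++e) {
        v.val[0][e] = (uint16_t)(100 + e);  /* first register  */
        v.val[1][e] = (uint16_t)(200 + e);  /* second register */
    }
    model_st2q_u16(out, &v);
    assert(out[0] == 100 && out[1] == 200);
    return 16;
}
```

After the demo runs, even positions of `out` hold the lanes of `val[0]` and odd positions hold the lanes of `val[1]`.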

void vst2_s64 (int64_t * ptr, int64x1x2_t val)

Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D - Vt2.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
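
`vst2_s64` maps to a two-register `ST1` rather than `ST2`. With a single 64-bit lane per register, interleaved order and consecutive order produce the same bytes, which the following scalar sketch checks. All function names here are hypothetical, and the two-lane case is included only for contrast.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ST2-style order: lane e of register 0, then lane e of register 1. */
static void model_interleave2_s64(int64_t *ptr, const int64_t *r0,
                                  const int64_t *r1, size_t lanes)
{
    for (size_t e = 0; e < lanes; ++e) {
        ptr[2 * e]     = r0[e];
        ptr[2 * e + 1] = r1[e];
    }
}

/* ST1-style order: all of register 0, then all of register 1. */
static void model_consecutive2_s64(int64_t *ptr, const int64_t *r0,
                                   const int64_t *r1, size_t lanes)
{
    memcpy(ptr, r0, lanes * sizeof *r0);
    memcpy(ptr + lanes, r1, lanes * sizeof *r1);
}

/* Returns 1 when both orders give identical memory for `lanes` lanes. */
static int model_orders_match(size_t lanes)
{
    const int64_t r0[2] = {11, 12};
    const int64_t r1[2] = {21, 22};
    int64_t a[4] = {0};
    int64_t b[4] = {0};

    assert(lanes >= 1 && lanes <= 2);
    model_interleave2_s64(a, r0, r1, lanes);
    model_consecutive2_s64(b, r0, r1, lanes);
    return memcmp(a, b, sizeof a) == 0;
}
```

`model_orders_match(1)` reports a match, which corresponds to the `.1D` arrangement; with two lanes per register the orders differ, which is why the `q` variants use `ST2`.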

void vst2_u64 (uint64_t * ptr, uint64x1x2_t val)

Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D - Vt2.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2_p64 (poly64_t * ptr, poly64x1x2_t val)

Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D - Vt2.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

void vst2q_s64 (int64_t * ptr, int64x2x2_t val)

Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.2D - Vt2.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst2q_u64 (uint64_t * ptr, uint64x2x2_t val)

Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.2D - Vt2.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst2q_p64 (poly64_t * ptr, poly64x2x2_t val)

Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.2D - Vt2.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst2_f64 (float64_t * ptr, float64x1x2_t val)

Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D - Vt2.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst2q_f64 (float64_t * ptr, float64x2x2_t val)

Store multiple 2-element structures from two registers

Description

Store multiple 2-element structures from two registers. This instruction stores multiple 2-element structures from two SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST2 {Vt.2D - Vt2.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst3_s8 (int8_t * ptr, int8x8x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures to memory from three SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.8B - Vt3.8B},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.8B 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
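
For `vst3_s8`, the `ST3 {Vt.8B - Vt3.8B},[Xn]` form writes 24 bytes: lane 0 of each of the three registers, then lane 1 of each, and so on. A portable scalar sketch of that ordering follows. The type `model_s8x8x3` and both function names are hypothetical stand-ins for `int8x8x3_t` and the intrinsic, used here only to illustrate the layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for int8x8x3_t: three registers of 8 lanes. */
typedef struct {
    int8_t val[3][8];
} model_s8x8x3;

/* Scalar model of the 3-way interleaved store: the outer loop walks the
   lanes, the inner loop walks the three registers. */
static void model_st3_s8(int8_t *ptr, const model_s8x8x3 *v)
{
    size_t offs = 0;
    for (size_t e = 0; e < 8; ++e) {
        for (size_t s = 0; s < 3; ++s) {
            ptr[offs++] = v->val[s][e];
        }
    }
}

/* Fills register s, lane e with 10*s + e, stores, and returns the
   number of bytes written. */
static size_t model_st3_s8_demo(int8_t out[24])
{
    model_s8x8x3 v;
    for (size_t s = 0; s < 3; ++s) {
        for (size_t e = 0; e < 8; ++e) {
            v.val[s][e] = (int8_t)(10 * s + e);
        }
    }
    model_st3_s8(out, &v);
    assert(out[0] == 0 && out[1] == 10 && out[2] == 20);
    return 24;
}
```

This is the layout that makes the intrinsic convenient for writing three separate planes (for example red, green and blue) back as packed triples.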

void vst3q_s8 (int8_t * ptr, int8x16x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures to memory from three SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.16B - Vt3.16B},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.16B 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3_s16 (int16_t * ptr, int16x4x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures to memory from three SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.4H - Vt3.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3q_s16 (int16_t * ptr, int16x8x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures to memory from three SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.8H - Vt3.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3_s32 (int32_t * ptr, int32x2x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures to memory from three SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.2S - Vt3.2S},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.2S 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3q_s32 (int32_t * ptr, int32x4x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures to memory from three SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.4S - Vt3.4S},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.4S 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3_u8 (uint8_t * ptr, uint8x8x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures to memory from three SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.8B - Vt3.8B},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.8B 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3q_u8 (uint8_t * ptr, uint8x16x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures to memory from three SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.16B - Vt3.16B},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.16B 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3_u16 (uint16_t * ptr, uint16x4x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures to memory from three SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.4H - Vt3.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3q_u16 (uint16_t * ptr, uint16x8x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures to memory from three SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.8H - Vt3.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
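
As a rough illustration of the resulting memory layout: the A64 form listed for vst3q_u16, ST3 {Vt.8H - Vt3.8H},[Xn], names three whole registers, so all eight lanes of each register are written, interleaved. The following is a portable scalar sketch in C, not the intrinsic itself. The names model_u16x8x3 and model_vst3q_u16 are hypothetical stand-ins for uint16x8x3_t and vst3q_u16, and the code assumes ptr has room for 24 elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for uint16x8x3_t: three registers of eight lanes. */
typedef struct {
    uint16_t val[3][8];
} model_u16x8x3;

/* Scalar model of vst3q_u16: lane e of register s goes to ptr[3*e + s]. */
static void model_vst3q_u16(uint16_t *ptr, const model_u16x8x3 *v)
{
    assert(ptr != NULL && v != NULL);
    for (size_t e = 0; e < 8; ++e)          /* each lane ...            */
        for (size_t s = 0; s < 3; ++s)      /* ... of each register     */
            ptr[3 * e + s] = v->val[s][e];
}
```

If val[0], val[1] and val[2] hold, say, the R, G and B planes of eight pixels, the output is in R, G, B, R, G, B order.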

void vst3_u32 (uint32_t * ptr, uint32x2x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures from three SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.2S - Vt3.2S},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.2S 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
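
The MemOp_STORE branch of the Operation pseudocode above can be read as a short loop: for each of selem consecutive registers, one lane (index) is copied to memory, the byte offset advances by ebytes, and the register number wraps modulo 32. A minimal scalar sketch in C follows, assuming 32-bit elements. The model_regfile type and model_store_lane function are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register file: 32 registers, each viewed as four 32-bit lanes. */
typedef struct {
    uint32_t lane[32][4];
} model_regfile;

/*
 * Scalar model of the MemOp_STORE branch: for each of selem consecutive
 * registers, starting at register t, copy lane `index` to memory.
 * Returns the final value of offs in bytes.
 */
static size_t model_store_lane(uint32_t *mem, const model_regfile *v,
                               unsigned t, unsigned index, unsigned selem)
{
    assert(mem != NULL && v != NULL && t < 32 && index < 4);
    size_t offs = 0;                      /* byte offset, as in the pseudocode */
    for (unsigned s = 0; s < selem; ++s) {
        mem[offs / sizeof(uint32_t)] = v->lane[t][index];
        offs += sizeof(uint32_t);         /* offs = offs + ebytes */
        t = (t + 1) % 32;                 /* t = (t + 1) MOD 32   */
    }
    return offs;
}
```

Starting at t = 31 with selem = 3 reads registers 31, 0 and 1 in that order, which is the wrap-around the MOD 32 step provides.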

void vst3q_u32 (uint32_t * ptr, uint32x4x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures from three SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.4S - Vt3.4S},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.4S 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
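
The closing "if wback then" block of the Operation pseudocode computes the post-indexed base address. A small C sketch of that arithmetic is given below. It assumes a hypothetical array x[] standing in for registers X0 to X30, and model_writeback_base is an illustrative name, not part of any Arm API. The form listed for these intrinsics, [Xn] with no post-index operand, would leave wback false, so this block matters only for the post-indexed encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Model of the "if wback then" block. x[] stands in for X0..X30
 * (hypothetical layout). m == 31 selects the immediate post-index form,
 * where the base advances by the number of bytes transferred (offs).
 */
static uint64_t model_writeback_base(uint64_t address, uint64_t offs,
                                     unsigned m, const uint64_t x[31])
{
    assert(m <= 31);
    assert(m == 31 || x != NULL);
    if (m != 31)
        offs = x[m];          /* register post-index: offs = X[m] */
    return address + offs;    /* written to SP[] or X[n] in the pseudocode */
}
```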

void vst3_f16 (float16_t * ptr, float16x4x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures from three SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.4H - Vt3.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3q_f16 (float16_t * ptr, float16x8x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures from three SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.8H - Vt3.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3_f32 (float32_t * ptr, float32x2x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures from three SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.2S - Vt3.2S},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.2S 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3q_f32 (float32_t * ptr, float32x4x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures from three SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.4S - Vt3.4S},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.4S 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3_p8 (poly8_t * ptr, poly8x8x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures from three SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.8B - Vt3.8B},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.8B 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3q_p8 (poly8_t * ptr, poly8x16x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures from three SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.16B - Vt3.16B},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.16B 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3_p16 (poly16_t * ptr, poly16x4x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures from three SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.4H - Vt3.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3q_p16 (poly16_t * ptr, poly16x8x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures from three SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.8H - Vt3.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3_s64 (int64_t * ptr, int64x1x3_t val)

Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D - Vt3.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
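
The listing above gives ST1 {Vt.1D - Vt3.1D},[Xn] rather than an ST3 form for vst3_s64. One way to see why that is sufficient: with a single 64-bit lane per register, the interleaved order and plain register order coincide. The scalar C sketch below illustrates this. model_s64x1x3 and model_vst3_s64 are hypothetical names standing in for int64x1x3_t and vst3_s64, and ptr is assumed to have room for three elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for int64x1x3_t: three one-lane registers. */
typedef struct {
    int64_t val[3][1];
} model_s64x1x3;

/* Scalar model of vst3_s64: register s, lane 0, goes to ptr[s]. */
static void model_vst3_s64(int64_t *ptr, const model_s64x1x3 *v)
{
    assert(ptr != NULL && v != NULL);
    const size_t lanes = 1;
    for (size_t e = 0; e < lanes; ++e)       /* interleaved order ...        */
        for (size_t s = 0; s < 3; ++s)
            ptr[3 * e + s] = v->val[s][e];   /* ... is register order here   */
}
```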

void vst3_u64 (uint64_t * ptr, uint64x1x3_t val)

Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D - Vt3.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3_p64 (poly64_t * ptr, poly64x1x3_t val)

Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D - Vt3.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

void vst3q_s64 (int64_t * ptr, int64x2x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures from three SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.2D - Vt3.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst3q_u64 (uint64_t * ptr, uint64x2x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures from three SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.2D - Vt3.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst3q_p64 (poly64_t * ptr, poly64x2x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures from three SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.2D - Vt3.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst3_f64 (float64_t * ptr, float64x1x3_t val)

Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D - Vt3.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst3q_f64 (float64_t * ptr, float64x2x3_t val)

Store multiple 3-element structures from three registers

Description

Store multiple 3-element structures from three registers. This instruction stores multiple 3-element structures from three SIMD&FP registers to memory, with interleaving. Every element of each register is stored.

A64 Instruction

ST3 {Vt.2D - Vt3.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
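
As a rough illustration of the resulting memory layout, the C sketch below stores three 2-lane float64 vectors with 3-way interleaving: element i of val.val[j] lands at ptr[3*i + j]. The helper name store3_f64 is hypothetical. On AArch64 builds it calls vst3q_f64; elsewhere it uses a scalar loop that is assumed to produce the same layout.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: write a[i], b[i], c[i] to out[3*i .. 3*i+2] for i = 0, 1. */
static void store3_f64(double *out, const double *a, const double *b, const double *c)
{
#if defined(__aarch64__)
    float64x2x3_t v;
    v.val[0] = vld1q_f64(a);
    v.val[1] = vld1q_f64(b);
    v.val[2] = vld1q_f64(c);
    vst3q_f64(out, v);              /* ST3 {Vt.2D - Vt3.2D},[Xn] */
#else
    /* Scalar fallback, assumed to match the interleaved layout. */
    for (size_t i = 0; i < 2; ++i) {
        out[3 * i + 0] = a[i];
        out[3 * i + 1] = b[i];
        out[3 * i + 2] = c[i];
    }
#endif
}
```

Calling store3_f64 with a = {1, 2}, b = {10, 20}, c = {100, 200} should leave out holding 1, 10, 100, 2, 20, 200.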

void vst4_s8 (int8_t * ptr, int8x8x4_t val)

Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.8B - Vt4.8B},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.8B 

val.val[2] → Vt3.8B 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
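
The wback branch at the end of the Operation pseudocode can be modelled in a few lines of C. This is only a sketch: the function name writeback_base and its parameters are hypothetical stand-ins for the pseudocode variables address, offs, m and X[m]. The intrinsic forms listed here use plain [Xn] addressing, so wback is presumably false for them, and the model matters mainly when reading post-indexed assembly.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the `wback` branch: returns the new value of the base register.
 * address : base address read from X[n] or SP
 * offs    : bytes transferred by the loop (the immediate post-index amount)
 * m       : index of the offset register; 31 selects the immediate form
 * xm      : value of X[m], ignored when m == 31
 */
static uint64_t writeback_base(uint64_t address, uint64_t offs, unsigned m, uint64_t xm)
{
    if (m != 31) {
        offs = xm;          /* register post-index replaces the byte count */
    }
    return address + offs;  /* wraps modulo 2^64, like bits(64) addition */
}
```

With address 0x1000 and 32 bytes transferred, the immediate form (m == 31) gives 0x1020, while a register form with X[m] = 64 gives 0x1040.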

void vst4q_s8 (int8_t * ptr, int8x16x4_t val)

Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.16B - Vt4.16B},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.16B 

val.val[2] → Vt3.16B 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
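
The register list Vt, Vt2, Vt3, Vt4 in Argument Preparation corresponds to the pseudocode step t = (t + 1) MOD 32, so the registers are consecutive and wrap from V31 back to V0. The short C sketch below enumerates them; the function name structure_registers is hypothetical and exists only for illustration.

```c
#include <assert.h>

/* Fill regs[0..selem-1] with the register numbers used for Vt, Vt2, ... */
static void structure_registers(unsigned t, unsigned selem, unsigned *regs)
{
    for (unsigned s = 0; s < selem; ++s) {
        regs[s] = t;
        t = (t + 1) % 32;   /* mirrors: t = (t + 1) MOD 32 */
    }
}
```

Starting at t = 30 with selem = 4, the sketch yields registers 30, 31, 0 and 1.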

void vst4_s16 (int16_t * ptr, int16x4x4_t val)

Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.4H - Vt4.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.4H 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4q_s16 (int16_t * ptr, int16x8x4_t val)

Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.8H - Vt4.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.8H 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4_s32 (int32_t * ptr, int32x2x4_t val)

Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.2S - Vt4.2S},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.2S 

val.val[2] → Vt3.2S 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4q_s32 (int32_t * ptr, int32x4x4_t val)

Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.4S - Vt4.4S},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.4S 

val.val[2] → Vt3.4S 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4_u8 (uint8_t * ptr, uint8x8x4_t val)

Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.8B - Vt4.8B},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.8B 

val.val[2] → Vt3.8B 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
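
A common use of vst4_u8 is packing separate colour planes into interleaved pixels. The sketch below is a minimal illustration under stated assumptions: the helper name pack_rgba8 and the planar R, G, B, A inputs are hypothetical, and the scalar fallback for non-NEON builds is assumed to match the layout dst[4*i + j] = val.val[j][i].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: interleave 8 pixels from four planes into RGBA order. */
static void pack_rgba8(uint8_t *dst, const uint8_t *r, const uint8_t *g,
                       const uint8_t *b, const uint8_t *a)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint8x8x4_t px;
    px.val[0] = vld1_u8(r);
    px.val[1] = vld1_u8(g);
    px.val[2] = vld1_u8(b);
    px.val[3] = vld1_u8(a);
    vst4_u8(dst, px);               /* writes 32 bytes: R0 G0 B0 A0 R1 G1 ... */
#else
    /* Scalar fallback, assumed to match the interleaved layout. */
    for (size_t i = 0; i < 8; ++i) {
        dst[4 * i + 0] = r[i];
        dst[4 * i + 1] = g[i];
        dst[4 * i + 2] = b[i];
        dst[4 * i + 3] = a[i];
    }
#endif
}
```

The destination buffer must have room for 32 bytes, since all eight lanes of each of the four registers are written.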

void vst4q_u8 (uint8_t * ptr, uint8x16x4_t val)

Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.16B - Vt4.16B},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.16B 

val.val[2] → Vt3.16B 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4_u16 (uint16_t * ptr, uint16x4x4_t val)

Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.4H - Vt4.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.4H 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4q_u16 (uint16_t * ptr, uint16x8x4_t val)

Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.8H - Vt4.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.8H 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4_u32 (uint32_t * ptr, uint32x2x4_t val)

Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.2S - Vt4.2S},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.2S 

val.val[2] → Vt3.2S 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4q_u32 (uint32_t * ptr, uint32x4x4_t val)

Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.4S - Vt4.4S},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.4S 

val.val[2] → Vt3.4S 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4_f16 (float16_t * ptr, float16x4x4_t val)

Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.4H - Vt4.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.4H 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4q_f16 (float16_t * ptr, float16x8x4_t val)

Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.8H - Vt4.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.8H 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4_f32 (float32_t * ptr, float32x2x4_t val)

Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.2S - Vt4.2S},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.2S 

val.val[2] → Vt3.2S 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4q_f32 (float32_t * ptr, float32x4x4_t val)

Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.4S - Vt4.4S},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.4S 

val.val[2] → Vt3.4S 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
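
Because vst4q_f32 writes element i of val.val[j] to ptr[4*i + j], loading the four rows of a row-major 4x4 matrix into val.val[0..3] and storing them with this intrinsic yields the transposed matrix. The C sketch below illustrates the idea; the helper name transpose4x4_store is hypothetical, and the scalar fallback for non-NEON builds is assumed to be equivalent.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: dst = transpose(src) for row-major 4x4 float matrices. */
static void transpose4x4_store(float *dst, const float *src)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    float32x4x4_t rows;
    rows.val[0] = vld1q_f32(src + 0);
    rows.val[1] = vld1q_f32(src + 4);
    rows.val[2] = vld1q_f32(src + 8);
    rows.val[3] = vld1q_f32(src + 12);
    vst4q_f32(dst, rows);           /* dst[4*i + j] = rows.val[j][i] */
#else
    /* Scalar fallback, assumed to match the interleaved layout. */
    for (size_t i = 0; i < 4; ++i) {
        for (size_t j = 0; j < 4; ++j) {
            dst[4 * i + j] = src[4 * j + i];
        }
    }
#endif
}
```

The source and destination buffers should not overlap, since the sketch reads all of src before or while writing dst depending on the build.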

void vst4_p8 (poly8_t * ptr, poly8x8x4_t val)

Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.8B - Vt4.8B},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.8B 

val.val[2] → Vt3.8B 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4q_p8 (poly8_t * ptr, poly8x16x4_t val)

Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.16B - Vt4.16B},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.16B 

val.val[2] → Vt3.16B 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
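
The memory layout produced by this store can be illustrated with a scalar C model. This is only a sketch, not the Arm pseudocode: `model_p8x16x4` and `model_vst4q_p8` are hypothetical names standing in for `poly8x16x4_t` and `vst4q_p8`, and the model assumes the interleaving behaviour of the `ST4 {Vt.16B - Vt4.16B},[Xn]` form shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for poly8x16x4_t: four registers of 16 byte lanes. */
typedef struct {
    uint8_t val[4][16];
} model_p8x16x4;

/* Scalar model of an interleaving four-register store: lane i of register r
   lands at ptr[4 * i + r], so memory holds 16 consecutive 4-byte structures. */
static void model_vst4q_p8(uint8_t *ptr, const model_p8x16x4 *v)
{
    assert(ptr != NULL && v != NULL);
    for (size_t lane = 0; lane < 16; ++lane) {
        for (size_t reg = 0; reg < 4; ++reg) {
            ptr[4 * lane + reg] = v->val[reg][lane];
        }
    }
}
```

The destination buffer must provide at least 64 bytes. The same indexing applies to the other element sizes, with the lane count and element width changed accordingly.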

void vst4_p16 (poly16_t * ptr, poly16x4x4_t val)
Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.4H - Vt4.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.4H 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4q_p16 (poly16_t * ptr, poly16x8x4_t val)
Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.8H - Vt4.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.8H 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4_s64 (int64_t * ptr, int64x1x4_t val)
Store multiple single-element structures from four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D - Vt4.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.1D 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
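
Because each `int64x1_t` register holds a single lane, a four-register store of this type has nothing to interleave, which is consistent with the `ST1 {Vt.1D - Vt4.1D},[Xn]` form listed above. The following scalar C sketch shows the expected layout; `model_s64x1x4` and `model_vst4_s64` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for int64x1x4_t: four registers of one 64-bit lane. */
typedef struct {
    int64_t val[4][1];
} model_s64x1x4;

/* Scalar model: the single lane of register r is written to ptr[r],
   so the four values end up consecutive in memory. */
static void model_vst4_s64(int64_t *ptr, const model_s64x1x4 *v)
{
    assert(ptr != NULL && v != NULL);
    for (int reg = 0; reg < 4; ++reg) {
        ptr[reg] = v->val[reg][0];
    }
}
```

The destination must have room for four 64-bit values (32 bytes).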

void vst4_u64 (uint64_t * ptr, uint64x1x4_t val)
Store multiple single-element structures from four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D - Vt4.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.1D 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4_p64 (poly64_t * ptr, poly64x1x4_t val)
Store multiple single-element structures from four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D - Vt4.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.1D 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

void vst4q_s64 (int64_t * ptr, int64x2x4_t val)
Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.2D - Vt4.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.2D 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst4q_u64 (uint64_t * ptr, uint64x2x4_t val)
Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.2D - Vt4.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.2D 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst4q_p64 (poly64_t * ptr, poly64x2x4_t val)
Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.2D - Vt4.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.2D 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst4_f64 (float64_t * ptr, float64x1x4_t val)
Store multiple single-element structures from four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D - Vt4.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.1D 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst4q_f64 (float64_t * ptr, float64x2x4_t val)
Store multiple 4-element structures from four registers

Description

Store multiple 4-element structures from four registers. This instruction stores multiple 4-element structures to memory from four SIMD&FP registers, with interleaving. Every element of each register is stored.

A64 Instruction

ST4 {Vt.2D - Vt4.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.2D 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

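int16x4x2_t vld2_lane_s16 (int16_t const * ptr, int16x4x2_t src, const int lane)
Load single 2-element structure to one lane of two registers

Description

Load single 2-element structure to one lane of two registers. This instruction loads a 2-element structure from memory and writes the result to the corresponding elements of the two SIMD&FP registers without affecting the other bits of the registers.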

A64 Instruction

LD2 {Vt.h - Vt2.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.4H 

src.val[0] → Vt.4H 

0 <= lane <= 3

Results

Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
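
The effect of a lane load on the two source vectors can be sketched as a scalar C model. The names `model_s16x4x2` and `model_vld2_lane_s16` are hypothetical stand-ins for `int16x4x2_t` and `vld2_lane_s16`; the sketch assumes that only the selected lane of each register is overwritten and that all other lanes are carried over from `src`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for int16x4x2_t: two registers of four 16-bit lanes. */
typedef struct {
    int16_t val[2][4];
} model_s16x4x2;

/* Scalar model of a 2-element lane load: ptr[0] goes to the chosen lane of
   register 0, ptr[1] to the same lane of register 1. src is passed by value,
   so the caller's copy is untouched and the updated copy is returned. */
static model_s16x4x2 model_vld2_lane_s16(const int16_t *ptr,
                                         model_s16x4x2 src, int lane)
{
    assert(ptr != NULL);
    assert(lane >= 0 && lane <= 3); /* lane range for a 4-lane vector */
    src.val[0][lane] = ptr[0];
    src.val[1][lane] = ptr[1];
    return src;
}
```

With the real intrinsic, `lane` must be a compile-time constant in the documented range; the run-time `assert` here is only a substitute for that check in the model.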

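int16x8x2_t vld2q_lane_s16 (int16_t const * ptr, int16x8x2_t src, const int lane)
Load single 2-element structure to one lane of two registers

Description

Load single 2-element structure to one lane of two registers. This instruction loads a 2-element structure from memory and writes the result to the corresponding elements of the two SIMD&FP registers without affecting the other bits of the registers.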

A64 Instruction

LD2 {Vt.h - Vt2.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.8H 

src.val[0] → Vt.8H 

0 <= lane <= 7

Results

Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

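int32x2x2_t vld2_lane_s32 (int32_t const * ptr, int32x2x2_t src, const int lane)
Load single 2-element structure to one lane of two registers

Description

Load single 2-element structure to one lane of two registers. This instruction loads a 2-element structure from memory and writes the result to the corresponding elements of the two SIMD&FP registers without affecting the other bits of the registers.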

A64 Instruction

LD2 {Vt.s - Vt2.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.2S 

src.val[0] → Vt.2S 

0 <= lane <= 1

Results

Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

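int32x4x2_t vld2q_lane_s32 (int32_t const * ptr, int32x4x2_t src, const int lane)
Load single 2-element structure to one lane of two registers

Description

Load single 2-element structure to one lane of two registers. This instruction loads a 2-element structure from memory and writes the result to the corresponding elements of the two SIMD&FP registers without affecting the other bits of the registers.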

A64 Instruction

LD2 {Vt.s - Vt2.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.4S 

src.val[0] → Vt.4S 

0 <= lane <= 3

Results

Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
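
The load branch of the Operation pseudocode above (the "load/store one element per register" loop with `memop == MemOp_LOAD`) can be read as the following C sketch for this entry's shape: two registers, 32-bit elements, four lanes. The constants and the names `model_s32x4x2` and `model_ld2_lane_s32` are assumptions made for illustration. Writeback, SP alignment checks, and tag checking are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed decode results for this entry: selem = 2, esize = 32, 4 lanes. */
enum { MODEL_SELEM = 2, MODEL_EBYTES = 4, MODEL_LANES = 4 };

/* Hypothetical stand-in for int32x4x2_t. */
typedef struct {
    int32_t val[MODEL_SELEM][MODEL_LANES];
} model_s32x4x2;

/* Mirrors the pseudocode loop: for each of the selem registers, read one
   element at address + offs, insert it at lane `index`, then advance offs
   by ebytes and move on to the next register. */
static model_s32x4x2 model_ld2_lane_s32(const unsigned char *address,
                                        model_s32x4x2 regs, int index)
{
    assert(address != NULL);
    assert(index >= 0 && index < MODEL_LANES);
    size_t offs = 0;
    for (int s = 0; s < MODEL_SELEM; ++s) {
        memcpy(&regs.val[s][index], address + offs, MODEL_EBYTES);
        offs += MODEL_EBYTES;
    }
    return regs;
}
```

Using `memcpy` keeps the model free of alignment assumptions about `address`; the element bytes are interpreted in the host's byte order.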

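uint16x4x2_t vld2_lane_u16 (uint16_t const * ptr, uint16x4x2_t src, const int lane)
Load single 2-element structure to one lane of two registers

Description

Load single 2-element structure to one lane of two registers. This instruction loads a 2-element structure from memory and writes the result to the corresponding elements of the two SIMD&FP registers without affecting the other bits of the registers.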

A64 Instruction

LD2 {Vt.h - Vt2.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.4H 

src.val[0] → Vt.4H 

0 <= lane <= 3

Results

Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

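uint16x8x2_t vld2q_lane_u16 (uint16_t const * ptr, uint16x8x2_t src, const int lane)
Load single 2-element structure to one lane of two registers

Description

Load single 2-element structure to one lane of two registers. This instruction loads a 2-element structure from memory and writes the result to the corresponding elements of the two SIMD&FP registers without affecting the other bits of the registers.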

A64 Instruction

LD2 {Vt.h - Vt2.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.8H 

src.val[0] → Vt.8H 

0 <= lane <= 7

Results

Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

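uint32x2x2_t vld2_lane_u32 (uint32_t const * ptr, uint32x2x2_t src, const int lane)
Load single 2-element structure to one lane of two registers

Description

Load single 2-element structure to one lane of two registers. This instruction loads a 2-element structure from memory and writes the result to the corresponding elements of the two SIMD&FP registers without affecting the other bits of the registers.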

A64 Instruction

LD2 {Vt.s - Vt2.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.2S 

src.val[0] → Vt.2S 

0 <= lane <= 1

Results

Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

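uint32x4x2_t vld2q_lane_u32 (uint32_t const * ptr, uint32x4x2_t src, const int lane)
Load single 2-element structure to one lane of two registers

Description

Load single 2-element structure to one lane of two registers. This instruction loads a 2-element structure from memory and writes the result to the corresponding elements of the two SIMD&FP registers without affecting the other bits of the registers.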

A64 Instruction

LD2 {Vt.s - Vt2.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.4S 

src.val[0] → Vt.4S 

0 <= lane <= 3

Results

Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float16x4x2_t vld2_lane_f16 (float16_t const * ptr, float16x4x2_t src, const int lane)

Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.h - Vt2.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.4H 

src.val[0] → Vt.4H 

0 <= lane <= 3

Results

Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
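
The load branch of the Operation pseudocode above can be sketched in portable C. This is only an illustrative model, not the architectural definition: `vreg_file` and `load_single_structure` are hypothetical names, the register file is a plain byte array, bytes are copied in memory order (a little-endian lane layout is assumed), and the MTE check, SP alignment check, and writeback steps are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { NUM_VREGS = 32, VREG_BYTES = 16 };

/* Hypothetical register-file model: 32 SIMD registers of 128 bits each. */
typedef struct {
    uint8_t v[NUM_VREGS][VREG_BYTES];
} vreg_file;

/* Models "load one element per register": for each of selem registers,
   starting at register t, copy ebytes bytes from memory into lane `index`.
   All other lanes of each register keep their previous contents. */
void load_single_structure(vreg_file *rf, unsigned t, unsigned selem,
                           unsigned index, unsigned ebytes, const uint8_t *mem)
{
    assert(t < NUM_VREGS);
    assert(ebytes > 0 && (index + 1) * ebytes <= VREG_BYTES);

    size_t offs = 0;
    for (unsigned s = 0; s < selem; ++s) {
        memcpy(&rf->v[t][index * ebytes], mem + offs, ebytes);
        offs += ebytes;            /* next structure member in memory */
        t = (t + 1) % NUM_VREGS;   /* register number wraps from 31 to 0 */
    }
}
```

For LD2 the model would be called with `selem` equal to 2, so two consecutive elements in memory land in the same lane of two consecutive registers.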

float16x8x2_t vld2q_lane_f16 (float16_t const * ptr, float16x8x2_t src, const int lane)

Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.h - Vt2.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.8H 

src.val[0] → Vt.8H 

0 <= lane <= 7

Results

Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float32x2x2_t vld2_lane_f32 (float32_t const * ptr, float32x2x2_t src, const int lane)

Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.s - Vt2.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.2S 

src.val[0] → Vt.2S 

0 <= lane <= 1

Results

Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
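
As a usage sketch for vld2_lane_f32, the function below refreshes one point in a pair of de-interleaved coordinate vectors. The helper name `replace_point1` and the split x/y layout are assumptions made for illustration. The intrinsic path is compiled only when `__ARM_NEON` is defined, and a scalar fallback with the same effect is used otherwise.

```c
#include <assert.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: xs[2] and ys[2] hold two points in split (x, y) form.
   Replace point 1 with the interleaved pair stored at xy[0], xy[1]. */
void replace_point1(const float *xy, float xs[2], float ys[2])
{
    assert(xy && xs && ys);
#if defined(__ARM_NEON)
    float32x2x2_t v;
    v.val[0] = vld1_f32(xs);       /* src.val[0]: both x values */
    v.val[1] = vld1_f32(ys);       /* src.val[1]: both y values */
    v = vld2_lane_f32(xy, v, 1);   /* lane is a constant in the range 0..1 */
    vst1_f32(xs, v.val[0]);
    vst1_f32(ys, v.val[1]);
#else
    /* Portable fallback with the same effect, for builds without NEON. */
    xs[1] = xy[0];
    ys[1] = xy[1];
#endif
}
```

Lane 0 of both vectors is untouched, which is why the intrinsic takes src as an input rather than producing a fresh value.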

float32x4x2_t vld2q_lane_f32 (float32_t const * ptr, float32x4x2_t src, const int lane)

Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.s - Vt2.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.4S 

src.val[0] → Vt.4S 

0 <= lane <= 3

Results

Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly16x4x2_t vld2_lane_p16 (poly16_t const * ptr, poly16x4x2_t src, const int lane)

Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.h - Vt2.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.4H 

src.val[0] → Vt.4H 

0 <= lane <= 3

Results

Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly16x8x2_t vld2q_lane_p16 (poly16_t const * ptr, poly16x8x2_t src, const int lane)

Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.h - Vt2.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.8H 

src.val[0] → Vt.8H 

0 <= lane <= 7

Results

Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int8x8x2_t vld2_lane_s8 (int8_t const * ptr, int8x8x2_t src, const int lane)

Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.b - Vt2.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.8B 

src.val[0] → Vt.8B 

0 <= lane <= 7

Results

Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint8x8x2_t vld2_lane_u8 (uint8_t const * ptr, uint8x8x2_t src, const int lane)

Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.b - Vt2.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.8B 

src.val[0] → Vt.8B 

0 <= lane <= 7

Results

Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly8x8x2_t vld2_lane_p8 (poly8_t const * ptr, poly8x8x2_t src, const int lane)

Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.b - Vt2.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.8B 

src.val[0] → Vt.8B 

0 <= lane <= 7

Results

Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int8x16x2_t vld2q_lane_s8 (int8_t const * ptr, int8x16x2_t src, const int lane)

Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.b - Vt2.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.16B 

src.val[0] → Vt.16B 

0 <= lane <= 15

Results

Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

uint8x16x2_t vld2q_lane_u8 (uint8_t const * ptr, uint8x16x2_t src, const int lane)

Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.b - Vt2.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.16B 

src.val[0] → Vt.16B 

0 <= lane <= 15

Results

Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

poly8x16x2_t vld2q_lane_p8 (poly8_t const * ptr, poly8x16x2_t src, const int lane)

Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.b - Vt2.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.16B 

src.val[0] → Vt.16B 

0 <= lane <= 15

Results

Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
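
The upper lane bound given under Argument Preparation in each of these entries is consistent with the vector shape: it is the number of lanes minus one. A small sketch of that arithmetic follows, where `max_lane_index` is a hypothetical helper and not part of any NEON header.

```c
#include <assert.h>

/* Highest valid lane index for a vector of reg_bits bits that holds
   elem_bits-bit elements: one less than the lane count. */
int max_lane_index(int reg_bits, int elem_bits)
{
    assert(elem_bits > 0 && reg_bits >= elem_bits && reg_bits % elem_bits == 0);
    return reg_bits / elem_bits - 1;
}
```

For example, a 128-bit vector of 8-bit elements (16B) gives 15, and a 64-bit vector of 64-bit elements (1D) gives 0, so lane 0 is the only legal choice for the 1D variants.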

int64x1x2_t vld2_lane_s64 (int64_t const * ptr, int64x1x2_t src, const int lane)

Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.d - Vt2.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.1D 

src.val[0] → Vt.1D 

0 <= lane <= 0

Results

Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

int64x2x2_t vld2q_lane_s64 (int64_t const * ptr, int64x2x2_t src, const int lane)

Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.d - Vt2.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.2D 

src.val[0] → Vt.2D 

0 <= lane <= 1

Results

Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
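
Because this entry lists A64 as the only supported architecture, code that uses vld2q_lane_s64 may want an AArch64 guard. The sketch below assumes a hypothetical helper `replace_complex0` and a split real/imaginary layout. It uses the intrinsic when `__aarch64__` is defined and a scalar fallback with the same effect elsewhere.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: re[2] and im[2] hold two complex values in split form.
   Overwrite element 0 with the interleaved pair at pair[0], pair[1]. */
void replace_complex0(const int64_t *pair, int64_t re[2], int64_t im[2])
{
    assert(pair && re && im);
#if defined(__aarch64__)
    int64x2x2_t v;
    v.val[0] = vld1q_s64(re);
    v.val[1] = vld1q_s64(im);
    v = vld2q_lane_s64(pair, v, 0);   /* lane is a constant in the range 0..1 */
    vst1q_s64(re, v.val[0]);
    vst1q_s64(im, v.val[1]);
#else
    /* Scalar fallback for targets without the A64 instruction. */
    re[0] = pair[0];
    im[0] = pair[1];
#endif
}
```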

uint64x1x2_t vld2_lane_u64 (uint64_t const * ptr, uint64x1x2_t src, const int lane)

Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.d - Vt2.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.1D 

src.val[0] → Vt.1D 

0 <= lane <= 0

Results

Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

uint64x2x2_t vld2q_lane_u64 (uint64_t const * ptr, uint64x2x2_t src, const int lane)

Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.d - Vt2.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.2D 

src.val[0] → Vt.2D 

0 <= lane <= 1

Results

Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

poly64x1x2_t vld2_lane_p64 (poly64_t const * ptr, poly64x1x2_t src, const int lane)

Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.d - Vt2.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.1D 

src.val[0] → Vt.1D 

0 <= lane <= 0

Results

Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

poly64x2x2_t vld2q_lane_p64 (poly64_t const * ptr, poly64x2x2_t src, const int lane)

Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.d - Vt2.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.2D 

src.val[0] → Vt.2D 

0 <= lane <= 1

Results

Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

float64x1x2_t vld2_lane_f64 (float64_t const * ptr, float64x1x2_t src, const int lane)

Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.d - Vt2.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.1D 

src.val[0] → Vt.1D 

0 <= lane <= 0

Results

Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
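
Because lane 0 is the only valid lane for the one-lane `float64x1x2_t` form, both result registers are filled entirely from memory and nothing from `src` survives in the returned value. A minimal portable sketch of that behaviour follows; the `f64x1x2_model` struct and `model_vld2_lane_f64` function are hypothetical stand-ins written in plain C, not part of `arm_neon.h`.

```c
#include <assert.h>

/* Hypothetical stand-in for float64x1x2_t: two registers, one lane each. */
typedef struct { double val[2][1]; } f64x1x2_model;

/* Scalar model of vld2_lane_f64 (illustrative only, not the real intrinsic). */
static f64x1x2_model model_vld2_lane_f64(const double *ptr,
                                         f64x1x2_model src, int lane)
{
    assert(lane == 0);            /* only lane 0 exists in a one-lane vector */
    src.val[0][lane] = ptr[0];    /* first structure member  -> val[0] */
    src.val[1][lane] = ptr[1];    /* second structure member -> val[1] */
    return src;
}
```

With the real intrinsic the lane argument must be a compile-time constant; the model accepts a runtime `int` only to keep the sketch short.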

float64x2x2_t vld2q_lane_f64 (float64_t const * ptr, float64x2x2_t src, const int lane)

Load single 2-element structure to one lane of two registers

Description

A64 Instruction

LD2 {Vt.d - Vt2.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[1] → Vt2.2D 

src.val[0] → Vt.2D 

0 <= lane <= 1

Results

Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
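
One plausible use of `vld2q_lane_f64` is updating a single complex number held in split real/imaginary form from an interleaved (re, im) pair in memory. The sketch below is an assumption about usage, not something taken from this reference: the helper name `set_complex_lane1` is hypothetical, and a scalar fallback is included so the code also builds on targets without A64.

```c
#include <assert.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Overwrite element 1 of the split arrays re[2] / im[2] with the
   interleaved pair stored at z[0] (real) and z[1] (imaginary). */
static void set_complex_lane1(const double *z, double re[2], double im[2])
{
    assert(z && re && im);
#if defined(__aarch64__)
    float64x2x2_t v;
    v.val[0] = vld1q_f64(re);        /* current real parts      */
    v.val[1] = vld1q_f64(im);        /* current imaginary parts */
    v = vld2q_lane_f64(z, v, 1);     /* lane must be a constant: 0 or 1 */
    vst1q_f64(re, v.val[0]);
    vst1q_f64(im, v.val[1]);
#else
    re[1] = z[0];                    /* scalar equivalent of the lane load */
    im[1] = z[1];
#endif
}
```

Lane 0 of both arrays is left untouched, which matches the "insert into one lane" step of the Operation pseudocode.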

int16x4x3_t vld3_lane_s16 (int16_t const * ptr, int16x4x3_t src, const int lane)

Load single 3-element structure to one lane of three registers

Description

A64 Instruction

LD3 {Vt.h - Vt3.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.4H 

src.val[1] → Vt2.4H 

src.val[0] → Vt.4H 

0 <= lane <= 3

Results

Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
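
For the intrinsic, only the non-replicating load branch of the Operation pseudocode applies: for each of the three registers, one element is read at `address + offs`, inserted at the selected lane, and `offs` advances by `ebytes`. The following plain-C sketch mirrors those steps for `vld3_lane_s16`; the `s16x4x3_model` struct and `model_vld3_lane_s16` function are hypothetical names used for illustration, not real `arm_neon.h` identifiers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for int16x4x3_t: three registers of four lanes. */
typedef struct { int16_t val[3][4]; } s16x4x3_model;

/* Scalar walk-through of the load branch of the pseudocode (selem = 3). */
static s16x4x3_model model_vld3_lane_s16(const int16_t *ptr,
                                         s16x4x3_model src, int lane)
{
    const size_t ebytes = sizeof(int16_t);   /* esize DIV 8 */
    size_t offs = 0;                         /* offs = Zeros() */

    assert(lane >= 0 && lane <= 3);
    for (int s = 0; s < 3; ++s) {
        /* Elem[rval, index, esize] = Mem[address + offs, ebytes] */
        memcpy(&src.val[s][lane], (const unsigned char *)ptr + offs, ebytes);
        offs += ebytes;                      /* next structure member */
    }
    return src;                              /* other lanes are unchanged */
}
```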

int16x8x3_t vld3q_lane_s16 (int16_t const * ptr, int16x8x3_t src, const int lane)

Load single 3-element structure to one lane of three registers

Description

A64 Instruction

LD3 {Vt.h - Vt3.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.8H 

src.val[1] → Vt2.8H 

src.val[0] → Vt.8H 

0 <= lane <= 7

Results

Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int32x2x3_t vld3_lane_s32 (int32_t const * ptr, int32x2x3_t src, const int lane)

Load single 3-element structure to one lane of three registers

Description

A64 Instruction

LD3 {Vt.s - Vt3.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.2S 

src.val[1] → Vt2.2S 

src.val[0] → Vt.2S 

0 <= lane <= 1

Results

Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int32x4x3_t vld3q_lane_s32 (int32_t const * ptr, int32x4x3_t src, const int lane)

Load single 3-element structure to one lane of three registers

Description

A64 Instruction

LD3 {Vt.s - Vt3.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.4S 

src.val[1] → Vt2.4S 

src.val[0] → Vt.4S 

0 <= lane <= 3

Results

Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint16x4x3_t vld3_lane_u16 (uint16_t const * ptr, uint16x4x3_t src, const int lane)

Load single 3-element structure to one lane of three registers

Description

A64 Instruction

LD3 {Vt.h - Vt3.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.4H 

src.val[1] → Vt2.4H 

src.val[0] → Vt.4H 

0 <= lane <= 3

Results

Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint16x8x3_t vld3q_lane_u16 (uint16_t const * ptr, uint16x8x3_t src, const int lane)

Load single 3-element structure to one lane of three registers

Description

A64 Instruction

LD3 {Vt.h - Vt3.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.8H 

src.val[1] → Vt2.8H 

src.val[0] → Vt.8H 

0 <= lane <= 7

Results

Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint32x2x3_t vld3_lane_u32 (uint32_t const * ptr, uint32x2x3_t src, const int lane)

Load single 3-element structure to one lane of three registers

Description

A64 Instruction

LD3 {Vt.s - Vt3.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.2S 

src.val[1] → Vt2.2S 

src.val[0] → Vt.2S 

0 <= lane <= 1

Results

Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint32x4x3_t vld3q_lane_u32 (uint32_t const * ptr, uint32x4x3_t src, const int lane)

Load single 3-element structure to one lane of three registers

Description

A64 Instruction

LD3 {Vt.s - Vt3.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.4S 

src.val[1] → Vt2.4S 

src.val[0] → Vt.4S 

0 <= lane <= 3

Results

Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float16x4x3_t vld3_lane_f16 (float16_t const * ptr, float16x4x3_t src, const int lane)

Load single 3-element structure to one lane of three registers

Description

A64 Instruction

LD3 {Vt.h - Vt3.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.4H 

src.val[1] → Vt2.4H 

src.val[0] → Vt.4H 

0 <= lane <= 3

Results

Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float16x8x3_t vld3q_lane_f16 (float16_t const * ptr, float16x8x3_t src, const int lane)

Load single 3-element structure to one lane of three registers

Description

A64 Instruction

LD3 {Vt.h - Vt3.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.8H 

src.val[1] → Vt2.8H 

src.val[0] → Vt.8H 

0 <= lane <= 7

Results

Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float32x2x3_t vld3_lane_f32 (float32_t const * ptr, float32x2x3_t src, const int lane)

Load single 3-element structure to one lane of three registers

Description

A64 Instruction

LD3 {Vt.s - Vt3.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.2S 

src.val[1] → Vt2.2S 

src.val[0] → Vt.2S 

0 <= lane <= 1

Results

Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float32x4x3_t vld3q_lane_f32 (float32_t const * ptr, float32x4x3_t src, const int lane)

Load single 3-element structure to one lane of three registers

Description

A64 Instruction

LD3 {Vt.s - Vt3.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.4S 

src.val[1] → Vt2.4S 

src.val[0] → Vt.4S 

0 <= lane <= 3

Results

Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
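
A common pattern for lane loads is gathering structures from unrelated addresses into structure-of-arrays form, one lane per call. The sketch below gathers four xyz points with `vld3q_lane_f32`; the helper name `gather_xyz4` and the scalar fallback for non-NEON targets are assumptions added for illustration and are not part of this reference.

```c
#include <assert.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Gather four xyz triples from scattered addresses into x[4], y[4], z[4]. */
static void gather_xyz4(const float *p0, const float *p1,
                        const float *p2, const float *p3,
                        float x[4], float y[4], float z[4])
{
    assert(p0 && p1 && p2 && p3);
#if defined(__ARM_NEON)
    float32x4x3_t v;
    v.val[0] = vdupq_n_f32(0.0f);    /* defined starting contents */
    v.val[1] = vdupq_n_f32(0.0f);
    v.val[2] = vdupq_n_f32(0.0f);
    v = vld3q_lane_f32(p0, v, 0);    /* each call fills one lane of */
    v = vld3q_lane_f32(p1, v, 1);    /* all three registers         */
    v = vld3q_lane_f32(p2, v, 2);
    v = vld3q_lane_f32(p3, v, 3);
    vst1q_f32(x, v.val[0]);
    vst1q_f32(y, v.val[1]);
    vst1q_f32(z, v.val[2]);
#else
    const float *p[4] = { p0, p1, p2, p3 };
    for (int i = 0; i < 4; ++i) {    /* scalar equivalent */
        x[i] = p[i][0];
        y[i] = p[i][1];
        z[i] = p[i][2];
    }
#endif
}
```

Each call threads the previous result back in as `src`, so lanes loaded earlier are preserved. The lane arguments are literal constants because the intrinsic requires a compile-time value in the range 0 to 3.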

poly16x4x3_t vld3_lane_p16 (poly16_t const * ptr, poly16x4x3_t src, const int lane)

Load single 3-element structure to one lane of three registers

Description

A64 Instruction

LD3 {Vt.h - Vt3.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.4H 

src.val[1] → Vt2.4H 

src.val[0] → Vt.4H 

0 <= lane <= 3

Results

Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly16x8x3_t vld3q_lane_p16 (poly16_t const * ptr, poly16x8x3_t src, const int lane)

Load single 3-element structure to one lane of three registers

Description

A64 Instruction

LD3 {Vt.h - Vt3.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.8H 

src.val[1] → Vt2.8H 

src.val[0] → Vt.8H 

0 <= lane <= 7

Results

Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int8x8x3_t vld3_lane_s8 (int8_t const * ptr, int8x8x3_t src, const int lane)

Load single 3-element structure to one lane of three registers

Description

A64 Instruction

LD3 {Vt.b - Vt3.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.8B 

src.val[1] → Vt2.8B 

src.val[0] → Vt.8B 

0 <= lane <= 7

Results

Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint8x8x3_t vld3_lane_u8 (uint8_t const * ptr, uint8x8x3_t src, const int lane)

Load single 3-element structure to one lane of three registers

Description

A64 Instruction

LD3 {Vt.b - Vt3.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.8B 

src.val[1] → Vt2.8B 

src.val[0] → Vt.8B 

0 <= lane <= 7

Results

Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
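
As an illustration of the byte-sized form, the sketch below replaces one pixel in three planar 8-pixel channel buffers with a packed RGB triple using `vld3_lane_u8`. The scenario, the helper name `patch_pixel_lane3`, and the scalar fallback for targets without NEON are all assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Overwrite pixel 3 of the planar buffers r[8], g[8], b[8] with the
   packed triple rgb[0..2]; the other seven pixels are left unchanged. */
static void patch_pixel_lane3(const uint8_t *rgb,
                              uint8_t r[8], uint8_t g[8], uint8_t b[8])
{
    assert(rgb && r && g && b);
#if defined(__ARM_NEON)
    uint8x8x3_t v;
    v.val[0] = vld1_u8(r);           /* existing red plane   */
    v.val[1] = vld1_u8(g);           /* existing green plane */
    v.val[2] = vld1_u8(b);           /* existing blue plane  */
    v = vld3_lane_u8(rgb, v, 3);     /* lane is a constant in 0..7 */
    vst1_u8(r, v.val[0]);
    vst1_u8(g, v.val[1]);
    vst1_u8(b, v.val[2]);
#else
    r[3] = rgb[0];                   /* scalar equivalent */
    g[3] = rgb[1];
    b[3] = rgb[2];
#endif
}
```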

poly8x8x3_t vld3_lane_p8 (poly8_t const * ptr, poly8x8x3_t src, const int lane)

Load single 3-element structure to one lane of three registers

Description

A64 Instruction

LD3 {Vt.b - Vt3.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.8B 

src.val[1] → Vt2.8B 

src.val[0] → Vt.8B 

0 <= lane <= 7

Results

Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
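
The load branch of the pseudocode above can be sketched as a scalar model. This is not the intrinsic itself: `model_p8x8x3` and `model_ld3_lane_p8` are hypothetical names standing in for `poly8x8x3_t` and `vld3_lane_p8`, and the registers are modelled as plain byte arrays.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for poly8x8x3_t: three 64-bit registers of 8 bytes. */
typedef struct { uint8_t val[3][8]; } model_p8x8x3;

/* Scalar model of the load branch: element s of the structure at ptr goes to
 * lane `lane` of register s; every other lane is passed through from src. */
static model_p8x8x3 model_ld3_lane_p8(const uint8_t *ptr, model_p8x8x3 src,
                                      int lane)
{
    assert(lane >= 0 && lane <= 7);
    for (int s = 0; s < 3; s++) {
        src.val[s][lane] = ptr[s];
    }
    return src;
}
```

Only three bytes are read from `ptr`, and 21 of the 24 result lanes come unchanged from `src`.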

int8x16x3_t vld3q_lane_s8 (int8_t const * ptr, int8x16x3_t src, const int lane)Load single 3-element structure to one lane of three registers

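Description

Load single 3-element structure to one lane of three registers. This instruction loads a 3-element structure from memory and writes the result to the corresponding elements of the three SIMD&FP registers without affecting the other bits of the registers.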

A64 Instruction

LD3 {Vt.b - Vt3.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.16B 

src.val[1] → Vt2.16B 

src.val[0] → Vt.16B 

0 ≤ lane ≤ 15

Results

Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
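
The lane bounds in these entries follow from the register and element widths: a register holds (register bytes / element bytes) lanes, numbered from 0. A minimal sketch, where `max_lane` is a hypothetical helper and not part of `arm_neon.h`:

```c
#include <assert.h>

/* Hypothetical helper: highest valid lane index for a register of reg_bytes
 * bytes (8 for 64-bit forms, 16 for 128-bit "q" forms) holding elements of
 * elem_bytes bytes each. */
static int max_lane(int reg_bytes, int elem_bytes)
{
    assert(reg_bytes == 8 || reg_bytes == 16);
    assert(elem_bytes == 1 || elem_bytes == 2 ||
           elem_bytes == 4 || elem_bytes == 8);
    return reg_bytes / elem_bytes - 1;
}
```

This gives 7 for the 8B forms, 15 for 16B, 0 for 1D, and 1 for 2D, matching the ranges listed under Argument Preparation.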

uint8x16x3_t vld3q_lane_u8 (uint8_t const * ptr, uint8x16x3_t src, const int lane)Load single 3-element structure to one lane of three registers

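Description

Load single 3-element structure to one lane of three registers. This instruction loads a 3-element structure from memory and writes the result to the corresponding elements of the three SIMD&FP registers without affecting the other bits of the registers.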

A64 Instruction

LD3 {Vt.b - Vt3.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.16B 

src.val[1] → Vt2.16B 

src.val[0] → Vt.16B 

0 ≤ lane ≤ 15

Results

Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

poly8x16x3_t vld3q_lane_p8 (poly8_t const * ptr, poly8x16x3_t src, const int lane)Load single 3-element structure to one lane of three registers

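Description

Load single 3-element structure to one lane of three registers. This instruction loads a 3-element structure from memory and writes the result to the corresponding elements of the three SIMD&FP registers without affecting the other bits of the registers.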

A64 Instruction

LD3 {Vt.b - Vt3.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.16B 

src.val[1] → Vt2.16B 

src.val[0] → Vt.16B 

0 ≤ lane ≤ 15

Results

Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

int64x1x3_t vld3_lane_s64 (int64_t const * ptr, int64x1x3_t src, const int lane)Load single 3-element structure to one lane of three registers

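Description

Load single 3-element structure to one lane of three registers. This instruction loads a 3-element structure from memory and writes the result to the corresponding elements of the three SIMD&FP registers without affecting the other bits of the registers.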

A64 Instruction

LD3 {Vt.d - Vt3.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.1D 

src.val[1] → Vt2.1D 

src.val[0] → Vt.1D 

0 ≤ lane ≤ 0

Results

Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
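
With 1D registers the only valid lane is 0, so the lane load overwrites the single 64-bit element of each register and nothing from `src` survives in the result. A scalar sketch of that case, using the hypothetical names `model_s64x1x3` and `model_ld3_lane_s64` in place of the real type and intrinsic:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for int64x1x3_t: three registers of one lane each. */
typedef struct { int64_t val[3][1]; } model_s64x1x3;

static model_s64x1x3 model_ld3_lane_s64(const int64_t *ptr, model_s64x1x3 src,
                                        int lane)
{
    assert(lane == 0);               /* the only lane a 1D register has */
    for (int s = 0; s < 3; s++) {
        src.val[s][lane] = ptr[s];   /* replaces the whole register */
    }
    return src;
}
```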

int64x2x3_t vld3q_lane_s64 (int64_t const * ptr, int64x2x3_t src, const int lane)Load single 3-element structure to one lane of three registers

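Description

Load single 3-element structure to one lane of three registers. This instruction loads a 3-element structure from memory and writes the result to the corresponding elements of the three SIMD&FP registers without affecting the other bits of the registers.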

A64 Instruction

LD3 {Vt.d - Vt3.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.2D 

src.val[1] → Vt2.2D 

src.val[0] → Vt.2D 

0 ≤ lane ≤ 1

Results

Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
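
The load branch of the Operation pseudocode is the same for every element size, so it can be modelled once. The sketch below is a byte-level approximation under the assumption that a register is 16 bytes with lane 0 at the lowest offset; `model_regs` and `model_ld_lane` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register file: 32 registers of 128 bits, stored as bytes. */
typedef struct { uint8_t v[32][16]; } model_regs;

/* Scalar model of the load branch for any element size.
 * t: first register, selem: structure size (3 for LD3, 4 for LD4),
 * ebytes: element size in bytes, index: lane number. */
static void model_ld_lane(model_regs *regs, unsigned t, unsigned selem,
                          unsigned ebytes, unsigned index, const uint8_t *mem)
{
    assert(t < 32 && selem >= 1 && selem <= 4);
    assert(ebytes >= 1 && (index + 1) * ebytes <= 16);
    size_t offs = 0;
    for (unsigned s = 0; s < selem; s++) {
        /* Elem[rval, index, esize] = Mem[address + offs, ebytes] */
        memcpy(&regs->v[t][index * ebytes], mem + offs, ebytes);
        offs += ebytes;
        t = (t + 1) % 32;   /* register numbers wrap, as in the pseudocode */
    }
}
```

Starting at register 30 with `selem` = 3 writes registers 30, 31, and 0, which is the effect of the `MOD 32` step.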

uint64x1x3_t vld3_lane_u64 (uint64_t const * ptr, uint64x1x3_t src, const int lane)Load single 3-element structure to one lane of three registers

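Description

Load single 3-element structure to one lane of three registers. This instruction loads a 3-element structure from memory and writes the result to the corresponding elements of the three SIMD&FP registers without affecting the other bits of the registers.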

A64 Instruction

LD3 {Vt.d - Vt3.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.1D 

src.val[1] → Vt2.1D 

src.val[0] → Vt.1D 

0 ≤ lane ≤ 0

Results

Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

uint64x2x3_t vld3q_lane_u64 (uint64_t const * ptr, uint64x2x3_t src, const int lane)Load single 3-element structure to one lane of three registers

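Description

Load single 3-element structure to one lane of three registers. This instruction loads a 3-element structure from memory and writes the result to the corresponding elements of the three SIMD&FP registers without affecting the other bits of the registers.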

A64 Instruction

LD3 {Vt.d - Vt3.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.2D 

src.val[1] → Vt2.2D 

src.val[0] → Vt.2D 

0 ≤ lane ≤ 1

Results

Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

poly64x1x3_t vld3_lane_p64 (poly64_t const * ptr, poly64x1x3_t src, const int lane)Load single 3-element structure to one lane of three registers

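Description

Load single 3-element structure to one lane of three registers. This instruction loads a 3-element structure from memory and writes the result to the corresponding elements of the three SIMD&FP registers without affecting the other bits of the registers.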

A64 Instruction

LD3 {Vt.d - Vt3.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.1D 

src.val[1] → Vt2.1D 

src.val[0] → Vt.1D 

0 ≤ lane ≤ 0

Results

Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

poly64x2x3_t vld3q_lane_p64 (poly64_t const * ptr, poly64x2x3_t src, const int lane)Load single 3-element structure to one lane of three registers

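Description

Load single 3-element structure to one lane of three registers. This instruction loads a 3-element structure from memory and writes the result to the corresponding elements of the three SIMD&FP registers without affecting the other bits of the registers.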

A64 Instruction

LD3 {Vt.d - Vt3.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.2D 

src.val[1] → Vt2.2D 

src.val[0] → Vt.2D 

0 ≤ lane ≤ 1

Results

Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

float64x1x3_t vld3_lane_f64 (float64_t const * ptr, float64x1x3_t src, const int lane)Load single 3-element structure to one lane of three registers

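Description

Load single 3-element structure to one lane of three registers. This instruction loads a 3-element structure from memory and writes the result to the corresponding elements of the three SIMD&FP registers without affecting the other bits of the registers.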

A64 Instruction

LD3 {Vt.d - Vt3.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.1D 

src.val[1] → Vt2.1D 

src.val[0] → Vt.1D 

0 ≤ lane ≤ 0

Results

Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

float64x2x3_t vld3q_lane_f64 (float64_t const * ptr, float64x2x3_t src, const int lane)Load single 3-element structure to one lane of three registers

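Description

Load single 3-element structure to one lane of three registers. This instruction loads a 3-element structure from memory and writes the result to the corresponding elements of the three SIMD&FP registers without affecting the other bits of the registers.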

A64 Instruction

LD3 {Vt.d - Vt3.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[2] → Vt3.2D 

src.val[1] → Vt2.2D 

src.val[0] → Vt.2D 

0 ≤ lane ≤ 1

Results

Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
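
The pseudocode above has two branches: `replicate` writes the loaded element to every lane, while the lane form used by the `_lane` intrinsics changes a single lane. The contrast can be sketched for one 128-bit register of two `double` elements; `load_replicate` and `load_lane` are hypothetical names and the register is modelled as a plain array.

```c
#include <assert.h>

enum { LANES = 2 };  /* a 128-bit register holds two 64-bit elements */

/* replicate branch: every lane of the register receives the loaded element */
static void load_replicate(double reg[LANES], const double *elem)
{
    for (int i = 0; i < LANES; i++) {
        reg[i] = *elem;
    }
}

/* lane branch: only `lane` changes, the other lane keeps its old value */
static void load_lane(double reg[LANES], const double *elem, int lane)
{
    assert(lane >= 0 && lane < LANES);
    reg[lane] = *elem;
}
```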

int16x4x4_t vld4_lane_s16 (int16_t const * ptr, int16x4x4_t src, const int lane)Load single 4-element structure to one lane of four registers

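Description

Load single 4-element structure to one lane of four registers. This instruction loads a 4-element structure from memory and writes the result to the corresponding elements of the four SIMD&FP registers without affecting the other bits of the registers.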

A64 Instruction

LD4 {Vt.h - Vt4.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.4H 

src.val[2] → Vt3.4H 

src.val[1] → Vt2.4H 

src.val[0] → Vt.4H 

0 ≤ lane ≤ 3

Results

Vt4.4H → result.val[3]
Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
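
Calling the lane load once per lane gathers four consecutive 4-element structures into a de-interleaved layout, one structure field per register. The sketch below models this with scalar code; `model_s16x4x4`, `model_ld4_lane_s16`, and `gather4` are hypothetical names. With the real intrinsic, `lane` must be a compile-time constant, so the loop would have to be unrolled by hand.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for int16x4x4_t: four registers of four lanes. */
typedef struct { int16_t val[4][4]; } model_s16x4x4;

/* Scalar model of the LD4 lane load for 16-bit elements. */
static model_s16x4x4 model_ld4_lane_s16(const int16_t *ptr, model_s16x4x4 src,
                                        int lane)
{
    assert(lane >= 0 && lane <= 3);
    for (int s = 0; s < 4; s++) {
        src.val[s][lane] = ptr[s];
    }
    return src;
}

/* Gather four consecutive structures, one per lane. */
static model_s16x4x4 gather4(const int16_t *aos)
{
    model_s16x4x4 r = {{{0}}};
    for (int lane = 0; lane < 4; lane++) {
        r = model_ld4_lane_s16(aos + 4 * lane, r, lane);
    }
    return r;
}
```

After `gather4`, `val[k][j]` holds field `k` of structure `j`.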

int16x8x4_t vld4q_lane_s16 (int16_t const * ptr, int16x8x4_t src, const int lane)Load single 4-element structure to one lane of four registers

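Description

Load single 4-element structure to one lane of four registers. This instruction loads a 4-element structure from memory and writes the result to the corresponding elements of the four SIMD&FP registers without affecting the other bits of the registers.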

A64 Instruction

LD4 {Vt.h - Vt4.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.8H 

src.val[2] → Vt3.8H 

src.val[1] → Vt2.8H 

src.val[0] → Vt.8H 

0 ≤ lane ≤ 7

Results

Vt4.8H → result.val[3]
Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int32x2x4_t vld4_lane_s32 (int32_t const * ptr, int32x2x4_t src, const int lane)Load single 4-element structure to one lane of four registers

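Description

Load single 4-element structure to one lane of four registers. This instruction loads a 4-element structure from memory and writes the result to the corresponding elements of the four SIMD&FP registers without affecting the other bits of the registers.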

A64 Instruction

LD4 {Vt.s - Vt4.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.2S 

src.val[2] → Vt3.2S 

src.val[1] → Vt2.2S 

src.val[0] → Vt.2S 

0 ≤ lane ≤ 1

Results

Vt4.2S → result.val[3]
Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
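
Because the pseudocode reads `selem` consecutive elements starting at `ptr`, a caller walking a buffer needs a full structure to remain before each load. A minimal bounds check might look like the following; `structure_in_bounds` is a hypothetical helper, not part of any Arm header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical guard: true when a structure of `selem` elements starting at
 * element index `pos` lies entirely inside a buffer of `len` elements. */
static bool structure_in_bounds(size_t len, size_t pos, size_t selem)
{
    assert(selem >= 1 && selem <= 4);
    return pos <= len && len - pos >= selem;
}
```

For an LD4 lane load on `int32_t` data, `selem` is 4, so 16 readable bytes are required at `ptr`.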

int32x4x4_t vld4q_lane_s32 (int32_t const * ptr, int32x4x4_t src, const int lane)Load single 4-element structure to one lane of four registers

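Description

Load single 4-element structure to one lane of four registers. This instruction loads a 4-element structure from memory and writes the result to the corresponding elements of the four SIMD&FP registers without affecting the other bits of the registers.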

A64 Instruction

LD4 {Vt.s - Vt4.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.4S 

src.val[2] → Vt3.4S 

src.val[1] → Vt2.4S 

src.val[0] → Vt.4S 

0 ≤ lane ≤ 3

Results

Vt4.4S → result.val[3]
Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint16x4x4_t vld4_lane_u16 (uint16_t const * ptr, uint16x4x4_t src, const int lane)Load single 4-element structure to one lane of four registers

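Description

Load single 4-element structure to one lane of four registers. This instruction loads a 4-element structure from memory and writes the result to the corresponding elements of the four SIMD&FP registers without affecting the other bits of the registers.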

A64 Instruction

LD4 {Vt.h - Vt4.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.4H 

src.val[2] → Vt3.4H 

src.val[1] → Vt2.4H 

src.val[0] → Vt.4H 

0 ≤ lane ≤ 3

Results

Vt4.4H → result.val[3]
Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint16x8x4_t vld4q_lane_u16 (uint16_t const * ptr, uint16x8x4_t src, const int lane)Load single 4-element structure to one lane of four registers

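Description

Load single 4-element structure to one lane of four registers. This instruction loads a 4-element structure from memory and writes the result to the corresponding elements of the four SIMD&FP registers without affecting the other bits of the registers.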

A64 Instruction

LD4 {Vt.h - Vt4.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.8H 

src.val[2] → Vt3.8H 

src.val[1] → Vt2.8H 

src.val[0] → Vt.8H 

0 ≤ lane ≤ 7

Results

Vt4.8H → result.val[3]
Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint32x2x4_t vld4_lane_u32 (uint32_t const * ptr, uint32x2x4_t src, const int lane)Load single 4-element structure to one lane of four registers

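Description

Load single 4-element structure to one lane of four registers. This instruction loads a 4-element structure from memory and writes the result to the corresponding elements of the four SIMD&FP registers without affecting the other bits of the registers.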

A64 Instruction

LD4 {Vt.s - Vt4.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.2S 

src.val[2] → Vt3.2S 

src.val[1] → Vt2.2S 

src.val[0] → Vt.2S 

0 ≤ lane ≤ 1

Results

Vt4.2S → result.val[3]
Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
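
The load branch of the pseudocode above can be read as a plain scalar loop. The following is a minimal sketch of that reading, not the real intrinsic: `model_u32x2x4` and `model_vld4_lane_u32` are hypothetical names introduced here, and the struct only stands in for `uint32x2x4_t` (four 64-bit registers of two 32-bit lanes each).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for uint32x2x4_t: val[s][i] is lane i of register s. */
typedef struct {
    uint32_t val[4][2];
} model_u32x2x4;

/* Scalar model of the MemOp_LOAD branch for 32-bit elements:
   element s is read from ptr[s] (address + s * ebytes) and inserted into
   lane `lane` of register s; every other lane keeps its value from src. */
static model_u32x2x4 model_vld4_lane_u32(const uint32_t *ptr,
                                         model_u32x2x4 src, int lane)
{
    assert(lane >= 0 && lane <= 1);   /* 0 <= lane <= 1 for two-lane vectors */
    for (int s = 0; s < 4; s++) {     /* selem == 4 for LD4 */
        src.val[s][lane] = ptr[s];
    }
    return src;
}
```

In this model, only one lane per register changes, which matches the way `src` is passed in and returned as `result` in the argument mapping above.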

uint32x4x4_t vld4q_lane_u32 (uint32_t const * ptr, uint32x4x4_t src, const int lane)Load single 4-element structure to one lane of four registers

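Description

Load single 4-element structure to one lane of four registers. Per the Operation pseudocode, this loads four consecutive elements from memory and writes each one to the selected lane of the corresponding register, leaving the other lanes unchanged.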

A64 Instruction

LD4 {Vt.s - Vt4.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.4S 

src.val[2] → Vt3.4S 

src.val[1] → Vt2.4S 

src.val[0] → Vt.4S 

0 <= lane <= 3

Results

Vt4.4S → result.val[3]
Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float16x4x4_t vld4_lane_f16 (float16_t const * ptr, float16x4x4_t src, const int lane)Load single 4-element structure to one lane of four registers

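Description

Load single 4-element structure to one lane of four registers. Per the Operation pseudocode, this loads four consecutive elements from memory and writes each one to the selected lane of the corresponding register, leaving the other lanes unchanged.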

A64 Instruction

LD4 {Vt.h - Vt4.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.4H 

src.val[2] → Vt3.4H 

src.val[1] → Vt2.4H 

src.val[0] → Vt.4H 

0 <= lane <= 3

Results

Vt4.4H → result.val[3]
Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float16x8x4_t vld4q_lane_f16 (float16_t const * ptr, float16x8x4_t src, const int lane)Load single 4-element structure to one lane of four registers

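Description

Load single 4-element structure to one lane of four registers. Per the Operation pseudocode, this loads four consecutive elements from memory and writes each one to the selected lane of the corresponding register, leaving the other lanes unchanged.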

A64 Instruction

LD4 {Vt.h - Vt4.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.8H 

src.val[2] → Vt3.8H 

src.val[1] → Vt2.8H 

src.val[0] → Vt.8H 

0 <= lane <= 7

Results

Vt4.8H → result.val[3]
Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float32x2x4_t vld4_lane_f32 (float32_t const * ptr, float32x2x4_t src, const int lane)Load single 4-element structure to one lane of four registers

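Description

Load single 4-element structure to one lane of four registers. Per the Operation pseudocode, this loads four consecutive elements from memory and writes each one to the selected lane of the corresponding register, leaving the other lanes unchanged.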

A64 Instruction

LD4 {Vt.s - Vt4.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.2S 

src.val[2] → Vt3.2S 

src.val[1] → Vt2.2S 

src.val[0] → Vt.2S 

0 <= lane <= 1

Results

Vt4.2S → result.val[3]
Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

float32x4x4_t vld4q_lane_f32 (float32_t const * ptr, float32x4x4_t src, const int lane)Load single 4-element structure to one lane of four registers

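Description

Load single 4-element structure to one lane of four registers. Per the Operation pseudocode, this loads four consecutive elements from memory and writes each one to the selected lane of the corresponding register, leaving the other lanes unchanged.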

A64 Instruction

LD4 {Vt.s - Vt4.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.4S 

src.val[2] → Vt3.4S 

src.val[1] → Vt2.4S 

src.val[0] → Vt.4S 

0 <= lane <= 3

Results

Vt4.4S → result.val[3]
Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
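
As a usage illustration, the sketch below overwrites one lane of four planar float vectors with a single interleaved pixel. The helper name `set_pixel_lane2` and the RGBA layout are assumptions made for this example. When `__ARM_NEON` is defined it calls `vld1q_f32`, `vld4q_lane_f32` and `vst1q_f32`; otherwise it falls back to scalar code that is assumed to give the same observable result.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: overwrite lane 2 of four planar float vectors
   (r, g, b, a, four floats each) with one interleaved pixel {r, g, b, a}. */
static void set_pixel_lane2(float *r, float *g, float *b, float *a,
                            const float *pixel)
{
    assert(r != NULL && g != NULL && b != NULL && a != NULL && pixel != NULL);
#if defined(__ARM_NEON)
    float32x4x4_t planes;
    planes.val[0] = vld1q_f32(r);
    planes.val[1] = vld1q_f32(g);
    planes.val[2] = vld1q_f32(b);
    planes.val[3] = vld1q_f32(a);
    /* The lane argument must be a compile-time constant in 0..3. */
    planes = vld4q_lane_f32(pixel, planes, 2);
    vst1q_f32(r, planes.val[0]);
    vst1q_f32(g, planes.val[1]);
    vst1q_f32(b, planes.val[2]);
    vst1q_f32(a, planes.val[3]);
#else
    /* Scalar fallback: element s goes to lane 2 of plane s. */
    r[2] = pixel[0];
    g[2] = pixel[1];
    b[2] = pixel[2];
    a[2] = pixel[3];
#endif
}
```

Passing the existing vectors as `src` is what preserves lanes 0, 1 and 3. The intrinsic only replaces the selected lane in each of the four registers.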

poly16x4x4_t vld4_lane_p16 (poly16_t const * ptr, poly16x4x4_t src, const int lane)Load single 4-element structure to one lane of four registers

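Description

Load single 4-element structure to one lane of four registers. Per the Operation pseudocode, this loads four consecutive elements from memory and writes each one to the selected lane of the corresponding register, leaving the other lanes unchanged.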

A64 Instruction

LD4 {Vt.h - Vt4.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.4H 

src.val[2] → Vt3.4H 

src.val[1] → Vt2.4H 

src.val[0] → Vt.4H 

0 <= lane <= 3

Results

Vt4.4H → result.val[3]
Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly16x8x4_t vld4q_lane_p16 (poly16_t const * ptr, poly16x8x4_t src, const int lane)Load single 4-element structure to one lane of four registers

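Description

Load single 4-element structure to one lane of four registers. Per the Operation pseudocode, this loads four consecutive elements from memory and writes each one to the selected lane of the corresponding register, leaving the other lanes unchanged.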

A64 Instruction

LD4 {Vt.h - Vt4.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.8H 

src.val[2] → Vt3.8H 

src.val[1] → Vt2.8H 

src.val[0] → Vt.8H 

0 <= lane <= 7

Results

Vt4.8H → result.val[3]
Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int8x8x4_t vld4_lane_s8 (int8_t const * ptr, int8x8x4_t src, const int lane)Load single 4-element structure to one lane of four registers

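Description

Load single 4-element structure to one lane of four registers. Per the Operation pseudocode, this loads four consecutive elements from memory and writes each one to the selected lane of the corresponding register, leaving the other lanes unchanged.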

A64 Instruction

LD4 {Vt.b - Vt4.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.8B 

src.val[2] → Vt3.8B 

src.val[1] → Vt2.8B 

src.val[0] → Vt.8B 

0 <= lane <= 7

Results

Vt4.8B → result.val[3]
Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

uint8x8x4_t vld4_lane_u8 (uint8_t const * ptr, uint8x8x4_t src, const int lane)Load single 4-element structure to one lane of four registers

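Description

Load single 4-element structure to one lane of four registers. Per the Operation pseudocode, this loads four consecutive elements from memory and writes each one to the selected lane of the corresponding register, leaving the other lanes unchanged.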

A64 Instruction

LD4 {Vt.b - Vt4.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.8B 

src.val[2] → Vt3.8B 

src.val[1] → Vt2.8B 

src.val[0] → Vt.8B 

0 <= lane <= 7

Results

Vt4.8B → result.val[3]
Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

poly8x8x4_t vld4_lane_p8 (poly8_t const * ptr, poly8x8x4_t src, const int lane)Load single 4-element structure to one lane of four registers

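Description

Load single 4-element structure to one lane of four registers. Per the Operation pseudocode, this loads four consecutive elements from memory and writes each one to the selected lane of the corresponding register, leaving the other lanes unchanged.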

A64 Instruction

LD4 {Vt.b - Vt4.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.8B 

src.val[2] → Vt3.8B 

src.val[1] → Vt2.8B 

src.val[0] → Vt.8B 

0 <= lane <= 7

Results

Vt4.8B → result.val[3]
Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

int8x16x4_t vld4q_lane_s8 (int8_t const * ptr, int8x16x4_t src, const int lane)Load single 4-element structure to one lane of four registers

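Description

Load single 4-element structure to one lane of four registers. Per the Operation pseudocode, this loads four consecutive elements from memory and writes each one to the selected lane of the corresponding register, leaving the other lanes unchanged.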

A64 Instruction

LD4 {Vt.b - Vt4.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.16B 

src.val[2] → Vt3.16B 

src.val[1] → Vt2.16B 

src.val[0] → Vt.16B 

0 <= lane <= 15

Results

Vt4.16B → result.val[3]
Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

uint8x16x4_t vld4q_lane_u8 (uint8_t const * ptr, uint8x16x4_t src, const int lane)Load single 4-element structure to one lane of four registers

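Description

Load single 4-element structure to one lane of four registers. Per the Operation pseudocode, this loads four consecutive elements from memory and writes each one to the selected lane of the corresponding register, leaving the other lanes unchanged.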

A64 Instruction

LD4 {Vt.b - Vt4.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.16B 

src.val[2] → Vt3.16B 

src.val[1] → Vt2.16B 

src.val[0] → Vt.16B 

0 <= lane <= 15

Results

Vt4.16B → result.val[3]
Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

poly8x16x4_t vld4q_lane_p8 (poly8_t const * ptr, poly8x16x4_t src, const int lane)Load single 4-element structure to one lane of four registers

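Description

Load single 4-element structure to one lane of four registers. Per the Operation pseudocode, this loads four consecutive elements from memory and writes each one to the selected lane of the corresponding register, leaving the other lanes unchanged.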

A64 Instruction

LD4 {Vt.b - Vt4.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.16B 

src.val[2] → Vt3.16B 

src.val[1] → Vt2.16B 

src.val[0] → Vt.16B 

0 <= lane <= 15

Results

Vt4.16B → result.val[3]
Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

int64x1x4_t vld4_lane_s64 (int64_t const * ptr, int64x1x4_t src, const int lane)Load single 4-element structure to one lane of four registers

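Description

Load single 4-element structure to one lane of four registers. Per the Operation pseudocode, this loads four consecutive elements from memory and writes each one to the selected lane of the corresponding register, leaving the other lanes unchanged.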

A64 Instruction

LD4 {Vt.d - Vt4.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.1D 

src.val[2] → Vt3.1D 

src.val[1] → Vt2.1D 

src.val[0] → Vt.1D 

0 <= lane <= 0

Results

Vt4.1D → result.val[3]
Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
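
The lane bounds listed across these entries follow from the register width divided by the element width. The helper below is a small sketch of that arithmetic; `max_lane_index` is a hypothetical name and is not part of any Arm header.

```c
#include <assert.h>

/* Largest valid lane index for a vector of reg_bits bits (64 for D-register
   types, 128 for Q-register types) holding elem_bits-wide elements. */
static int max_lane_index(int reg_bits, int elem_bits)
{
    assert(reg_bits == 64 || reg_bits == 128);
    assert(elem_bits == 8 || elem_bits == 16 ||
           elem_bits == 32 || elem_bits == 64);
    return reg_bits / elem_bits - 1;
}
```

For example, a 64-bit vector with one 64-bit element (`int64x1x4_t`) gives a maximum index of 0, so `lane` can only be 0. A 128-bit vector of 8-bit elements (`int8x16x4_t`) gives 15.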

int64x2x4_t vld4q_lane_s64 (int64_t const * ptr, int64x2x4_t src, const int lane)Load single 4-element structure to one lane of four registers

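Description

Load single 4-element structure to one lane of four registers. Per the Operation pseudocode, this loads four consecutive elements from memory and writes each one to the selected lane of the corresponding register, leaving the other lanes unchanged.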

A64 Instruction

LD4 {Vt.d - Vt4.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.2D 

src.val[2] → Vt3.2D 

src.val[1] → Vt2.2D 

src.val[0] → Vt.2D 

0 <= lane <= 1

Results

Vt4.2D → result.val[3]
Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

uint64x1x4_t vld4_lane_u64 (uint64_t const * ptr, uint64x1x4_t src, const int lane)Load single 4-element structure to one lane of four registers

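Description

Load single 4-element structure to one lane of four registers. Per the Operation pseudocode, this loads four consecutive elements from memory and writes each one to the selected lane of the corresponding register, leaving the other lanes unchanged.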

A64 Instruction

LD4 {Vt.d - Vt4.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.1D 

src.val[2] → Vt3.1D 

src.val[1] → Vt2.1D 

src.val[0] → Vt.1D 

0 <= lane <= 0

Results

Vt4.1D → result.val[3]
Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

uint64x2x4_t vld4q_lane_u64 (uint64_t const * ptr, uint64x2x4_t src, const int lane)Load single 4-element structure to one lane of four registers

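Description

Load single 4-element structure to one lane of four registers. Per the Operation pseudocode, this loads four consecutive elements from memory and writes each one to the selected lane of the corresponding register, leaving the other lanes unchanged.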

A64 Instruction

LD4 {Vt.d - Vt4.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.2D 

src.val[2] → Vt3.2D 

src.val[1] → Vt2.2D 

src.val[0] → Vt.2D 

0 <= lane <= 1

Results

Vt4.2D → result.val[3]
Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

poly64x1x4_t vld4_lane_p64 (poly64_t const * ptr, poly64x1x4_t src, const int lane)Load single 4-element structure to one lane of four registers

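Description

Load single 4-element structure to one lane of four registers. Per the Operation pseudocode, this loads four consecutive elements from memory and writes each one to the selected lane of the corresponding register, leaving the other lanes unchanged.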

A64 Instruction

LD4 {Vt.d - Vt4.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.1D 

src.val[2] → Vt3.1D 

src.val[1] → Vt2.1D 

src.val[0] → Vt.1D 

0 <= lane <= 0

Results

Vt4.1D → result.val[3]
Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;
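
For the lane-load intrinsics, only the non-replicating `MemOp_LOAD` branch of the pseudocode above is relevant. The following is a minimal scalar sketch of that branch in portable C, not Arm's implementation. The name `ld4_lane_model` and the layout `regs[4][2]` (four registers, each modelled as two 64-bit elements) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of the LD4 (single structure) load path for 64-bit elements.
 * regs[s] stands in for register V[t + s]. Only element `index` of each
 * register is overwritten; the other element keeps its previous value. */
void ld4_lane_model(const uint64_t *mem, uint64_t regs[4][2], int index)
{
    assert(index >= 0 && index < 2);
    for (int s = 0; s < 4; ++s) {   /* selem = 4 */
        regs[s][index] = mem[s];    /* Elem[rval, index, esize] = Mem[address + offs] */
    }
}
```

Calling the model with `index = 1` replaces the upper element of each of the four registers with consecutive values from memory and leaves the lower elements as they were.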

Supported architectures

A64

poly64x2x4_t vld4q_lane_p64 (poly64_t const * ptr, poly64x2x4_t src, const int lane)

Load single 4-element structure to one lane of four registers

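Description

Load single 4-element structure to one lane of four registers. This instruction loads a 4-element structure from memory and writes the result to the corresponding elements of the four SIMD&FP registers without affecting the other bits of the registers.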

A64 Instruction

LD4 {Vt.d - Vt4.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.2D 

src.val[2] → Vt3.2D 

src.val[1] → Vt2.2D 

src.val[0] → Vt.2D 

0 <= lane <= 1

Results

Vt4.2D → result.val[3]
Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;
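
The pseudocode above is shared with the load-and-replicate encodings, which take the `replicate` branch rather than the lane branch. For contrast with the lane form, here is a rough scalar sketch of that branch for 64-bit elements in 128-bit registers. The function name `ld_replicate_model` and the `regs[4][2]` layout are illustrative assumptions, not part of the reference.

```c
#include <stdint.h>

/* Scalar model of the `replicate` branch: each loaded element fills every
 * lane of its destination register (datasize DIV esize = 128 / 64 = 2). */
void ld_replicate_model(const uint64_t *mem, uint64_t regs[4][2])
{
    for (int s = 0; s < 4; ++s) {        /* selem = 4 */
        for (int e = 0; e < 2; ++e) {
            regs[s][e] = mem[s];         /* V[t] = Replicate(element, 2) */
        }
    }
}
```

Unlike the lane form, the previous contents of the destination registers are discarded entirely.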

Supported architectures

A64

float64x1x4_t vld4_lane_f64 (float64_t const * ptr, float64x1x4_t src, const int lane)

Load single 4-element structure to one lane of four registers

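Description

Load single 4-element structure to one lane of four registers. This instruction loads a 4-element structure from memory and writes the result to the corresponding elements of the four SIMD&FP registers without affecting the other bits of the registers.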

A64 Instruction

LD4 {Vt.d - Vt4.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.1D 

src.val[2] → Vt3.1D 

src.val[1] → Vt2.1D 

src.val[0] → Vt.1D 

0 <= lane <= 0

Results

Vt4.1D → result.val[3]
Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

float64x2x4_t vld4q_lane_f64 (float64_t const * ptr, float64x2x4_t src, const int lane)

Load single 4-element structure to one lane of four registers

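Description

Load single 4-element structure to one lane of four registers. This instruction loads a 4-element structure from memory and writes the result to the corresponding elements of the four SIMD&FP registers without affecting the other bits of the registers.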

A64 Instruction

LD4 {Vt.d - Vt4.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

src.val[3] → Vt4.2D 

src.val[2] → Vt3.2D 

src.val[1] → Vt2.2D 

src.val[0] → Vt.2D 

0 <= lane <= 1

Results

Vt4.2D → result.val[3]
Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]
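
As an illustration of the argument and result mapping above, the sketch below loads four consecutive doubles into element 1 of four vectors. It is a hedged example rather than canonical usage: the helper name `load_quad_into_lane1` and the `regs[4][2]` staging array are assumptions, and a scalar fallback with the same observable effect is used when the code is not built for AArch64 with Advanced SIMD.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: load src[0..3] into element 1 of the four
 * 2-element vectors stored in regs[0..3]; element 0 is preserved. */
void load_quad_into_lane1(const double *src, double regs[4][2])
{
    assert(src != NULL && regs != NULL);
#if defined(__aarch64__) && defined(__ARM_NEON)
    float64x2x4_t v;
    for (int i = 0; i < 4; ++i) {
        v.val[i] = vld1q_f64(regs[i]);   /* src.val[i] -> Vt..Vt4 */
    }
    v = vld4q_lane_f64(src, v, 1);       /* lane is a constant in 0..1 */
    for (int i = 0; i < 4; ++i) {
        vst1q_f64(regs[i], v.val[i]);    /* Vt..Vt4 -> result.val[i] */
    }
#else
    /* Portable fallback (assumption: same effect as the intrinsic path). */
    for (int i = 0; i < 4; ++i) {
        regs[i][1] = src[i];
    }
#endif
}
```

The `src` argument exists because the instruction only touches one element per register; the remaining elements of the result come from `src` unchanged.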

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst2_lane_s8 (int8_t * ptr, int8x8x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.
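
A small usage sketch may make the description concrete: the element at one lane of each of two vectors is written to two adjacent bytes. The helper name `store_pair_lane3` is hypothetical, and the scalar fallback is an assumption that mirrors the intrinsic's effect on targets without NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: write a[3] to dst[0] and b[3] to dst[1]. */
void store_pair_lane3(int8_t *dst, const int8_t a[8], const int8_t b[8])
{
    assert(dst != NULL && a != NULL && b != NULL);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    int8x8x2_t v;
    v.val[0] = vld1_s8(a);
    v.val[1] = vld1_s8(b);
    vst2_lane_s8(dst, v, 3);    /* stores exactly two bytes */
#else
    /* Portable fallback (assumption: same effect as the intrinsic path). */
    dst[0] = a[3];
    dst[1] = b[3];
#endif
}
```

Only two bytes at `dst` are written; the other seven lanes of each vector do not reach memory.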

A64 Instruction

ST2 {Vt.b - Vt2.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;
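
For the lane-store intrinsics, the `MemOp_STORE` branch of the pseudocode above is the one that applies. A minimal scalar sketch of that branch follows. It is an illustration under stated assumptions, not Arm's implementation: the function name `st_lane_model` and the representation of each register as a byte pointer are both invented here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of the MemOp_STORE branch. regs[s] points at the bytes of
 * register V[t + s]; ebytes is the element size in bytes. One element per
 * register is copied to consecutive memory locations. */
void st_lane_model(uint8_t *mem, const uint8_t *const regs[], int selem,
                   int index, size_t ebytes)
{
    assert(selem >= 1 && selem <= 4 && index >= 0);
    size_t offs = 0;
    for (int s = 0; s < selem; ++s) {
        /* Mem[address + offs] = Elem[rval, index, esize] */
        memcpy(mem + offs, regs[s] + (size_t)index * ebytes, ebytes);
        offs += ebytes;
    }
}
```

With `selem = 2` and `ebytes = 1` this corresponds to the byte-wide two-register store; memory beyond `selem * ebytes` bytes is left untouched.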

Supported architectures

v7/A32/A64

void vst2_lane_u8 (uint8_t * ptr, uint8x8x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.b - Vt2.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B 

0 <= lane <= 7
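
The `lane` argument must be an integer constant expression between 0 and 7, so a lane index that is only known at run time needs explicit dispatch. One possible pattern is sketched below. The helper name `store_pair_runtime_lane` and its return convention are assumptions for illustration, and the scalar fallback is used where NEON is unavailable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: store lane `lane` of a and b to dst[0] and dst[1].
 * Returns 0 on success, or -1 if lane is outside 0..7. */
int store_pair_runtime_lane(uint8_t *dst, const uint8_t a[8],
                            const uint8_t b[8], int lane)
{
    assert(dst != NULL && a != NULL && b != NULL);
    if (lane < 0 || lane > 7) {
        return -1;
    }
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint8x8x2_t v;
    v.val[0] = vld1_u8(a);
    v.val[1] = vld1_u8(b);
    /* Each case passes a literal so the lane stays a compile-time constant. */
    switch (lane) {
    case 0: vst2_lane_u8(dst, v, 0); break;
    case 1: vst2_lane_u8(dst, v, 1); break;
    case 2: vst2_lane_u8(dst, v, 2); break;
    case 3: vst2_lane_u8(dst, v, 3); break;
    case 4: vst2_lane_u8(dst, v, 4); break;
    case 5: vst2_lane_u8(dst, v, 5); break;
    case 6: vst2_lane_u8(dst, v, 6); break;
    default: vst2_lane_u8(dst, v, 7); break;
    }
#else
    /* Portable fallback (assumption: same effect as the intrinsic path). */
    dst[0] = a[lane];
    dst[1] = b[lane];
#endif
    return 0;
}
```

When the lane is known at compile time, calling the intrinsic directly avoids the branch.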

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2_lane_p8 (poly8_t * ptr, poly8x8x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.b - Vt2.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3_lane_s8 (int8_t * ptr, int8x8x3_t val, const int lane)

Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.b - Vt3.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.8B 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3_lane_u8 (uint8_t * ptr, uint8x8x3_t val, const int lane)

Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.
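
A typical use of a 3-element lane store is writing one interleaved pixel from planar channel data. The sketch below assumes three 8-byte planes named `r`, `g` and `b`. These names and the helper `store_rgb_pixel2` are hypothetical, and the scalar fallback stands in for the intrinsic where NEON is unavailable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: write pixel 2 of three planar channels as the
 * interleaved triple dst[0..2] = { r[2], g[2], b[2] }. */
void store_rgb_pixel2(uint8_t *dst, const uint8_t r[8], const uint8_t g[8],
                      const uint8_t b[8])
{
    assert(dst != NULL && r != NULL && g != NULL && b != NULL);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint8x8x3_t v;
    v.val[0] = vld1_u8(r);
    v.val[1] = vld1_u8(g);
    v.val[2] = vld1_u8(b);
    vst3_lane_u8(dst, v, 2);    /* stores exactly three bytes */
#else
    /* Portable fallback (assumption: same effect as the intrinsic path). */
    dst[0] = r[2];
    dst[1] = g[2];
    dst[2] = b[2];
#endif
}
```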

A64 Instruction

ST3 {Vt.b - Vt3.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.8B 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3_lane_p8 (poly8_t * ptr, poly8x8x3_t val, const int lane)

Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.b - Vt3.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.8B 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4_lane_s8 (int8_t * ptr, int8x8x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.b - Vt4.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.8B 

val.val[2] → Vt3.8B 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4_lane_u8 (uint8_t * ptr, uint8x8x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.
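
The 4-element form writes four consecutive bytes and nothing more. The following hedged sketch stores lane 0 of four channel vectors and relies on the caller to check that a trailing byte is untouched. The helper name `store_rgba_pixel0` and the channel arrays are illustrative assumptions, with a scalar fallback for targets without NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: write lane 0 of four channel vectors to dst[0..3]. */
void store_rgba_pixel0(uint8_t *dst, const uint8_t ch[4][8])
{
    assert(dst != NULL && ch != NULL);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint8x8x4_t v;
    for (int i = 0; i < 4; ++i) {
        v.val[i] = vld1_u8(ch[i]);
    }
    vst4_lane_u8(dst, v, 0);    /* stores exactly four bytes */
#else
    /* Portable fallback (assumption: same effect as the intrinsic path). */
    for (int i = 0; i < 4; ++i) {
        dst[i] = ch[i][0];
    }
#endif
}
```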

A64 Instruction

ST4 {Vt.b - Vt4.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.8B 

val.val[2] → Vt3.8B 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4_lane_p8 (poly8_t * ptr, poly8x8x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.b - Vt4.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.8B 

val.val[2] → Vt3.8B 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2_lane_s16 (int16_t * ptr, int16x4x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.h - Vt2.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H 

0 <= lane <= 3
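
The upper bound on `lane` follows from the vector shape: it is the number of elements in one register minus one. A tiny helper makes the arithmetic explicit; the name `max_lane_index` is an assumption introduced for this sketch.

```c
#include <assert.h>

/* Largest valid lane index for a register of reg_bits bits (64 or 128)
 * holding elements of elem_bits bits (8, 16, 32 or 64). */
int max_lane_index(int reg_bits, int elem_bits)
{
    assert(elem_bits > 0 && reg_bits >= elem_bits);
    return reg_bits / elem_bits - 1;
}
```

For example, a 64-bit register of 16-bit elements (`4H`) gives 64 / 16 - 1 = 3, while the 128-bit `8H` form gives 7.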

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2q_lane_s16 (int16_t * ptr, int16x8x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.h - Vt2.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2_lane_s32 (int32_t * ptr, int32x2x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.s - Vt2.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2q_lane_s32 (int32_t * ptr, int32x4x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.s - Vt2.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2_lane_u16 (uint16_t * ptr, uint16x4x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.h - Vt2.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2q_lane_u16 (uint16_t * ptr, uint16x8x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.
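
The `q` form works on 128-bit vectors, so with 16-bit elements eight lanes are addressable. The sketch below stores lane 5 of two such vectors. The helper name `store_pair_lane5_u16` is hypothetical, and the scalar fallback is an assumption that mirrors the intrinsic on targets without NEON.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: write a[5] to dst[0] and b[5] to dst[1]. */
void store_pair_lane5_u16(uint16_t *dst, const uint16_t a[8],
                          const uint16_t b[8])
{
    assert(dst != NULL && a != NULL && b != NULL);
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint16x8x2_t v;
    v.val[0] = vld1q_u16(a);
    v.val[1] = vld1q_u16(b);
    vst2q_lane_u16(dst, v, 5);  /* stores two 16-bit elements (four bytes) */
#else
    /* Portable fallback (assumption: same effect as the intrinsic path). */
    dst[0] = a[5];
    dst[1] = b[5];
#endif
}
```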

A64 Instruction

ST2 {Vt.h - Vt2.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
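
Example

A minimal usage sketch for the entry above, not taken from the reference itself. The helper name `store_pair_lane3_u16` is hypothetical. On targets where `__ARM_NEON` is not defined, a scalar fallback writes the same two values, which also shows what the intrinsic stores: `val.val[0][lane]` to `ptr[0]` and `val.val[1][lane]` to `ptr[1]`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: writes element 3 of a and element 3 of b
   to out[0] and out[1]. a and b each point to 8 uint16_t values. */
void store_pair_lane3_u16(uint16_t out[2],
                          const uint16_t a[8],
                          const uint16_t b[8])
{
    assert(out != NULL && a != NULL && b != NULL);
#if defined(__ARM_NEON)
    uint16x8x2_t v;
    v.val[0] = vld1q_u16(a);
    v.val[1] = vld1q_u16(b);
    vst2q_lane_u16(out, v, 3);   /* lane must be a constant in 0..7 */
#else
    /* Scalar equivalent of the two-element lane store. */
    out[0] = a[3];
    out[1] = b[3];
#endif
}
```

The lane is fixed inside the helper because the `lane` argument must be an integer constant expression; it cannot be passed through as a runtime value.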

void vst2_lane_u32 (uint32_t * ptr, uint32x2x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.s - Vt2.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
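
Example

The store branch of the Operation pseudocode above can be modelled in portable C. This is a sketch under stated assumptions, not the architectural definition: the register file is represented as a flat array of 32 registers of 16 bytes each in little-endian byte order, and the function name `st_lane_model` and its parameters are hypothetical. The load, replicate, and writeback paths are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of the MemOp_STORE branch.
   regs:   32 registers x 16 bytes, flattened (assumed layout)
   t:      index of the first source register
   selem:  number of structure elements (2 for ST2)
   index:  lane number
   ebytes: element size in bytes (esize DIV 8)
   Returns the number of bytes written (the final value of offs). */
size_t st_lane_model(uint8_t *address, const uint8_t *regs,
                     unsigned t, unsigned selem,
                     unsigned index, size_t ebytes)
{
    size_t offs = 0;

    /* The selected lane must lie inside a 128-bit register. */
    assert(ebytes > 0 && (index + 1) * ebytes <= 16);

    for (unsigned s = 0; s < selem; s++) {
        /* extract from one lane of the 128-bit register */
        memcpy(address + offs, regs + (size_t)t * 16 + index * ebytes, ebytes);
        offs += ebytes;
        t = (t + 1) % 32;   /* consecutive registers wrap modulo 32 */
    }
    return offs;
}
```

With `selem = 2`, the two elements land at consecutive addresses, which is why the intrinsic writes exactly two elements starting at `ptr`.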

void vst2q_lane_u32 (uint32_t * ptr, uint32x4x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.s - Vt2.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2_lane_f16 (float16_t * ptr, float16x4x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.h - Vt2.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2q_lane_f16 (float16_t * ptr, float16x8x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.h - Vt2.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2_lane_f32 (float32_t * ptr, float32x2x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.s - Vt2.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
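
Example

One place a two-element lane store might be useful is writing a single interleaved pair, such as the real and imaginary parts of a complex value kept in separate vectors. The sketch below assumes that layout; the helper name `store_complex_lane1_f32` is hypothetical, and the scalar branch is a fallback for targets without `__ARM_NEON`.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: re and im each hold 2 floats. Writes
   {re[1], im[1]} to out as one interleaved (re, im) pair. */
void store_complex_lane1_f32(float out[2],
                             const float re[2],
                             const float im[2])
{
    assert(out != NULL && re != NULL && im != NULL);
#if defined(__ARM_NEON)
    float32x2x2_t v;
    v.val[0] = vld1_f32(re);     /* first structure member  */
    v.val[1] = vld1_f32(im);     /* second structure member */
    vst2_lane_f32(out, v, 1);    /* lane must be a constant in 0..1 */
#else
    out[0] = re[1];
    out[1] = im[1];
#endif
}
```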

void vst2q_lane_f32 (float32_t * ptr, float32x4x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.s - Vt2.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2_lane_p16 (poly16_t * ptr, poly16x4x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.h - Vt2.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2q_lane_p16 (poly16_t * ptr, poly16x8x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.h - Vt2.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst2q_lane_s8 (int8_t * ptr, int8x16x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.b - Vt2.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B 

0 <= lane <= 15

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
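
Example

Because this entry lists A64 as its only supported architecture, portable code may want to guard the call on the target rather than on NEON availability alone. The sketch below uses the predefined `__aarch64__` macro for that purpose; the helper name `store_pair_lane15_s8` is hypothetical, and the scalar branch covers every other target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: writes the last byte (element 15) of a and of b
   to out[0] and out[1]. a and b each point to 16 int8_t values. */
void store_pair_lane15_s8(int8_t out[2],
                          const int8_t a[16],
                          const int8_t b[16])
{
    assert(out != NULL && a != NULL && b != NULL);
#if defined(__aarch64__)
    int8x16x2_t v;
    v.val[0] = vld1q_s8(a);
    v.val[1] = vld1q_s8(b);
    vst2q_lane_s8(out, v, 15);   /* lane must be a constant in 0..15 */
#else
    out[0] = a[15];
    out[1] = b[15];
#endif
}
```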

void vst2q_lane_u8 (uint8_t * ptr, uint8x16x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.b - Vt2.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B 

0 <= lane <= 15

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst2q_lane_p8 (poly8_t * ptr, poly8x16x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.b - Vt2.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B 

0 <= lane <= 15

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst2_lane_s64 (int64_t * ptr, int64x1x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.d - Vt2.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D 

0 <= lane <= 0

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
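
Example

For the 64-bit, one-lane vector types the lane range collapses to a single value, so the only valid `lane` argument is 0. A minimal sketch, assuming an A64 target for the intrinsic path; the helper name `store_pair_s64` is hypothetical, and other targets take the scalar branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: stores two 64-bit values as one 2-element structure. */
void store_pair_s64(int64_t out[2], int64_t first, int64_t second)
{
    assert(out != NULL);
#if defined(__aarch64__)
    int64x1x2_t v;
    v.val[0] = vdup_n_s64(first);
    v.val[1] = vdup_n_s64(second);
    vst2_lane_s64(out, v, 0);    /* a 1-lane vector only has lane 0 */
#else
    out[0] = first;
    out[1] = second;
#endif
}
```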

void vst2q_lane_s64 (int64_t * ptr, int64x2x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.d - Vt2.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst2_lane_u64 (uint64_t * ptr, uint64x1x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.d - Vt2.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D 

0 <= lane <= 0

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst2q_lane_u64 (uint64_t * ptr, uint64x2x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.d - Vt2.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst2_lane_p64 (poly64_t * ptr, poly64x1x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.d - Vt2.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D 

0 <= lane <= 0

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst2q_lane_p64 (poly64_t * ptr, poly64x2x2_t val, const int lane)

Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.d - Vt2.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
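
As a rough illustration of the entry above, the following portable C sketch models the memory effect of a 2-element lane store on 64-bit elements. The function name `st2_lane_u64_model` and the use of `uint64_t` as a stand-in for `poly64_t` are assumptions made for this example; they are not part of the intrinsics API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of a 2-element lane store with 64-bit elements.
 * reg0 and reg1 stand in for val.val[0] and val.val[1] (two lanes each).
 * Element `lane` of each register is written to consecutive memory slots. */
void st2_lane_u64_model(uint64_t *ptr, const uint64_t reg0[2],
                        const uint64_t reg1[2], int lane)
{
    assert(lane >= 0 && lane <= 1);   /* a 2D register has lanes 0 and 1 */
    ptr[0] = reg0[lane];              /* element taken from val.val[0] */
    ptr[1] = reg1[lane];              /* element taken from val.val[1] */
}
```

Only two elements are written, regardless of how many lanes the source registers hold.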

void vst2_lane_f64 (float64_t * ptr, float64x1x2_t val, const int lane)Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.d - Vt2.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D 

0 <= lane <= 0

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst2q_lane_f64 (float64_t * ptr, float64x2x2_t val, const int lane)Store single 2-element structure from one lane of two registers

Description

Store single 2-element structure from one lane of two registers. This instruction stores a 2-element structure to memory from corresponding elements of two SIMD&FP registers.

A64 Instruction

ST2 {Vt.d - Vt2.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
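
A minimal usage sketch for `vst2q_lane_f64`, assuming split real and imaginary arrays of two doubles each. The wrapper name `store_complex_lane1` is hypothetical. The intrinsic path is compiled only when `__aarch64__` is defined; elsewhere a scalar fallback with the same memory effect is used so the example stays buildable on other hosts.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: writes {re[1], im[1]} to dst[0] and dst[1]. */
void store_complex_lane1(double *dst, const double re[2], const double im[2])
{
    assert(dst != NULL && re != NULL && im != NULL);
#if defined(__aarch64__)
    float64x2x2_t v;
    v.val[0] = vld1q_f64(re);      /* first register of the pair */
    v.val[1] = vld1q_f64(im);      /* second register of the pair */
    vst2q_lane_f64(dst, v, 1);     /* lane must be a compile-time constant */
#else
    /* Scalar fallback: same two elements, same order in memory. */
    dst[0] = re[1];
    dst[1] = im[1];
#endif
}
```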

void vst3_lane_s16 (int16_t * ptr, int16x4x3_t val, const int lane)Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.h - Vt3.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
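
The 3-element form behaves the same way with one more register. The sketch below is a scalar model only, not the intrinsic itself; `st3_lane_s16_model` and the two macros are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define ST3_REGS  3   /* registers in the structure (val.val[0..2]) */
#define S16_LANES 4   /* lanes in a 4H register */

/* Hypothetical scalar model: copy element `lane` of each of the three
 * registers to three consecutive int16_t slots starting at ptr. */
void st3_lane_s16_model(int16_t *ptr,
                        const int16_t regs[ST3_REGS][S16_LANES], int lane)
{
    assert(lane >= 0 && lane < S16_LANES);
    for (int s = 0; s < ST3_REGS; ++s)
        ptr[s] = regs[s][lane];
}
```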

void vst3q_lane_s16 (int16_t * ptr, int16x8x3_t val, const int lane)Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.h - Vt3.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3_lane_s32 (int32_t * ptr, int32x2x3_t val, const int lane)Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.s - Vt3.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.2S 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3q_lane_s32 (int32_t * ptr, int32x4x3_t val, const int lane)Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.s - Vt3.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.4S 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
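
The store branch of the Operation pseudocode can be sketched at byte level as follows. This is a simplified model under several assumptions: registers are represented as consecutive 16-byte rows in a flat array instead of a register file indexed modulo 32, bytes are copied in memory order, and writeback is reduced to returning the final offset. The name `st_lane_bytes_model` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define REG_BYTES 16  /* one 128-bit register, as in bits(128) rval */

/* Hypothetical model of the MemOp_STORE path: for each of `selem`
 * registers, copy element `index` (ebytes wide) to address+offs and
 * advance offs by ebytes. Returns the final value of offs. */
size_t st_lane_bytes_model(uint8_t *address, const uint8_t *regs,
                           int selem, int index, size_t ebytes)
{
    size_t offs = 0;                                   /* offs = Zeros() */
    assert(ebytes == 1 || ebytes == 2 || ebytes == 4 || ebytes == 8);
    assert(index >= 0 && (size_t)index < REG_BYTES / ebytes);
    for (int s = 0; s < selem; ++s) {
        const uint8_t *rval = regs + (size_t)s * REG_BYTES;  /* rval = V[t] */
        memcpy(address + offs, rval + (size_t)index * ebytes, ebytes);
        offs += ebytes;                                /* offs = offs + ebytes */
    }
    return offs;
}
```

With `selem = 3` and `ebytes = 4`, which corresponds to the `.s` forms of ST3, the model writes 12 bytes in total.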

void vst3_lane_u16 (uint16_t * ptr, uint16x4x3_t val, const int lane)Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.h - Vt3.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3q_lane_u16 (uint16_t * ptr, uint16x8x3_t val, const int lane)Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.h - Vt3.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3_lane_u32 (uint32_t * ptr, uint32x2x3_t val, const int lane)Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.s - Vt3.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.2S 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3q_lane_u32 (uint32_t * ptr, uint32x4x3_t val, const int lane)Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.s - Vt3.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.4S 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
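
Because `lane` has to be a constant expression, a lane index known only at run time needs some form of dispatch. One possible approach, sketched here with a hypothetical wrapper `store_lane_runtime_u32`, is a `switch` over the valid lane values. The intrinsic path is guarded by `__ARM_NEON`; without it, a scalar fallback with the same result is compiled.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: writes {a[lane], b[lane], c[lane]} to dst[0..2]. */
void store_lane_runtime_u32(uint32_t *dst, const uint32_t a[4],
                            const uint32_t b[4], const uint32_t c[4],
                            int lane)
{
    assert(lane >= 0 && lane <= 3);
#if defined(__ARM_NEON)
    uint32x4x3_t v;
    v.val[0] = vld1q_u32(a);
    v.val[1] = vld1q_u32(b);
    v.val[2] = vld1q_u32(c);
    /* Each call site passes a literal lane, as the intrinsic requires. */
    switch (lane) {
    case 0:  vst3q_lane_u32(dst, v, 0); break;
    case 1:  vst3q_lane_u32(dst, v, 1); break;
    case 2:  vst3q_lane_u32(dst, v, 2); break;
    default: vst3q_lane_u32(dst, v, 3); break;
    }
#else
    /* Scalar fallback with the same memory effect. */
    dst[0] = a[lane];
    dst[1] = b[lane];
    dst[2] = c[lane];
#endif
}
```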

void vst3_lane_f16 (float16_t * ptr, float16x4x3_t val, const int lane)Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.h - Vt3.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3q_lane_f16 (float16_t * ptr, float16x8x3_t val, const int lane)Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.h - Vt3.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3_lane_f32 (float32_t * ptr, float32x2x3_t val, const int lane)Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.s - Vt3.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.2S 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3q_lane_f32 (float32_t * ptr, float32x4x3_t val, const int lane)Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.s - Vt3.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.4S 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
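
A typical use of `vst3q_lane_f32` is pulling one interleaved record out of structure-of-arrays data. The sketch below assumes three separate coordinate arrays of four floats each and writes the point at index 2 as a packed x, y, z triple. The helper name `store_point_lane2` is hypothetical, and the scalar branch is a fallback for builds without `__ARM_NEON`.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: writes {xs[2], ys[2], zs[2]} to dst[0..2]. */
void store_point_lane2(float *dst, const float xs[4],
                       const float ys[4], const float zs[4])
{
    assert(dst != NULL);
#if defined(__ARM_NEON)
    float32x4x3_t v;
    v.val[0] = vld1q_f32(xs);
    v.val[1] = vld1q_f32(ys);
    v.val[2] = vld1q_f32(zs);
    vst3q_lane_f32(dst, v, 2);   /* stores three floats, 12 bytes in total */
#else
    /* Scalar fallback with the same memory effect. */
    dst[0] = xs[2];
    dst[1] = ys[2];
    dst[2] = zs[2];
#endif
}
```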

void vst3_lane_p16 (poly16_t * ptr, poly16x4x3_t val, const int lane)Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.h - Vt3.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3q_lane_p16 (poly16_t * ptr, poly16x8x3_t val, const int lane)Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.h - Vt3.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3q_lane_s8 (int8_t * ptr, int8x16x3_t val, const int lane)Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.b - Vt3.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.16B 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B 

0 <= lane <= 15

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3q_lane_u8 (uint8_t * ptr, uint8x16x3_t val, const int lane)Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.b - Vt3.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.16B 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B 

0 <= lane <= 15

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
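
A minimal usage sketch for `vst3q_lane_u8` follows. The helper name `store_pixel5_rgb` and the planar R/G/B scenario are assumptions made for illustration. The intrinsic path is taken only when `__aarch64__` is defined, a conservative choice made here because some 32-bit toolchains may not provide this 128-bit byte-lane variant; elsewhere a scalar fallback produces the same result.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/*
 * Hypothetical helper: write the R, G, B bytes of pixel 5 from three
 * planar 16-byte channel buffers to dst[0..2] as one 3-element structure.
 */
static void store_pixel5_rgb(uint8_t dst[3], const uint8_t r[16],
                             const uint8_t g[16], const uint8_t b[16])
{
    assert(dst && r && g && b);
#if defined(__aarch64__)
    uint8x16x3_t planes;
    planes.val[0] = vld1q_u8(r);
    planes.val[1] = vld1q_u8(g);
    planes.val[2] = vld1q_u8(b);
    /* The lane argument must be an integer constant in the range 0..15. */
    vst3q_lane_u8(dst, planes, 5);
#else
    /* Portable fallback with the same observable result. */
    dst[0] = r[5];
    dst[1] = g[5];
    dst[2] = b[5];
#endif
}
```

Only three bytes are written, so `dst` needs room for exactly one 3-element structure.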

void vst3q_lane_p8 (poly8_t * ptr, poly8x16x3_t val, const int lane)

Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.b - Vt3.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.16B 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B 

0 <= lane <= 15

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst3_lane_s64 (int64_t * ptr, int64x1x3_t val, const int lane)

Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.d - Vt3.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D 

0 <= lane <= 0

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst3q_lane_s64 (int64_t * ptr, int64x2x3_t val, const int lane)

Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.d - Vt3.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
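
Because this entry lists only A64 under supported architectures, a caller that also targets other platforms may want a guard. The sketch below assumes the `__aarch64__` predefined macro as that guard; the helper name `store_xyz_lane1` and the x/y/z scenario are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/*
 * Hypothetical helper: store element 1 of each of the x, y and z vectors
 * to dst[0..2] as one 3-element structure of 64-bit values.
 */
static void store_xyz_lane1(int64_t dst[3], const int64_t x[2],
                            const int64_t y[2], const int64_t z[2])
{
    assert(dst && x && y && z);
#if defined(__aarch64__)
    int64x2x3_t v;
    v.val[0] = vld1q_s64(x);
    v.val[1] = vld1q_s64(y);
    v.val[2] = vld1q_s64(z);
    /* For two 64-bit lanes per register, lane must be the constant 0 or 1. */
    vst3q_lane_s64(dst, v, 1);
#else
    /* Scalar fallback for targets without the A64 intrinsic. */
    dst[0] = x[1];
    dst[1] = y[1];
    dst[2] = z[1];
#endif
}
```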

void vst3_lane_u64 (uint64_t * ptr, uint64x1x3_t val, const int lane)

Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.d - Vt3.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D 

0 <= lane <= 0

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst3q_lane_u64 (uint64_t * ptr, uint64x2x3_t val, const int lane)

Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.d - Vt3.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst3_lane_p64 (poly64_t * ptr, poly64x1x3_t val, const int lane)

Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.d - Vt3.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D 

0 <= lane <= 0

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst3q_lane_p64 (poly64_t * ptr, poly64x2x3_t val, const int lane)

Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.d - Vt3.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst3_lane_f64 (float64_t * ptr, float64x1x3_t val, const int lane)

Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.d - Vt3.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D 

0 <= lane <= 0

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst3q_lane_f64 (float64_t * ptr, float64x2x3_t val, const int lane)

Store single 3-element structure from one lane of three registers

Description

Store single 3-element structure from one lane of three registers. This instruction stores a 3-element structure to memory from corresponding elements of three SIMD&FP registers.

A64 Instruction

ST3 {Vt.d - Vt3.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst4_lane_s16 (int16_t * ptr, int16x4x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.h - Vt4.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.4H 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
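
A minimal usage sketch for `vst4_lane_s16` follows, assuming a hypothetical scenario of four planar audio channels from which one interleaved 4-sample frame is written. The helper name `store_frame2` is an assumption; the intrinsic path is taken when `__ARM_NEON` is defined, with a scalar fallback otherwise.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/*
 * Hypothetical helper: write sample 2 of each of four channels to
 * dst[0..3] as one 4-element structure of 16-bit values.
 */
static void store_frame2(int16_t dst[4], const int16_t ch0[4],
                         const int16_t ch1[4], const int16_t ch2[4],
                         const int16_t ch3[4])
{
    assert(dst && ch0 && ch1 && ch2 && ch3);
#if defined(__ARM_NEON)
    int16x4x4_t v;
    v.val[0] = vld1_s16(ch0);
    v.val[1] = vld1_s16(ch1);
    v.val[2] = vld1_s16(ch2);
    v.val[3] = vld1_s16(ch3);
    /* For four 16-bit lanes per register, lane must be a constant in 0..3. */
    vst4_lane_s16(dst, v, 2);
#else
    /* Portable fallback with the same observable result. */
    dst[0] = ch0[2];
    dst[1] = ch1[2];
    dst[2] = ch2[2];
    dst[3] = ch3[2];
#endif
}
```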

void vst4q_lane_s16 (int16_t * ptr, int16x8x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.h - Vt4.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.8H 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4_lane_s32 (int32_t * ptr, int32x2x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.s - Vt4.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.2S 

val.val[2] → Vt3.2S 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4q_lane_s32 (int32_t * ptr, int32x4x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.s - Vt4.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.4S 

val.val[2] → Vt3.4S 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4_lane_u16 (uint16_t * ptr, uint16x4x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.h - Vt4.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.4H 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4q_lane_u16 (uint16_t * ptr, uint16x8x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.h - Vt4.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.8H 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4_lane_u32 (uint32_t * ptr, uint32x2x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.s - Vt4.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.2S 

val.val[2] → Vt3.2S 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4q_lane_u32 (uint32_t * ptr, uint32x4x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.s - Vt4.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.4S 

val.val[2] → Vt3.4S 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4_lane_f16 (float16_t * ptr, float16x4x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.h - Vt4.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.4H 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4q_lane_f16 (float16_t * ptr, float16x8x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.h - Vt4.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.8H 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H 

0 <= lane <= 7
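The upper bound on lane follows from the vector shape: a register of a given width holds width / element-size lanes, and the last valid index is one less than that. A minimal sketch of the arithmetic, where `max_lane` is a hypothetical helper and not part of the intrinsics API:

```c
#include <assert.h>

/* Highest valid lane index for a vector that is `reg_bits` wide
   (64 or 128) with `elem_bits`-wide elements (8, 16, 32 or 64). */
int max_lane(int reg_bits, int elem_bits)
{
    assert(reg_bits == 64 || reg_bits == 128);
    assert(elem_bits == 8 || elem_bits == 16 ||
           elem_bits == 32 || elem_bits == 64);
    return reg_bits / elem_bits - 1;
}
```

For the 8H arrangement used here (128-bit register, 16-bit elements) this gives 7; for a 1D arrangement it gives 0, so lane 0 is the only valid choice.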

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4_lane_f32 (float32_t * ptr, float32x2x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.s - Vt4.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.2S 

val.val[2] → Vt3.2S 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;
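The MemOp_STORE branch of the pseudocode above can be read as a loop that copies one lane from each of selem consecutive registers to consecutive memory. The following is a sketch of that branch only, in portable C; `model_regs` and `model_store_lane` are hypothetical names, each register is modelled as 16 bytes with byte 0 least significant, and access checks, MTE handling and the load and replicate branches are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register file: 32 SIMD&FP registers of 16 bytes each. */
typedef struct { uint8_t v[32][16]; } model_regs;

/* Model of the store branch: for each of `selem` registers starting at `t`,
   copy lane `index` (`ebytes` wide) to consecutive memory, wrapping the
   register number modulo 32. Returns the number of bytes written. */
size_t model_store_lane(uint8_t *mem, const model_regs *regs,
                        unsigned t, unsigned selem,
                        unsigned index, unsigned ebytes)
{
    assert(selem >= 1 && selem <= 4);
    assert(ebytes == 1 || ebytes == 2 || ebytes == 4 || ebytes == 8);
    assert((index + 1) * ebytes <= 16);   /* lane must lie inside the register */
    size_t offs = 0;
    for (unsigned s = 0; s < selem; ++s) {
        memcpy(mem + offs, &regs->v[t][index * ebytes], ebytes);
        offs += ebytes;
        t = (t + 1) % 32;                 /* V31 is followed by V0 */
    }
    return offs;
}
```

The returned byte count equals the final value of offs in the pseudocode, which is the amount added to the base address on writeback when no offset register is used (m == 31).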

Supported architectures

v7/A32/A64

void vst4q_lane_f32 (float32_t * ptr, float32x4x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.
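One possible use is writing a single interleaved record out of planar data. The sketch below calls the intrinsic when the compiler defines `__ARM_NEON` and otherwise falls back to scalar code with the same observable result; `store_sample2_xyzw` and the x/y/z/w channel layout are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: write sample number 2 of four planar channels
   (x, y, z, w; four floats each) as one interleaved xyzw record. */
void store_sample2_xyzw(float *dst, const float *x, const float *y,
                        const float *z, const float *w)
{
    assert(dst != NULL && x != NULL && y != NULL && z != NULL && w != NULL);
#if defined(__ARM_NEON)
    float32x4x4_t v;
    v.val[0] = vld1q_f32(x);
    v.val[1] = vld1q_f32(y);
    v.val[2] = vld1q_f32(z);
    v.val[3] = vld1q_f32(w);
    vst4q_lane_f32(dst, v, 2);   /* lane must be a compile-time constant */
#else
    /* Scalar fallback: same four values, same order. */
    dst[0] = x[2];
    dst[1] = y[2];
    dst[2] = z[2];
    dst[3] = w[2];
#endif
}
```

Because lane is declared `const int` and must be a constant expression, the sample index is fixed in the source here rather than passed as a run-time argument.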

A64 Instruction

ST4 {Vt.s - Vt4.s}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.4S 

val.val[2] → Vt3.4S 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4_lane_p16 (poly16_t * ptr, poly16x4x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.h - Vt4.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.4H 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H 

0 <= lane <= 3

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4q_lane_p16 (poly16_t * ptr, poly16x8x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.h - Vt4.h}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.8H 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H 

0 <= lane <= 7

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst4q_lane_s8 (int8_t * ptr, int8x16x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.b - Vt4.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.16B 

val.val[2] → Vt3.16B 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B 

0 <= lane <= 15

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst4q_lane_u8 (uint8_t * ptr, uint8x16x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.b - Vt4.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.16B 

val.val[2] → Vt3.16B 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B 

0 <= lane <= 15

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst4q_lane_p8 (poly8_t * ptr, poly8x16x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.b - Vt4.b}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.16B 

val.val[2] → Vt3.16B 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B 

0 <= lane <= 15

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst4_lane_s64 (int64_t * ptr, int64x1x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.d - Vt4.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.1D 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D 

0 <= lane <= 0

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst4q_lane_s64 (int64_t * ptr, int64x2x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.d - Vt4.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.2D 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst4_lane_u64 (uint64_t * ptr, uint64x1x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.d - Vt4.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.1D 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D 

0 <= lane <= 0

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst4q_lane_u64 (uint64_t * ptr, uint64x2x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.d - Vt4.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.2D 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst4_lane_p64 (poly64_t * ptr, poly64x1x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.d - Vt4.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.1D 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D 

0 <= lane <= 0

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst4q_lane_p64 (poly64_t * ptr, poly64x2x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.d - Vt4.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.2D 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst4_lane_f64 (float64_t * ptr, float64x1x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.d - Vt4.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.1D 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D 

0 <= lane <= 0

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst4q_lane_f64 (float64_t * ptr, float64x2x4_t val, const int lane)

Store single 4-element structure from one lane of four registers

Description

Store single 4-element structure from one lane of four registers. This instruction stores a 4-element structure to memory from corresponding elements of four SIMD&FP registers.

A64 Instruction

ST4 {Vt.d - Vt4.d}[lane],[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.2D 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D 

0 <= lane <= 1

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
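
As a hedged usage sketch (not part of the reference itself), the helper below stores lane 1 of four 2-lane double vectors to four consecutive memory locations with `vst4q_lane_f64`. The function name `store_lane1_of_four` and the flat register-major input layout are assumptions made for illustration. The intrinsic path is compiled only when `__aarch64__` is defined; elsewhere a scalar fallback with the same result is used so that the example builds on any host.

```c
#include <stddef.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: regs holds 8 doubles, two lanes per register
   (regs[2*i] is lane 0 of register i, regs[2*i + 1] is lane 1).
   Writes lane 1 of each of the four registers to dst[0..3]. */
static void store_lane1_of_four(double *dst, const double *regs)
{
#if defined(__aarch64__)
    float64x2x4_t val;
    for (int i = 0; i < 4; ++i)
        val.val[i] = vld1q_f64(regs + 2 * i);
    /* The lane argument must be a compile-time constant, 0 or 1. */
    vst4q_lane_f64(dst, val, 1);
#else
    /* Scalar fallback (assumption: used only to keep the sketch portable). */
    for (size_t i = 0; i < 4; ++i)
        dst[i] = regs[2 * i + 1];
#endif
}
```

Only four doubles are written, one per source register, which matches the single-structure store loop in the Operation pseudocode above.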

void vst1_s8_x2 (int8_t * ptr, int8x8x2_t val)Store multiple single-element structures from two registers

Description

Store multiple single-element structures from two registers. This instruction stores every element of two SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.8B - Vt2.8B},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
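
A minimal sketch of how `vst1_s8_x2` might be called, assuming an AArch64 toolchain whose `arm_neon.h` provides the `_x2` store intrinsics. The helper name `store_pair_s8` is hypothetical. The two 8-byte vectors end up back to back in memory (16 bytes in total); on other targets a `memcpy` fallback produces the same layout.

```c
#include <stdint.h>
#include <string.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: writes the 8 bytes at lo to dst[0..7]
   and the 8 bytes at hi to dst[8..15]. */
static void store_pair_s8(int8_t *dst, const int8_t *lo, const int8_t *hi)
{
#if defined(__aarch64__)
    int8x8x2_t val;
    val.val[0] = vld1_s8(lo);
    val.val[1] = vld1_s8(hi);
    vst1_s8_x2(dst, val);   /* val.val[0] first, then val.val[1] */
#else
    /* Portable fallback with the same memory layout. */
    memcpy(dst, lo, 8);
    memcpy(dst + 8, hi, 8);
#endif
}
```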

void vst1q_s8_x2 (int8_t * ptr, int8x16x2_t val)Store multiple single-element structures from two registers

Description

Store multiple single-element structures from two registers. This instruction stores every element of two SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.16B - Vt2.16B},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_s16_x2 (int16_t * ptr, int16x4x2_t val)Store multiple single-element structures from two registers

Description

Store multiple single-element structures from two registers. This instruction stores every element of two SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.4H - Vt2.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_s16_x2 (int16_t * ptr, int16x8x2_t val)Store multiple single-element structures from two registers

Description

Store multiple single-element structures from two registers. This instruction stores every element of two SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.8H - Vt2.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_s32_x2 (int32_t * ptr, int32x2x2_t val)Store multiple single-element structures from two registers

Description

Store multiple single-element structures from two registers. This instruction stores every element of two SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.2S - Vt2.2S},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_s32_x2 (int32_t * ptr, int32x4x2_t val)Store multiple single-element structures from two registers

Description

Store multiple single-element structures from two registers. This instruction stores every element of two SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.4S - Vt2.4S},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
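
The ST1 two-register form writes its registers one after the other, whereas the ST2-based `vst2q_s32` interleaves them. The sketch below contrasts the two layouts. Both helper names are hypothetical, and the scalar branches are assumptions added so that the example also builds on non-AArch64 hosts.

```c
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: dst = a0 a1 a2 a3 b0 b1 b2 b3 (no interleaving). */
static void store_contiguous_s32(int32_t *dst, const int32_t *a, const int32_t *b)
{
#if defined(__aarch64__)
    int32x4x2_t v;
    v.val[0] = vld1q_s32(a);
    v.val[1] = vld1q_s32(b);
    vst1q_s32_x2(dst, v);
#else
    for (int i = 0; i < 4; ++i) {
        dst[i] = a[i];
        dst[4 + i] = b[i];
    }
#endif
}

/* Hypothetical helper: dst = a0 b0 a1 b1 a2 b2 a3 b3 (ST2 interleaving). */
static void store_interleaved_s32(int32_t *dst, const int32_t *a, const int32_t *b)
{
#if defined(__aarch64__)
    int32x4x2_t v;
    v.val[0] = vld1q_s32(a);
    v.val[1] = vld1q_s32(b);
    vst2q_s32(dst, v);
#else
    for (int i = 0; i < 4; ++i) {
        dst[2 * i] = a[i];
        dst[2 * i + 1] = b[i];
    }
#endif
}
```

Choosing between the two comes down to the memory layout the consumer expects: planar data suits the `_x2` store, while array-of-structures data suits `vst2q_s32`.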

void vst1_u8_x2 (uint8_t * ptr, uint8x8x2_t val)Store multiple single-element structures from two registers

Description

Store multiple single-element structures from two registers. This instruction stores every element of two SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.8B - Vt2.8B},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_u8_x2 (uint8_t * ptr, uint8x16x2_t val)Store multiple single-element structures from two registers

Description

Store multiple single-element structures from two registers. This instruction stores every element of two SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.16B - Vt2.16B},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_u16_x2 (uint16_t * ptr, uint16x4x2_t val)Store multiple single-element structures from two registers

Description

Store multiple single-element structures from two registers. This instruction stores every element of two SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.4H - Vt2.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_u16_x2 (uint16_t * ptr, uint16x8x2_t val)Store multiple single-element structures from two registers

Description

Store multiple single-element structures from two registers. This instruction stores every element of two SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.8H - Vt2.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
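
One plausible use of `vst1q_u16_x2` is a block copy that moves 16 halfwords per iteration and finishes the remainder with a scalar loop. The sketch below is illustrative only: `copy_u16` is a hypothetical name, and no claim is made that this is faster than `memcpy` on any particular core.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: copies n halfwords from src to dst. */
static void copy_u16(uint16_t *dst, const uint16_t *src, size_t n)
{
    size_t i = 0;
#if defined(__aarch64__)
    /* Main loop: two 128-bit registers (16 halfwords) per store. */
    for (; i + 16 <= n; i += 16) {
        uint16x8x2_t v;
        v.val[0] = vld1q_u16(src + i);
        v.val[1] = vld1q_u16(src + i + 8);
        vst1q_u16_x2(dst + i, v);
    }
#endif
    /* Tail loop; on non-AArch64 hosts this handles the whole buffer. */
    for (; i < n; ++i)
        dst[i] = src[i];
}
```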

void vst1_u32_x2 (uint32_t * ptr, uint32x2x2_t val)Store multiple single-element structures from two registers

Description

Store multiple single-element structures from two registers. This instruction stores every element of two SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.2S - Vt2.2S},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_u32_x2 (uint32_t * ptr, uint32x4x2_t val)Store multiple single-element structures from two registers

Description

Store multiple single-element structures from two registers. This instruction stores every element of two SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.4S - Vt2.4S},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_f16_x2 (float16_t * ptr, float16x4x2_t val)Store multiple single-element structures from two registers

Description

Store multiple single-element structures from two registers. This instruction stores every element of two SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.4H - Vt2.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_f16_x2 (float16_t * ptr, float16x8x2_t val)Store multiple single-element structures from two registers

Description

Store multiple single-element structures from two registers. This instruction stores every element of two SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.8H - Vt2.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_f32_x2 (float32_t * ptr, float32x2x2_t val)Store multiple single-element structures from two registers

Description

Store multiple single-element structures from two registers. This instruction stores every element of two SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.2S - Vt2.2S},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_f32_x2 (float32_t * ptr, float32x4x2_t val)Store multiple single-element structures from two registers

Description

Store multiple single-element structures from two registers. This instruction stores every element of two SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.4S - Vt2.4S},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
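
As an illustration of `vst1q_f32_x2` in a small compute kernel, the sketch below scales eight floats by a constant and writes all eight results with a single two-register store. The helper name `scale8_f32` is hypothetical, and the scalar branch is an assumption added so that the example also builds where `arm_neon.h` is unavailable.

```c
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: dst[i] = src[i] * k for i in 0..7. */
static void scale8_f32(float *dst, const float *src, float k)
{
#if defined(__aarch64__)
    float32x4x2_t v;
    v.val[0] = vmulq_n_f32(vld1q_f32(src), k);      /* elements 0..3 */
    v.val[1] = vmulq_n_f32(vld1q_f32(src + 4), k);  /* elements 4..7 */
    vst1q_f32_x2(dst, v);                           /* stores all 8 floats */
#else
    for (int i = 0; i < 8; ++i)
        dst[i] = src[i] * k;
#endif
}
```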

void vst1_p8_x2 (poly8_t * ptr, poly8x8x2_t val)Store multiple single-element structures from two registers

Description

Store multiple single-element structures from two registers. This instruction stores every element of two SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.8B - Vt2.8B},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_p8_x2 (poly8_t * ptr, poly8x16x2_t val)Store multiple single-element structures from two registers

Description

Store multiple single-element structures from two registers. This instruction stores every element of two SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.16B - Vt2.16B},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_p16_x2 (poly16_t * ptr, poly16x4x2_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.4H - Vt2.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H
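
An ST1 store with a two-register list writes the registers back to back, whereas an ST2 store interleaves their lanes. The scalar sketch below contrasts the two layouts for four 16-bit lanes per register. The type model_u16x4x2_t and both functions are hypothetical illustrations, not arm_neon.h definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for poly16x4x2_t: two registers of four 16-bit lanes. */
typedef struct {
    uint16_t val[2][4];
} model_u16x4x2_t;

/* ST1-style two-register store: registers are written back to back. */
static void model_st1_x2(uint16_t *ptr, model_u16x4x2_t v)
{
    assert(ptr != NULL);
    for (int r = 0; r < 2; r++)
        for (int e = 0; e < 4; e++)
            ptr[r * 4 + e] = v.val[r][e];
}

/* ST2-style store, shown only for contrast: lanes are interleaved. */
static void model_st2(uint16_t *ptr, model_u16x4x2_t v)
{
    assert(ptr != NULL);
    for (int e = 0; e < 4; e++)
        for (int r = 0; r < 2; r++)
            ptr[e * 2 + r] = v.val[r][e];
}
```

With val[0] = {1, 2, 3, 4} and val[1] = {5, 6, 7, 8}, the first model writes 1 2 3 4 5 6 7 8 and the second writes 1 5 2 6 3 7 4 8.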

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_p16_x2 (poly16_t * ptr, poly16x8x2_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.8H - Vt2.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;
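
The writeback tail of the pseudocode can be read as a small function of the base address, the number of bytes transferred, and the optional offset register. The sketch below models it in plain C; model_writeback and its parameter names are hypothetical. Note that the instruction form listed for this intrinsic, [Xn] with no post-index operand, corresponds to wback being false, in which case the base register is left unchanged and this function would not be reached.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of the "if wback" tail of the pseudocode.
 * address:     base address read from X[n] or SP
 * transferred: value of offs after the store loop (bytes written)
 * m:           index of the offset register; 31 selects the immediate form
 * xm:          contents of X[m], used only when m != 31
 * Returns the value written back to the base register. */
static uint64_t model_writeback(uint64_t address, uint64_t transferred,
                                unsigned m, uint64_t xm)
{
    assert(m <= 31);
    uint64_t offs = transferred;
    if (m != 31)
        offs = xm;  /* register post-index replaces the byte count */
    return address + offs;
}
```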

Supported architectures

v7/A32/A64

void vst1_s64_x2 (int64_t * ptr, int64x1x2_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D - Vt2.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_u64_x2 (uint64_t * ptr, uint64x1x2_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D - Vt2.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_p64_x2 (poly64_t * ptr, poly64x1x2_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D - Vt2.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;
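
The replicate branch in the pseudocode above belongs to the load-and-replicate encodings and is not taken by a store. For readers following that branch anyway, here is a scalar sketch of what Replicate(element, datasize DIV esize) does for byte-sized elements. The function model_replicate is a hypothetical helper, and the register is modelled as a plain byte array.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of Replicate(element, datasize DIV esize):
 * copy one element of ebytes bytes into every lane of a register
 * image that is reg_bytes wide (8 or 16 in the pseudocode). */
static void model_replicate(uint8_t *reg, size_t reg_bytes,
                            const uint8_t *element, size_t ebytes)
{
    assert(ebytes != 0 && reg_bytes % ebytes == 0);
    for (size_t lane = 0; lane < reg_bytes / ebytes; lane++)
        memcpy(reg + lane * ebytes, element, ebytes);
}
```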

Supported architectures

A32/A64

void vst1q_s64_x2 (int64_t * ptr, int64x2x2_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.2D - Vt2.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_u64_x2 (uint64_t * ptr, uint64x2x2_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.2D - Vt2.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_p64_x2 (poly64_t * ptr, poly64x2x2_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.2D - Vt2.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

void vst1_f64_x2 (float64_t * ptr, float64x1x2_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D - Vt2.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;
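
The MemOp_STORE line in the pseudocode above extracts a single lane with Elem[rval, index, esize] and writes it to memory. A scalar sketch of that one step follows. It assumes the register image is held as a 16-byte array in which lane i occupies bytes i*ebytes to i*ebytes + ebytes - 1; model_store_lane is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of: Mem[address+offs, ebytes] = Elem[rval, index, esize]
 * mem:    destination (address + offs)
 * reg:    16-byte register image, lane 0 at the lowest bytes (assumption)
 * index:  lane number
 * ebytes: element size in bytes (esize DIV 8) */
static void model_store_lane(uint8_t *mem, const uint8_t reg[16],
                             unsigned index, size_t ebytes)
{
    assert(ebytes != 0 && (index + 1) * ebytes <= 16);
    memcpy(mem, reg + index * ebytes, ebytes);
}
```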

Supported architectures

A64

void vst1q_f64_x2 (float64_t * ptr, float64x2x2_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.2D - Vt2.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst1_s8_x3 (int8_t * ptr, int8x8x3_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.8B - Vt3.8B},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.8B 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B
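
For the three-register form, the same back-to-back layout extends to a third register, so the destination must have room for 3 x 8 = 24 bytes. The sketch below models that in portable C. The type model_s8x8x3_t and the function model_vst1_x3 are hypothetical stand-ins, not arm_neon.h definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for int8x8x3_t: three 8-byte registers. */
typedef struct {
    int8_t val[3][8];
} model_s8x8x3_t;

/* Scalar model of ST1 {Vt.8B - Vt3.8B},[Xn]: 24 consecutive bytes,
 * val.val[0] first, then val.val[1], then val.val[2]. */
static void model_vst1_x3(int8_t *ptr, model_s8x8x3_t v)
{
    assert(ptr != NULL);
    for (int r = 0; r < 3; r++)
        memcpy(ptr + r * 8, v.val[r], 8);
}
```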

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_s8_x3 (int8_t * ptr, int8x16x3_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.16B - Vt3.16B},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.16B 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_s16_x3 (int16_t * ptr, int16x4x3_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.4H - Vt3.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_s16_x3 (int16_t * ptr, int16x8x3_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.8H - Vt3.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_s32_x3 (int32_t * ptr, int32x2x3_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.2S - Vt3.2S},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.2S 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_s32_x3 (int32_t * ptr, int32x4x3_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.4S - Vt3.4S},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.4S 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_u8_x3 (uint8_t * ptr, uint8x8x3_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.8B - Vt3.8B},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.8B 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_u8_x3 (uint8_t * ptr, uint8x16x3_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.16B - Vt3.16B},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.16B 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
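
Example

The following is a minimal sketch of how vst1q_u8_x3 might be used, not a definitive implementation. The helper name store_three_planes_u8 and its parameters are hypothetical and do not come from this reference. The NEON path assumes an AArch64 toolchain whose arm_neon.h provides the _x3 intrinsics; the fallback path only models the same memory layout in portable C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: writes three 16-byte planes back to back at dst
 * (48 bytes in total), plane 0 first. */
void store_three_planes_u8(uint8_t *dst, const uint8_t *p0,
                           const uint8_t *p1, const uint8_t *p2)
{
    assert(dst != NULL && p0 != NULL && p1 != NULL && p2 != NULL);
#if defined(__aarch64__) && defined(__ARM_NEON)
    uint8x16x3_t val;
    val.val[0] = vld1q_u8(p0);
    val.val[1] = vld1q_u8(p1);
    val.val[2] = vld1q_u8(p2);
    /* Expected to map to ST1 {Vt.16B - Vt3.16B},[Xn]. */
    vst1q_u8_x3(dst, val);
#else
    /* Portable model of the same layout: no interleaving. */
    memcpy(dst,      p0, 16);
    memcpy(dst + 16, p1, 16);
    memcpy(dst + 32, p2, 16);
#endif
}
```

After the call, val.val[0] occupies bytes 0 to 15 of the destination, val.val[1] bytes 16 to 31, and val.val[2] bytes 32 to 47.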

void vst1_u16_x3 (uint16_t * ptr, uint16x4x3_t val)
Store multiple single-element structures from three registers

Description

Store multiple single-element structures from three registers. This instruction stores every element of three SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.4H - Vt3.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
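
Example

A short sketch of the 64-bit (4H) form, under the same assumptions as any NEON example: the helper name fill_three_u16x4 is hypothetical, and the intrinsic path is only compiled on an AArch64 toolchain that provides vst1_u16_x3. It illustrates that three 4-lane registers cover exactly 12 halfwords (24 bytes), so the destination must have at least that much room and nothing beyond it is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: writes 12 halfwords at dst, four copies each of
 * v0, v1 and v2, in that order. dst must have room for 12 elements. */
void fill_three_u16x4(uint16_t *dst, uint16_t v0, uint16_t v1, uint16_t v2)
{
    assert(dst != NULL);
#if defined(__aarch64__) && defined(__ARM_NEON)
    uint16x4x3_t val;
    val.val[0] = vdup_n_u16(v0);
    val.val[1] = vdup_n_u16(v1);
    val.val[2] = vdup_n_u16(v2);
    /* Expected to map to ST1 {Vt.4H - Vt3.4H},[Xn]. */
    vst1_u16_x3(dst, val);
#else
    /* Portable model of the same layout. */
    for (int i = 0; i < 4; i++) {
        dst[i]     = v0;
        dst[4 + i] = v1;
        dst[8 + i] = v2;
    }
#endif
}
```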

void vst1q_u16_x3 (uint16_t * ptr, uint16x8x3_t val)
Store multiple single-element structures from three registers

Description

Store multiple single-element structures from three registers. This instruction stores every element of three SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.8H - Vt3.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_u32_x3 (uint32_t * ptr, uint32x2x3_t val)
Store multiple single-element structures from three registers

Description

Store multiple single-element structures from three registers. This instruction stores every element of three SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.2S - Vt3.2S},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.2S 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_u32_x3 (uint32_t * ptr, uint32x4x3_t val)
Store multiple single-element structures from three registers

Description

Store multiple single-element structures from three registers. This instruction stores every element of three SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.4S - Vt3.4S},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.4S 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
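
Example

The instruction form listed for this intrinsic, ST1 {Vt.4S - Vt3.4S},[Xn], has no post-index operand, and the intrinsic returns void, so a caller storing several blocks advances the pointer itself. The sketch below shows one way this might look. The helper name fill_blocks_u32 is hypothetical, and the NEON path assumes an AArch64 toolchain that provides vst1q_u32_x3. Whether a compiler folds the pointer increment into a post-indexed store is not guaranteed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: writes `blocks` groups of 12 words at dst. Each
 * group holds four copies of a, then four of b, then four of c. */
void fill_blocks_u32(uint32_t *dst, size_t blocks,
                     uint32_t a, uint32_t b, uint32_t c)
{
    assert(dst != NULL || blocks == 0);
#if defined(__aarch64__) && defined(__ARM_NEON)
    uint32x4x3_t val;
    val.val[0] = vdupq_n_u32(a);
    val.val[1] = vdupq_n_u32(b);
    val.val[2] = vdupq_n_u32(c);
    for (size_t i = 0; i < blocks; i++) {
        vst1q_u32_x3(dst, val);
        dst += 12;              /* 3 registers x 4 lanes */
    }
#else
    /* Portable model of the same layout. */
    for (size_t i = 0; i < blocks; i++) {
        for (int lane = 0; lane < 4; lane++) {
            dst[lane]     = a;
            dst[4 + lane] = b;
            dst[8 + lane] = c;
        }
        dst += 12;
    }
#endif
}
```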

void vst1_f16_x3 (float16_t * ptr, float16x4x3_t val)
Store multiple single-element structures from three registers

Description

Store multiple single-element structures from three registers. This instruction stores every element of three SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.4H - Vt3.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_f16_x3 (float16_t * ptr, float16x8x3_t val)
Store multiple single-element structures from three registers

Description

Store multiple single-element structures from three registers. This instruction stores every element of three SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.8H - Vt3.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_f32_x3 (float32_t * ptr, float32x2x3_t val)
Store multiple single-element structures from three registers

Description

Store multiple single-element structures from three registers. This instruction stores every element of three SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.2S - Vt3.2S},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.2S 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_f32_x3 (float32_t * ptr, float32x4x3_t val)
Store multiple single-element structures from three registers

Description

Store multiple single-element structures from three registers. This instruction stores every element of three SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.4S - Vt3.4S},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.4S 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
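
Example

Because ST1 with a three-register list writes the registers one after another, its memory layout differs from the interleaving store vst3q_f32, which takes the same float32x4x3_t argument. The sketch below contrasts the two. The helper names store_planar_f32 and store_interleaved_f32 are hypothetical, and the NEON path assumes an AArch64 toolchain that provides vst1q_f32_x3; the fallback path models both layouts in portable C.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: writes x0..x3, y0..y3, z0..z3 (planar layout). */
void store_planar_f32(float *dst, const float *x, const float *y,
                      const float *z)
{
    assert(dst != NULL && x != NULL && y != NULL && z != NULL);
#if defined(__aarch64__) && defined(__ARM_NEON)
    float32x4x3_t v;
    v.val[0] = vld1q_f32(x);
    v.val[1] = vld1q_f32(y);
    v.val[2] = vld1q_f32(z);
    vst1q_f32_x3(dst, v);       /* registers stored back to back */
#else
    for (int i = 0; i < 4; i++) {
        dst[i]     = x[i];
        dst[4 + i] = y[i];
        dst[8 + i] = z[i];
    }
#endif
}

/* Hypothetical helper: writes x0,y0,z0, x1,y1,z1, ... (interleaved). */
void store_interleaved_f32(float *dst, const float *x, const float *y,
                           const float *z)
{
    assert(dst != NULL && x != NULL && y != NULL && z != NULL);
#if defined(__aarch64__) && defined(__ARM_NEON)
    float32x4x3_t v;
    v.val[0] = vld1q_f32(x);
    v.val[1] = vld1q_f32(y);
    v.val[2] = vld1q_f32(z);
    vst3q_f32(dst, v);          /* lanes interleaved across registers */
#else
    for (int i = 0; i < 4; i++) {
        dst[3 * i]     = x[i];
        dst[3 * i + 1] = y[i];
        dst[3 * i + 2] = z[i];
    }
#endif
}
```

The planar form suits structure-of-arrays data, while the interleaved form suits array-of-structures data such as packed xyz points.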

void vst1_p8_x3 (poly8_t * ptr, poly8x8x3_t val)
Store multiple single-element structures from three registers

Description

Store multiple single-element structures from three registers. This instruction stores every element of three SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.8B - Vt3.8B},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.8B 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_p8_x3 (poly8_t * ptr, poly8x16x3_t val)
Store multiple single-element structures from three registers

Description

Store multiple single-element structures from three registers. This instruction stores every element of three SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.16B - Vt3.16B},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.16B 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_p16_x3 (poly16_t * ptr, poly16x4x3_t val)
Store multiple single-element structures from three registers

Description

Store multiple single-element structures from three registers. This instruction stores every element of three SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.4H - Vt3.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_p16_x3 (poly16_t * ptr, poly16x8x3_t val)
Store multiple single-element structures from three registers

Description

Store multiple single-element structures from three registers. This instruction stores every element of three SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.8H - Vt3.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_s64_x3 (int64_t * ptr, int64x1x3_t val)
Store multiple single-element structures from three registers

Description

Store multiple single-element structures from three registers. This instruction stores every element of three SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.1D - Vt3.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_u64_x3 (uint64_t * ptr, uint64x1x3_t val)
Store multiple single-element structures from three registers

Description

Store multiple single-element structures from three registers. This instruction stores every element of three SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.1D - Vt3.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_p64_x3 (poly64_t * ptr, poly64x1x3_t val)
Store multiple single-element structures from three registers

Description

Store multiple single-element structures from three registers. This instruction stores every element of three SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.1D - Vt3.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

void vst1q_s64_x3 (int64_t * ptr, int64x2x3_t val)
Store multiple single-element structures from three registers

Description

Store multiple single-element structures from three registers. This instruction stores every element of three SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.2D - Vt3.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_u64_x3 (uint64_t * ptr, uint64x2x3_t val)
Store multiple single-element structures from three registers

Description

Store multiple single-element structures from three registers. This instruction stores every element of three SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.2D - Vt3.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_p64_x3 (poly64_t * ptr, poly64x2x3_t val)
Store multiple single-element structures from three registers

Description

Store multiple single-element structures from three registers. This instruction stores every element of three SIMD&FP registers to memory, without interleaving.

A64 Instruction

ST1 {Vt.2D - Vt3.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_f64_x3 (float64_t * ptr, float64x1x3_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D - Vt3.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst1q_f64_x3 (float64_t * ptr, float64x2x3_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.2D - Vt3.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
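
As a rough illustration of the non-interleaved layout, the sketch below stores three two-lane vectors with vst1q_f64_x3 and falls back to plain copies on targets other than A64. The helper name store_three_f64x2 and the fallback branch are assumptions made for this example; they are not part of the reference.

```c
#include <string.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: writes a[0..1], b[0..1], c[0..1] to dst[0..5]
   in that order, register by register, with no interleaving. */
static void store_three_f64x2(double *dst, const double *a,
                              const double *b, const double *c)
{
#if defined(__aarch64__)
    float64x2x3_t v;
    v.val[0] = vld1q_f64(a);
    v.val[1] = vld1q_f64(b);
    v.val[2] = vld1q_f64(c);
    vst1q_f64_x3(dst, v);   /* expected to map to ST1 {Vt.2D - Vt3.2D},[Xn] */
#else
    /* Portable model of the same memory layout. */
    memcpy(dst + 0, a, 2 * sizeof(double));
    memcpy(dst + 2, b, 2 * sizeof(double));
    memcpy(dst + 4, c, 2 * sizeof(double));
#endif
}
```

After the call, dst holds the two lanes of val[0], then those of val[1], then those of val[2]. An interleaving store such as ST3 would order the elements differently.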

void vst1_s8_x4 (int8_t * ptr, int8x8x4_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.8B - Vt4.8B},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.8B 

val.val[2] → Vt3.8B 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
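
The register list Vt.8B - Vt4.8B names consecutive registers, and the pseudocode advances the register index with t = (t + 1) MOD 32, so a list may wrap from V31 back to V0. The following is a minimal portable model of that behavior, not the architectural definition. The flat register-file array, the constants, and the function name model_st1_list_8b are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_VREGS  32  /* V0..V31 */
#define DREG_BYTES 8   /* the .8B arrangement uses the low 64 bits */

/* Hypothetical model: regfile holds NUM_VREGS * DREG_BYTES bytes, one
   64-bit register after another. Copies `count` consecutive registers,
   starting at index t and wrapping modulo 32, to mem in list order.
   Returns the number of bytes written. */
static size_t model_st1_list_8b(uint8_t *mem, const uint8_t *regfile,
                                unsigned t, unsigned count)
{
    size_t offs = 0;
    for (unsigned s = 0; s < count; ++s) {
        memcpy(mem + offs, regfile + (size_t)t * DREG_BYTES, DREG_BYTES);
        offs += DREG_BYTES;
        t = (t + 1) % NUM_VREGS;   /* same wrap rule as the pseudocode */
    }
    return offs;
}
```

With t = 30 and count = 4, the model writes registers 30, 31, 0, and 1, for 32 bytes in total.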

void vst1q_s8_x4 (int8_t * ptr, int8x16x4_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.16B - Vt4.16B},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.16B 

val.val[2] → Vt3.16B 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_s16_x4 (int16_t * ptr, int16x4x4_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.4H - Vt4.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.4H 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_s16_x4 (int16_t * ptr, int16x8x4_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.8H - Vt4.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.8H 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_s32_x4 (int32_t * ptr, int32x2x4_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.2S - Vt4.2S},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.2S 

val.val[2] → Vt3.2S 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_s32_x4 (int32_t * ptr, int32x4x4_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.4S - Vt4.4S},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.4S 

val.val[2] → Vt3.4S 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_u8_x4 (uint8_t * ptr, uint8x8x4_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.8B - Vt4.8B},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.8B 

val.val[2] → Vt3.8B 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_u8_x4 (uint8_t * ptr, uint8x16x4_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.16B - Vt4.16B},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.16B 

val.val[2] → Vt3.16B 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
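
Because four 16-byte registers are written back to back, one call to vst1q_u8_x4 covers 64 contiguous bytes. The sketch below uses that to fill a 64-byte block with a single value. The helper name fill64_u8 and the memset fallback for targets other than A64 are assumptions made for this example.

```c
#include <stdint.h>
#include <string.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: sets dst[0..63] to `value`.
   dst must point to at least 64 writable bytes. */
static void fill64_u8(uint8_t *dst, uint8_t value)
{
#if defined(__aarch64__)
    uint8x16_t v = vdupq_n_u8(value);   /* broadcast to all 16 lanes */
    uint8x16x4_t quad;
    quad.val[0] = v;
    quad.val[1] = v;
    quad.val[2] = v;
    quad.val[3] = v;
    vst1q_u8_x4(dst, quad);             /* 4 registers x 16 bytes = 64 bytes */
#else
    memset(dst, value, 64);             /* portable model of the same result */
#endif
}
```

The intrinsic takes a plain pointer and has no alignment parameter, so the sketch does not assume any particular alignment of dst.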

void vst1_u16_x4 (uint16_t * ptr, uint16x4x4_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.4H - Vt4.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.4H 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_u16_x4 (uint16_t * ptr, uint16x8x4_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.8H - Vt4.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.8H 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_u32_x4 (uint32_t * ptr, uint32x2x4_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.2S - Vt4.2S},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.2S 

val.val[2] → Vt3.2S 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_u32_x4 (uint32_t * ptr, uint32x4x4_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.4S - Vt4.4S},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.4S 

val.val[2] → Vt3.4S 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
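
The instruction form listed for this intrinsic, [Xn], has no write-back, so a loop that stores successive blocks advances the pointer itself. The sketch below processes 16 elements per iteration with vst1q_u32_x4 and finishes the remainder with scalar code. The function name add_const_u32 and the choice of operation are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: dst[i] = src[i] + k for i in [0, n),
   with the usual unsigned wrap-around. */
static void add_const_u32(uint32_t *dst, const uint32_t *src,
                          size_t n, uint32_t k)
{
    size_t i = 0;
#if defined(__aarch64__)
    const uint32x4_t kv = vdupq_n_u32(k);
    for (; i + 16 <= n; i += 16) {      /* 4 registers x 4 lanes per store */
        uint32x4x4_t v;
        v.val[0] = vaddq_u32(vld1q_u32(src + i + 0),  kv);
        v.val[1] = vaddq_u32(vld1q_u32(src + i + 4),  kv);
        v.val[2] = vaddq_u32(vld1q_u32(src + i + 8),  kv);
        v.val[3] = vaddq_u32(vld1q_u32(src + i + 12), kv);
        vst1q_u32_x4(dst + i, v);       /* caller-side pointer advance */
    }
#endif
    for (; i < n; ++i)                  /* scalar tail, or whole array off A64 */
        dst[i] = src[i] + k;
}
```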

void vst1_f16_x4 (float16_t * ptr, float16x4x4_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.4H - Vt4.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.4H 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_f16_x4 (float16_t * ptr, float16x8x4_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.8H - Vt4.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.8H 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_f32_x4 (float32_t * ptr, float32x2x4_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.2S - Vt4.2S},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.2S 

val.val[2] → Vt3.2S 

val.val[1] → Vt2.2S 

val.val[0] → Vt.2S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_f32_x4 (float32_t * ptr, float32x4x4_t val)Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.4S - Vt4.4S},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.4S 

val.val[2] → Vt3.4S 

val.val[1] → Vt2.4S 

val.val[0] → Vt.4S

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
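
As a minimal sketch of how vst1q_f32_x4 might be called, the helper below copies 16 floats through four Q registers. The name store_16_f32 is hypothetical, and the intrinsic path assumes an AArch64 toolchain whose arm_neon.h provides the _x4 forms (some older compilers do not). On other targets a scalar fallback writes the same layout, so the sketch still compiles and runs.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper (not part of the reference): copies 16 floats from
 * src to dst through four Q registers. */
static void store_16_f32(float *dst, const float *src)
{
    assert(dst != NULL && src != NULL);
#if defined(__aarch64__) && defined(__ARM_NEON)
    float32x4x4_t val;
    for (int r = 0; r < 4; ++r)
        val.val[r] = vld1q_f32(src + 4 * r);  /* val[r] holds src[4r..4r+3] */
    vst1q_f32_x4(dst, val);                   /* one store of four registers */
#else
    /* Portable fallback with the same memory layout: val[0] first, then
     * val[1], val[2], val[3], with no interleaving between registers. */
    for (int r = 0; r < 4; ++r)
        for (int i = 0; i < 4; ++i)
            dst[4 * r + i] = src[4 * r + i];
#endif
}
```

The destination must have room for all 16 elements, since the four registers are written back to back starting at ptr.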

void vst1_p8_x4 (poly8_t * ptr, poly8x8x4_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.8B - Vt4.8B},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.8B 

val.val[2] → Vt3.8B 

val.val[1] → Vt2.8B 

val.val[0] → Vt.8B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
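
The memory layout written by ST1 {Vt.8B - Vt4.8B},[Xn] can be illustrated with a scalar model. The sketch below is an illustration only, not Arm's pseudocode, and the function names are hypothetical. It writes four 8-byte registers back to back and, for contrast, shows the interleaved layout that an ST4 of the same registers would produce.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model: regs holds four 8-byte registers, register r at regs + 8*r.
 * ST1 with four registers writes them consecutively, so element e of
 * register r lands at byte offset 8*r + e. */
static void model_st1_x4(uint8_t *out, const uint8_t *regs)
{
    for (int r = 0; r < 4; ++r)
        for (int e = 0; e < 8; ++e)
            out[8 * r + e] = regs[8 * r + e];
}

/* For contrast: ST4 interleaves, so element e of register r lands at
 * byte offset 4*e + r. */
static void model_st4(uint8_t *out, const uint8_t *regs)
{
    for (int e = 0; e < 8; ++e)
        for (int r = 0; r < 4; ++r)
            out[4 * e + r] = regs[8 * r + e];
}
```

With the _x4 store, val.val[0] occupies the first 8 bytes at ptr, val.val[1] the next 8, and so on, which is the same result as four separate vst1_p8 calls at increasing offsets.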

void vst1q_p8_x4 (poly8_t * ptr, poly8x16x4_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.16B - Vt4.16B},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.16B 

val.val[2] → Vt3.16B 

val.val[1] → Vt2.16B 

val.val[0] → Vt.16B

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_p16_x4 (poly16_t * ptr, poly16x4x4_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.4H - Vt4.4H},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.4H 

val.val[2] → Vt3.4H 

val.val[1] → Vt2.4H 

val.val[0] → Vt.4H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_p16_x4 (poly16_t * ptr, poly16x8x4_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.8H - Vt4.8H},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.8H 

val.val[2] → Vt3.8H 

val.val[1] → Vt2.8H 

val.val[0] → Vt.8H

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_s64_x4 (int64_t * ptr, int64x1x4_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D - Vt4.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.1D 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_u64_x4 (uint64_t * ptr, uint64x1x4_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D - Vt4.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.1D 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1_p64_x4 (poly64_t * ptr, poly64x1x4_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D - Vt4.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.1D 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

void vst1q_s64_x4 (int64_t * ptr, int64x2x4_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.2D - Vt4.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.2D 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_u64_x4 (uint64_t * ptr, uint64x2x4_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.2D - Vt4.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.2D 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

void vst1q_p64_x4 (poly64_t * ptr, poly64x2x4_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.2D - Vt4.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.2D 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

void vst1_f64_x4 (float64_t * ptr, float64x1x4_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.1D - Vt4.1D},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.1D 

val.val[2] → Vt3.1D 

val.val[1] → Vt2.1D 

val.val[0] → Vt.1D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

void vst1q_f64_x4 (float64_t * ptr, float64x2x4_t val) Store multiple single-element structures from one, two, three, or four registers

Description

Store multiple single-element structures from one, two, three, or four registers. This instruction stores elements to memory from one, two, three, or four SIMD&FP registers, without interleaving. Every element of each register is stored.

A64 Instruction

ST1 {Vt.2D - Vt4.2D},[Xn]

Argument Preparation

ptr → Xn 

val.val[3] → Vt4.2D 

val.val[2] → Vt3.2D 

val.val[1] → Vt2.2D 

val.val[0] → Vt.2D

Results

void → result

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
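
Because the float64 forms are listed for A64 only, code that uses them usually needs a compile-time guard. The sketch below is one possible way to do that, under stated assumptions: the helper name fill_pairs_f64 is hypothetical, and the guard assumes the toolchain defines __aarch64__ and __ARM_NEON and ships an arm_neon.h with the _x4 forms. It broadcasts four doubles into four 2-lane registers and stores them with vst1q_f64_x4, falling back to scalar code elsewhere.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: writes v[0], v[0], v[1], v[1], v[2], v[2], v[3], v[3]
 * to dst (8 doubles in total). */
static void fill_pairs_f64(double *dst, const double *v)
{
    assert(dst != NULL && v != NULL);
#if defined(__aarch64__) && defined(__ARM_NEON)
    float64x2x4_t val;
    for (int r = 0; r < 4; ++r)
        val.val[r] = vdupq_n_f64(v[r]);  /* both lanes of register r = v[r] */
    vst1q_f64_x4(dst, val);              /* registers stored back to back */
#else
    /* Scalar fallback for targets without the A64 float64 vector types. */
    for (int r = 0; r < 4; ++r) {
        dst[2 * r] = v[r];
        dst[2 * r + 1] = v[r];
    }
#endif
}
```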

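int8x8x2_t vld1_s8_x2 (int8_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.

A64 Instruction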

LD1 {Vt.8B - Vt2.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
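
A short sketch of how vld1_s8_x2 might be used: the helper below reads 16 consecutive int8_t values and returns them as two 8-lane halves. The name load_pair_s8 is hypothetical, and the intrinsic path assumes an AArch64 toolchain whose arm_neon.h provides the _x2 forms. Other targets take a memcpy fallback with the same result.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: lo receives src[0..7], hi receives src[8..15]. */
static void load_pair_s8(const int8_t *src, int8_t *lo, int8_t *hi)
{
    assert(src != NULL && lo != NULL && hi != NULL);
#if defined(__aarch64__) && defined(__ARM_NEON)
    int8x8x2_t r = vld1_s8_x2(src);  /* r.val[0] = first 8 bytes, r.val[1] = next 8 */
    vst1_s8(lo, r.val[0]);
    vst1_s8(hi, r.val[1]);
#else
    /* Portable fallback: the load is not de-interleaving, so the two
     * registers are simply the two consecutive 8-byte halves. */
    memcpy(lo, src, 8);
    memcpy(hi, src + 8, 8);
#endif
}
```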

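int8x16x2_t vld1q_s8_x2 (int8_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.

A64 Instruction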

LD1 {Vt.16B - Vt2.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

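int16x4x2_t vld1_s16_x2 (int16_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.

A64 Instruction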

LD1 {Vt.4H - Vt2.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

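int16x8x2_t vld1q_s16_x2 (int16_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.

A64 Instruction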

LD1 {Vt.8H - Vt2.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

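int32x2x2_t vld1_s32_x2 (int32_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.

A64 Instruction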

LD1 {Vt.2S - Vt2.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

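int32x4x2_t vld1q_s32_x2 (int32_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.

A64 Instruction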

LD1 {Vt.4S - Vt2.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
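
One place a two-register load like vld1q_s32_x2 can help is a small reduction. The sketch below sums eight consecutive int32_t values. The helper name sum8_s32 is hypothetical; the intrinsic path assumes an AArch64 toolchain (vaddvq_s32 is an A64 reduction intrinsic) with the _x2 forms available, and a scalar loop is used elsewhere. The sum is assumed to fit in int32_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: returns src[0] + ... + src[7], assuming no overflow. */
static int32_t sum8_s32(const int32_t *src)
{
    assert(src != NULL);
#if defined(__aarch64__) && defined(__ARM_NEON)
    int32x4x2_t r = vld1q_s32_x2(src);            /* val[0] = src[0..3], val[1] = src[4..7] */
    int32x4_t s = vaddq_s32(r.val[0], r.val[1]);  /* lane-wise sum of the two registers */
    return vaddvq_s32(s);                         /* add the four lanes together */
#else
    int32_t total = 0;
    for (int i = 0; i < 8; ++i)
        total += src[i];
    return total;
#endif
}
```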

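uint8x8x2_t vld1_u8_x2 (uint8_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.

A64 Instruction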

LD1 {Vt.8B - Vt2.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

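uint8x16x2_t vld1q_u8_x2 (uint8_t const * ptr)
Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. This intrinsic uses the two-register form: result.val[0] receives the first 16 elements starting at ptr and result.val[1] receives the following 16, without de-interleaving.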

A64 Instruction

LD1 {Vt.16B - Vt2.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
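
A typical use of the 16B two-register form is to consume 32 bytes per loop iteration. The sketch below is an illustration only: the function name xor_into is hypothetical, and the intrinsic path is limited to AArch64 builds as a conservative assumption. Bytes that do not fill a whole 32-byte block, and every byte on other targets, go through the scalar loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: dst[i] ^= src[i] for i in [0, len). */
void xor_into(uint8_t *dst, const uint8_t *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    size_t i = 0;
#if defined(__aarch64__) && defined(__ARM_NEON)
    /* Each vld1q_u8_x2 call fetches 32 consecutive bytes into two registers. */
    for (; i + 32 <= len; i += 32) {
        uint8x16x2_t a = vld1q_u8_x2(dst + i);
        uint8x16x2_t b = vld1q_u8_x2(src + i);
        vst1q_u8(dst + i,      veorq_u8(a.val[0], b.val[0]));
        vst1q_u8(dst + i + 16, veorq_u8(a.val[1], b.val[1]));
    }
#endif
    /* Scalar tail; on non-NEON builds this loop handles the whole buffer. */
    for (; i < len; ++i)
        dst[i] ^= src[i];
}
```

Because val[0] and val[1] hold adjacent halves of the block, the two stores simply write back to offsets 0 and 16 of the same block.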

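uint16x4x2_t vld1_u16_x2 (uint16_t const * ptr)
Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. This intrinsic uses the two-register form: result.val[0] receives the first 4 elements starting at ptr and result.val[1] receives the following 4, without de-interleaving.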

A64 Instruction

LD1 {Vt.4H - Vt2.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

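uint16x8x2_t vld1q_u16_x2 (uint16_t const * ptr)
Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. This intrinsic uses the two-register form: result.val[0] receives the first 8 elements starting at ptr and result.val[1] receives the following 8, without de-interleaving.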

A64 Instruction

LD1 {Vt.8H - Vt2.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
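
The two-register LD1 form is easy to confuse with the LD2 structure load, which returns the same uint16x8x2_t type but de-interleaves its input. The sketch below contrasts the two layouts. The function names are hypothetical, vld2q_u16 is brought in only for comparison, and the intrinsic paths are limited to AArch64 builds as a conservative assumption; other targets run a plain-C model of each layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Layout of vld1q_u16_x2: val[0] = src[0..7], val[1] = src[8..15]. */
void split_contiguous_u16(const uint16_t *src, uint16_t first[8], uint16_t second[8])
{
    assert(src != NULL && first != NULL && second != NULL);
#if defined(__aarch64__) && defined(__ARM_NEON)
    uint16x8x2_t r = vld1q_u16_x2(src);
    vst1q_u16(first, r.val[0]);
    vst1q_u16(second, r.val[1]);
#else
    for (int i = 0; i < 8; ++i) {
        first[i] = src[i];
        second[i] = src[8 + i];
    }
#endif
}

/* Layout of vld2q_u16: val[0] = even-indexed, val[1] = odd-indexed elements. */
void split_interleaved_u16(const uint16_t *src, uint16_t even[8], uint16_t odd[8])
{
    assert(src != NULL && even != NULL && odd != NULL);
#if defined(__aarch64__) && defined(__ARM_NEON)
    uint16x8x2_t r = vld2q_u16(src);
    vst1q_u16(even, r.val[0]);
    vst1q_u16(odd, r.val[1]);
#else
    for (int i = 0; i < 8; ++i) {
        even[i] = src[2 * i];
        odd[i] = src[2 * i + 1];
    }
#endif
}
```

Both helpers read the same 16 elements; only the assignment of elements to the two result registers differs.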

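uint32x2x2_t vld1_u32_x2 (uint32_t const * ptr)
Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. This intrinsic uses the two-register form: result.val[0] receives the first 2 elements starting at ptr and result.val[1] receives the following 2, without de-interleaving.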

A64 Instruction

LD1 {Vt.2S - Vt2.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

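uint32x4x2_t vld1q_u32_x2 (uint32_t const * ptr)
Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. This intrinsic uses the two-register form: result.val[0] receives the first 4 elements starting at ptr and result.val[1] receives the following 4, without de-interleaving.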

A64 Instruction

LD1 {Vt.4S - Vt2.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

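float16x4x2_t vld1_f16_x2 (float16_t const * ptr)
Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. This intrinsic uses the two-register form: result.val[0] receives the first 4 elements starting at ptr and result.val[1] receives the following 4, without de-interleaving.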

A64 Instruction

LD1 {Vt.4H - Vt2.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

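float16x8x2_t vld1q_f16_x2 (float16_t const * ptr)
Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. This intrinsic uses the two-register form: result.val[0] receives the first 8 elements starting at ptr and result.val[1] receives the following 8, without de-interleaving.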

A64 Instruction

LD1 {Vt.8H - Vt2.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

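float32x2x2_t vld1_f32_x2 (float32_t const * ptr)
Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. This intrinsic uses the two-register form: result.val[0] receives the first 2 elements starting at ptr and result.val[1] receives the following 2, without de-interleaving.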

A64 Instruction

LD1 {Vt.2S - Vt2.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

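float32x4x2_t vld1q_f32_x2 (float32_t const * ptr)
Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. This intrinsic uses the two-register form: result.val[0] receives the first 4 elements starting at ptr and result.val[1] receives the following 4, without de-interleaving.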

A64 Instruction

LD1 {Vt.4S - Vt2.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
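
As a small worked use of the 4S two-register form, the sketch below sums eight consecutive floats. It is an illustration under stated assumptions: the function name sum8_f32 is hypothetical, and the intrinsic path is limited to AArch64 builds because it also relies on the A64-only reduction vaddvq_f32. Other targets use a scalar loop, whose floating-point summation order differs from the vector path.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: returns p[0] + p[1] + ... + p[7]. */
float sum8_f32(const float *p)
{
    assert(p != NULL);
#if defined(__aarch64__) && defined(__ARM_NEON)
    float32x4x2_t v = vld1q_f32_x2(p);               /* val[0] = p[0..3], val[1] = p[4..7] */
    float32x4_t pairs = vaddq_f32(v.val[0], v.val[1]); /* lane i = p[i] + p[i + 4] */
    return vaddvq_f32(pairs);                         /* add the four lanes together */
#else
    float total = 0.0f;
    for (int i = 0; i < 8; ++i)
        total += p[i];
    return total;
#endif
}
```

For inputs whose partial sums are exactly representable, such as small integers, both paths return the same value; in general the results may differ by rounding.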

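poly8x8x2_t vld1_p8_x2 (poly8_t const * ptr)
Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. This intrinsic uses the two-register form: result.val[0] receives the first 8 elements starting at ptr and result.val[1] receives the following 8, without de-interleaving.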

A64 Instruction

LD1 {Vt.8B - Vt2.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

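poly8x16x2_t vld1q_p8_x2 (poly8_t const * ptr)
Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. This intrinsic uses the two-register form: result.val[0] receives the first 16 elements starting at ptr and result.val[1] receives the following 16, without de-interleaving.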

A64 Instruction

LD1 {Vt.16B - Vt2.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

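poly16x4x2_t vld1_p16_x2 (poly16_t const * ptr)
Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. This intrinsic uses the two-register form: result.val[0] receives the first 4 elements starting at ptr and result.val[1] receives the following 4, without de-interleaving.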

A64 Instruction

LD1 {Vt.4H - Vt2.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

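poly16x8x2_t vld1q_p16_x2 (poly16_t const * ptr)
Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. This intrinsic uses the two-register form: result.val[0] receives the first 8 elements starting at ptr and result.val[1] receives the following 8, without de-interleaving.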

A64 Instruction

LD1 {Vt.8H - Vt2.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

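int64x1x2_t vld1_s64_x2 (int64_t const * ptr)
Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. This intrinsic uses the two-register form: result.val[0] receives the element at ptr and result.val[1] receives the element that follows it.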

A64 Instruction

LD1 {Vt.1D - Vt2.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

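uint64x1x2_t vld1_u64_x2 (uint64_t const * ptr)
Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. This intrinsic uses the two-register form: result.val[0] receives the element at ptr and result.val[1] receives the element that follows it.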

A64 Instruction

LD1 {Vt.1D - Vt2.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
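
With the 1D arrangement each result register holds a single 64-bit element, so the two-register load amounts to fetching two adjacent values. The sketch below illustrates this; the function name load_two_u64 is hypothetical, and the intrinsic path is limited to AArch64 builds as a conservative assumption, with a plain-C equivalent elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: reads src[0] into *a and src[1] into *b. */
void load_two_u64(const uint64_t *src, uint64_t *a, uint64_t *b)
{
    assert(src != NULL && a != NULL && b != NULL);
#if defined(__aarch64__) && defined(__ARM_NEON)
    uint64x1x2_t r = vld1_u64_x2(src);
    *a = vget_lane_u64(r.val[0], 0); /* the only lane of Vt.1D */
    *b = vget_lane_u64(r.val[1], 0); /* the only lane of Vt2.1D */
#else
    *a = src[0];
    *b = src[1];
#endif
}
```

Lane index 0 is the only valid index here, because a uint64x1_t vector has exactly one lane.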

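poly64x1x2_t vld1_p64_x2 (poly64_t const * ptr)
Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. This intrinsic uses the two-register form: result.val[0] receives the element at ptr and result.val[1] receives the element that follows it.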

A64 Instruction

LD1 {Vt.1D - Vt2.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

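int64x2x2_t vld1q_s64_x2 (int64_t const * ptr)
Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. This intrinsic uses the two-register form: result.val[0] receives the first 2 elements starting at ptr and result.val[1] receives the following 2, without de-interleaving.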

A64 Instruction

LD1 {Vt.2D - Vt2.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

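uint64x2x2_t vld1q_u64_x2 (uint64_t const * ptr)
Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. This intrinsic uses the two-register form: result.val[0] receives the first 2 elements starting at ptr and result.val[1] receives the following 2, without de-interleaving.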

A64 Instruction

LD1 {Vt.2D - Vt2.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

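poly64x2x2_t vld1q_p64_x2 (poly64_t const * ptr)
Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. This intrinsic uses the two-register form: result.val[0] receives the first 2 elements starting at ptr and result.val[1] receives the following 2, without de-interleaving.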

A64 Instruction

LD1 {Vt.2D - Vt2.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

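float64x1x2_t vld1_f64_x2 (float64_t const * ptr)
Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. This intrinsic uses the two-register form: result.val[0] receives the element at ptr and result.val[1] receives the element that follows it.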

A64 Instruction

LD1 {Vt.1D - Vt2.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
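
As a rough illustration of the Results mapping above, the following portable C sketch models what vld1_f64_x2 returns without using <arm_neon.h>. The type model_f64x1x2 and the function model_vld1_f64_x2 are hypothetical names introduced here for illustration only; they are not part of the NEON API.

```c
#include <assert.h>
#include <stddef.h>

enum { F64X1_REGS = 2, F64X1_LANES = 1 };

/* Hypothetical scalar stand-in for float64x1x2_t:
 * two registers with one 64-bit lane each. */
typedef struct {
    double val[F64X1_REGS][F64X1_LANES];
} model_f64x1x2;

/* Models LD1 {Vt.1D - Vt2.1D},[Xn]:
 * consecutive doubles fill val[0] first, then val[1]. */
model_f64x1x2 model_vld1_f64_x2(const double *ptr)
{
    model_f64x1x2 r;
    assert(ptr != NULL);
    for (size_t reg = 0; reg < F64X1_REGS; ++reg) {
        for (size_t lane = 0; lane < F64X1_LANES; ++lane) {
            r.val[reg][lane] = ptr[reg * F64X1_LANES + lane];
        }
    }
    return r;
}
```

Under this model, a call on a buffer holding {1.5, -2.25} places 1.5 in val[0] and -2.25 in val[1], matching Vt.1D → result.val[0] and Vt2.1D → result.val[1].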

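float64x2x2_t vld1q_f64_x2 (float64_t const * ptr)

Load multiple single-element structures to two registers

Description

Load multiple single-element structures to two registers. This instruction loads multiple single-element structures from memory and writes the result to two SIMD&FP registers. Elements are read from consecutive memory locations starting at ptr and are not de-interleaved.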

A64 Instruction

LD1 {Vt.2D - Vt2.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64
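
The final "if wback" step of the Operation pseudocode above computes the value written back to the base register. A minimal C sketch of that address arithmetic follows; the function name model_writeback_address and its parameters are hypothetical. The instruction form listed under A64 Instruction has no post-index operand, so this step presumably does not apply to the intrinsic itself and is shown only to make the pseudocode easier to follow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the "if wback" step of the Operation pseudocode.
 * address: base address read from X[n] or SP
 * offs:    bytes transferred by the load loop
 * m:       index of the offset register; 31 selects the immediate form
 * xm:      value of X[m], ignored when m == 31 */
uint64_t model_writeback_address(uint64_t address, uint64_t offs,
                                 unsigned m, uint64_t xm)
{
    assert(m <= 31);
    if (m != 31) {
        offs = xm;          /* register post-index: step by X[m] */
    }
    return address + offs;  /* value written back to X[n] or SP */
}
```

With m == 31 the base advances by the number of bytes transferred; with any other m it advances by the contents of X[m], whatever the transfer size was.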

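int8x8x3_t vld1_s8_x3 (int8_t const * ptr)

Load multiple single-element structures to three registers

Description

Load multiple single-element structures to three registers. This instruction loads multiple single-element structures from memory and writes the result to three SIMD&FP registers. Elements are read from consecutive memory locations starting at ptr and are not de-interleaved.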

A64 Instruction

LD1 {Vt.8B - Vt3.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
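
To make the register layout concrete, here is a small scalar model of vld1_s8_x3 in portable C. It assumes the load is contiguous: the first 8 bytes go to val[0], the next 8 to val[1], and the last 8 to val[2]. The names model_s8x8x3 and model_vld1_s8_x3 are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { S8X8_REGS = 3, S8X8_LANES = 8 };

/* Hypothetical scalar stand-in for int8x8x3_t. */
typedef struct {
    int8_t val[S8X8_REGS][S8X8_LANES];
} model_s8x8x3;

/* Models LD1 {Vt.8B - Vt3.8B},[Xn]:
 * 24 consecutive bytes, 8 per register, no de-interleaving. */
model_s8x8x3 model_vld1_s8_x3(const int8_t *ptr)
{
    model_s8x8x3 r;
    assert(ptr != NULL);
    for (size_t reg = 0; reg < S8X8_REGS; ++reg) {
        for (size_t lane = 0; lane < S8X8_LANES; ++lane) {
            r.val[reg][lane] = ptr[reg * S8X8_LANES + lane];
        }
    }
    return r;
}
```

In this model, lane i of register r always holds ptr[8 * r + i], so a buffer filled with 0..23 yields val[1][0] == 8 and val[2][7] == 23.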

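int8x16x3_t vld1q_s8_x3 (int8_t const * ptr)

Load multiple single-element structures to three registers

Description

Load multiple single-element structures to three registers. This instruction loads multiple single-element structures from memory and writes the result to three SIMD&FP registers. Elements are read from consecutive memory locations starting at ptr and are not de-interleaved.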

A64 Instruction

LD1 {Vt.16B - Vt3.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

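int16x4x3_t vld1_s16_x3 (int16_t const * ptr)

Load multiple single-element structures to three registers

Description

Load multiple single-element structures to three registers. This instruction loads multiple single-element structures from memory and writes the result to three SIMD&FP registers. Elements are read from consecutive memory locations starting at ptr and are not de-interleaved.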

A64 Instruction

LD1 {Vt.4H - Vt3.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

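int16x8x3_t vld1q_s16_x3 (int16_t const * ptr)

Load multiple single-element structures to three registers

Description

Load multiple single-element structures to three registers. This instruction loads multiple single-element structures from memory and writes the result to three SIMD&FP registers. Elements are read from consecutive memory locations starting at ptr and are not de-interleaved.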

A64 Instruction

LD1 {Vt.8H - Vt3.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

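int32x2x3_t vld1_s32_x3 (int32_t const * ptr)

Load multiple single-element structures to three registers

Description

Load multiple single-element structures to three registers. This instruction loads multiple single-element structures from memory and writes the result to three SIMD&FP registers. Elements are read from consecutive memory locations starting at ptr and are not de-interleaved.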

A64 Instruction

LD1 {Vt.2S - Vt3.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

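int32x4x3_t vld1q_s32_x3 (int32_t const * ptr)

Load multiple single-element structures to three registers

Description

Load multiple single-element structures to three registers. This instruction loads multiple single-element structures from memory and writes the result to three SIMD&FP registers. Elements are read from consecutive memory locations starting at ptr and are not de-interleaved.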

A64 Instruction

LD1 {Vt.4S - Vt3.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
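
Each vld1q_s32_x3 call fills three 4-lane registers, so it consumes 12 int32_t elements (48 bytes) at a time. The sketch below shows one way a caller might split a buffer into full 12-element blocks plus a scalar tail. It is a portable model only: the inner loop stands in for the intrinsic call, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Elements consumed by one vld1q_s32_x3 call: 3 registers x 4 lanes. */
enum { S32_X3_ELEMS = 3 * 4 };

/* Bytes read by a single call: 12 elements of 4 bytes each. */
size_t model_bytes_s32_x3(void)
{
    return S32_X3_ELEMS * sizeof(int32_t);
}

/* Elements left over for a scalar tail loop. */
size_t model_tail_elems_s32_x3(size_t n)
{
    return n % S32_X3_ELEMS;
}

/* Sums n elements using the block-plus-tail structure. */
int64_t model_sum_blocks_s32_x3(const int32_t *p, size_t n)
{
    int64_t sum = 0;
    size_t i = 0;
    assert(p != NULL || n == 0);
    /* Main loop: each iteration stands in for one vld1q_s32_x3 call. */
    for (; i + S32_X3_ELEMS <= n; i += S32_X3_ELEMS) {
        for (size_t k = 0; k < S32_X3_ELEMS; ++k) {
            sum += p[i + k];
        }
    }
    /* Scalar tail: fewer than 12 elements remain. */
    for (; i < n; ++i) {
        sum += p[i];
    }
    return sum;
}
```

The main-loop condition i + 12 <= n keeps every modelled block inside the buffer; the remaining n % 12 elements are handled one at a time.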

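uint8x8x3_t vld1_u8_x3 (uint8_t const * ptr)

Load multiple single-element structures to three registers

Description

Load multiple single-element structures to three registers. This instruction loads multiple single-element structures from memory and writes the result to three SIMD&FP registers. Elements are read from consecutive memory locations starting at ptr and are not de-interleaved.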

A64 Instruction

LD1 {Vt.8B - Vt3.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

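uint8x16x3_t vld1q_u8_x3 (uint8_t const * ptr)

Load multiple single-element structures to three registers

Description

Load multiple single-element structures to three registers. This instruction loads multiple single-element structures from memory and writes the result to three SIMD&FP registers. Elements are read from consecutive memory locations starting at ptr and are not de-interleaved.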

A64 Instruction

LD1 {Vt.16B - Vt3.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

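uint16x4x3_t vld1_u16_x3 (uint16_t const * ptr)

Load multiple single-element structures to three registers

Description

Load multiple single-element structures to three registers. This instruction loads multiple single-element structures from memory and writes the result to three SIMD&FP registers. Elements are read from consecutive memory locations starting at ptr and are not de-interleaved.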

A64 Instruction

LD1 {Vt.4H - Vt3.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
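
The non-replicating branch of the Operation pseudocode above writes one element into one lane with Elem[rval, index, esize] and leaves the other lanes unchanged. A minimal C sketch of that lane insert follows, modelling the 128-bit register as a 16-byte array. The names model_vreg and model_elem_insert are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 128-bit SIMD&FP register as 16 bytes. */
typedef struct {
    uint8_t bytes[16];
} model_vreg;

/* Models Elem[rval, index, esize] = Mem[...]:
 * overwrite one lane with ebytes bytes from memory, keep the rest.
 * ebytes corresponds to esize DIV 8 in the pseudocode. */
void model_elem_insert(model_vreg *rval, unsigned index, unsigned ebytes,
                       const void *src)
{
    assert(ebytes == 1 || ebytes == 2 || ebytes == 4 || ebytes == 8);
    assert((index + 1) * ebytes <= sizeof rval->bytes);
    memcpy(&rval->bytes[index * ebytes], src, ebytes);
}
```

For a 16-bit element at index 3, the sketch touches only bytes 6 and 7 of the modelled register.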

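uint16x8x3_t vld1q_u16_x3 (uint16_t const * ptr)

Load multiple single-element structures to three registers

Description

Load multiple single-element structures to three registers. This instruction loads multiple single-element structures from memory and writes the result to three SIMD&FP registers. Elements are read from consecutive memory locations starting at ptr and are not de-interleaved.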

A64 Instruction

LD1 {Vt.8H - Vt3.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

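uint32x2x3_t vld1_u32_x3 (uint32_t const * ptr)

Load multiple single-element structures to three registers

Description

Load multiple single-element structures to three registers. This instruction loads multiple single-element structures from memory and writes the result to three SIMD&FP registers. Elements are read from consecutive memory locations starting at ptr and are not de-interleaved.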

A64 Instruction

LD1 {Vt.2S - Vt3.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

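uint32x4x3_t vld1q_u32_x3 (uint32_t const * ptr)

Load multiple single-element structures to three registers

Description

Load multiple single-element structures to three registers. This instruction loads multiple single-element structures from memory and writes the result to three SIMD&FP registers. Elements are read from consecutive memory locations starting at ptr and are not de-interleaved.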

A64 Instruction

LD1 {Vt.4S - Vt3.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

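float16x4x3_t vld1_f16_x3 (float16_t const * ptr)

Load multiple single-element structures to three registers

Description

Load multiple single-element structures to three registers. This instruction loads multiple single-element structures from memory and writes the result to three SIMD&FP registers. Elements are read from consecutive memory locations starting at ptr and are not de-interleaved.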

A64 Instruction

LD1 {Vt.4H - Vt3.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

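float16x8x3_t vld1q_f16_x3 (float16_t const * ptr)

Load multiple single-element structures to three registers

Description

Load multiple single-element structures to three registers. This instruction loads multiple single-element structures from memory and writes the result to three SIMD&FP registers. Elements are read from consecutive memory locations starting at ptr and are not de-interleaved.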

A64 Instruction

LD1 {Vt.8H - Vt3.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

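float32x2x3_t vld1_f32_x3 (float32_t const * ptr)

Load multiple single-element structures to three registers

Description

Load multiple single-element structures to three registers. This instruction loads multiple single-element structures from memory and writes the result to three SIMD&FP registers. Elements are read from consecutive memory locations starting at ptr and are not de-interleaved.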

A64 Instruction

LD1 {Vt.2S - Vt3.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
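
The replicate branch of the Operation pseudocode above fills every lane of a register with a single loaded element via Replicate(element, datasize DIV esize). The following C sketch models that step on a 16-byte register image. The names model_vreg128 and model_replicate are hypothetical, and the sketch assumes that a 64-bit register write clears the upper 64 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 128-bit SIMD&FP register as 16 bytes. */
typedef struct {
    uint8_t bytes[16];
} model_vreg128;

/* Models V[t] = Replicate(element, datasize DIV esize).
 * datasize_bytes is 8 or 16; ebytes corresponds to esize DIV 8.
 * Assumption: a 64-bit write clears the upper half of the register. */
void model_replicate(model_vreg128 *dst, unsigned datasize_bytes,
                     unsigned ebytes, const void *element)
{
    assert(datasize_bytes == 8 || datasize_bytes == 16);
    assert(ebytes == 1 || ebytes == 2 || ebytes == 4 || ebytes == 8);
    memset(dst->bytes, 0, sizeof dst->bytes);
    for (unsigned lane = 0; lane < datasize_bytes / ebytes; ++lane) {
        memcpy(&dst->bytes[lane * ebytes], element, ebytes);
    }
}
```

With datasize_bytes of 8 and a 4-byte float, the model writes two copies of the element into the low half and leaves bytes 8 to 15 zero.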

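float32x4x3_t vld1q_f32_x3 (float32_t const * ptr)

Load multiple single-element structures to three registers

Description

Load multiple single-element structures to three registers. This instruction loads multiple single-element structures from memory and writes the result to three SIMD&FP registers. Elements are read from consecutive memory locations starting at ptr and are not de-interleaved.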

A64 Instruction

LD1 {Vt.4S - Vt3.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

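poly8x8x3_t vld1_p8_x3 (poly8_t const * ptr)

Load multiple single-element structures to three registers

Description

Load multiple single-element structures to three registers. This instruction loads multiple single-element structures from memory and writes the result to three SIMD&FP registers. Elements are read from consecutive memory locations starting at ptr and are not de-interleaved.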

A64 Instruction

LD1 {Vt.8B - Vt3.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

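poly8x16x3_t vld1q_p8_x3 (poly8_t const * ptr)

Load multiple single-element structures to three registers

Description

Load multiple single-element structures to three registers. This instruction loads multiple single-element structures from memory and writes the result to three SIMD&FP registers. Elements are read from consecutive memory locations starting at ptr and are not de-interleaved.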

A64 Instruction

LD1 {Vt.16B - Vt3.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

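poly16x4x3_t vld1_p16_x3 (poly16_t const * ptr)

Load multiple single-element structures to three registers

Description

Load multiple single-element structures to three registers. This instruction loads multiple single-element structures from memory and writes the result to three SIMD&FP registers. Elements are read from consecutive memory locations starting at ptr and are not de-interleaved.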

A64 Instruction

LD1 {Vt.4H - Vt3.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

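poly16x8x3_t vld1q_p16_x3 (poly16_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. For this intrinsic, three consecutive 128-bit vectors are read from ptr into result.val[0] through result.val[2], in memory order and without de-interleaving.
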
A64 Instruction

LD1 {Vt.8H - Vt3.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

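int64x1x3_t vld1_s64_x3 (int64_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. For this intrinsic, three consecutive 64-bit vectors are read from ptr into result.val[0] through result.val[2], in memory order and without de-interleaving.
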
A64 Instruction

LD1 {Vt.1D - Vt3.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
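
With the .1D arrangement each register holds a single 64-bit lane, so the three-register load amounts to reading three consecutive int64_t values. The sketch below models that under stated assumptions: model_s64x1x3 and model_vld1_s64_x3 are hypothetical illustration names, not arm_neon.h declarations. On a NEON target the real call would be vld1_s64_x3(ptr).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for int64x1x3_t: three vectors of one lane each. */
typedef struct {
    int64_t val[3][1];
} model_s64x1x3;

/* Scalar model (an assumption, not the library implementation) of
 * LD1 {Vt.1D - Vt3.1D},[Xn]: element k in memory becomes lane 0 of val[k]. */
static model_s64x1x3 model_vld1_s64_x3(const int64_t *ptr)
{
    model_s64x1x3 r;
    assert(ptr != NULL);
    for (size_t reg = 0; reg < 3; ++reg) {
        r.val[reg][0] = ptr[reg];
    }
    return r;
}
```

The same shape applies to the uint64_t, poly64_t and float64_t one-lane variants, with only the element type changing.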

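uint64x1x3_t vld1_u64_x3 (uint64_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. For this intrinsic, three consecutive 64-bit vectors are read from ptr into result.val[0] through result.val[2], in memory order and without de-interleaving.
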
A64 Instruction

LD1 {Vt.1D - Vt3.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

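poly64x1x3_t vld1_p64_x3 (poly64_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. For this intrinsic, three consecutive 64-bit vectors are read from ptr into result.val[0] through result.val[2], in memory order and without de-interleaving.
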
A64 Instruction

LD1 {Vt.1D - Vt3.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

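int64x2x3_t vld1q_s64_x3 (int64_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. For this intrinsic, three consecutive 128-bit vectors are read from ptr into result.val[0] through result.val[2], in memory order and without de-interleaving.
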
A64 Instruction

LD1 {Vt.2D - Vt3.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

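uint64x2x3_t vld1q_u64_x3 (uint64_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. For this intrinsic, three consecutive 128-bit vectors are read from ptr into result.val[0] through result.val[2], in memory order and without de-interleaving.
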
A64 Instruction

LD1 {Vt.2D - Vt3.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

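poly64x2x3_t vld1q_p64_x3 (poly64_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. For this intrinsic, three consecutive 128-bit vectors are read from ptr into result.val[0] through result.val[2], in memory order and without de-interleaving.
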
A64 Instruction

LD1 {Vt.2D - Vt3.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

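float64x1x3_t vld1_f64_x3 (float64_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. For this intrinsic, three consecutive 64-bit vectors are read from ptr into result.val[0] through result.val[2], in memory order and without de-interleaving.
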
A64 Instruction

LD1 {Vt.1D - Vt3.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

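float64x2x3_t vld1q_f64_x3 (float64_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. For this intrinsic, three consecutive 128-bit vectors are read from ptr into result.val[0] through result.val[2], in memory order and without de-interleaving.
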
A64 Instruction

LD1 {Vt.2D - Vt3.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

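int8x8x4_t vld1_s8_x4 (int8_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. For this intrinsic, four consecutive 64-bit vectors are read from ptr into result.val[0] through result.val[3], in memory order and without de-interleaving.
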
A64 Instruction

LD1 {Vt.8B - Vt4.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.8B → result.val[3]
Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

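int8x16x4_t vld1q_s8_x4 (int8_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. For this intrinsic, four consecutive 128-bit vectors are read from ptr into result.val[0] through result.val[3], in memory order and without de-interleaving.
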
A64 Instruction

LD1 {Vt.16B - Vt4.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.16B → result.val[3]
Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

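int16x4x4_t vld1_s16_x4 (int16_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. For this intrinsic, four consecutive 64-bit vectors are read from ptr into result.val[0] through result.val[3], in memory order and without de-interleaving.
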
A64 Instruction

LD1 {Vt.4H - Vt4.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.4H → result.val[3]
Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

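int16x8x4_t vld1q_s16_x4 (int16_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. For this intrinsic, four consecutive 128-bit vectors are read from ptr into result.val[0] through result.val[3], in memory order and without de-interleaving.
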
A64 Instruction

LD1 {Vt.8H - Vt4.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.8H → result.val[3]
Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

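int32x2x4_t vld1_s32_x4 (int32_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. For this intrinsic, four consecutive 64-bit vectors are read from ptr into result.val[0] through result.val[3], in memory order and without de-interleaving.
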
A64 Instruction

LD1 {Vt.2S - Vt4.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.2S → result.val[3]
Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

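int32x4x4_t vld1q_s32_x4 (int32_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. For this intrinsic, four consecutive 128-bit vectors are read from ptr into result.val[0] through result.val[3], in memory order and without de-interleaving.
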
A64 Instruction

LD1 {Vt.4S - Vt4.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.4S → result.val[3]
Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
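
For multi-lane elements, the four-register form can be read as a simple index mapping: lane l of result.val[r] comes from ptr[4*r + l]. The following scalar sketch illustrates that mapping for the .4S arrangement. The names model_s32x4x4, model_vld1q_s32_x4, MODEL_REGS and MODEL_LANES are hypothetical and exist only for this illustration; on a NEON target the real call would be vld1q_s32_x4(ptr).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { MODEL_REGS = 4, MODEL_LANES = 4 };

/* Hypothetical stand-in for int32x4x4_t: four vectors of four lanes each. */
typedef struct {
    int32_t val[MODEL_REGS][MODEL_LANES];
} model_s32x4x4;

/* Scalar model (an assumption, not the library implementation) of
 * LD1 {Vt.4S - Vt4.4S},[Xn]: 16 contiguous int32_t values fill the
 * four registers in memory order, lane 0 first. */
static model_s32x4x4 model_vld1q_s32_x4(const int32_t *ptr)
{
    model_s32x4x4 r;
    assert(ptr != NULL);
    for (size_t reg = 0; reg < MODEL_REGS; ++reg) {
        for (size_t lane = 0; lane < MODEL_LANES; ++lane) {
            r.val[reg][lane] = ptr[reg * MODEL_LANES + lane];
        }
    }
    return r;
}
```

The caller has to provide 16 readable int32_t values (64 bytes) at ptr.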

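uint8x8x4_t vld1_u8_x4 (uint8_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. For this intrinsic, four consecutive 64-bit vectors are read from ptr into result.val[0] through result.val[3], in memory order and without de-interleaving.
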
A64 Instruction

LD1 {Vt.8B - Vt4.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.8B → result.val[3]
Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

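uint8x16x4_t vld1q_u8_x4 (uint8_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. For this intrinsic, four consecutive 128-bit vectors are read from ptr into result.val[0] through result.val[3], in memory order and without de-interleaving.
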
A64 Instruction

LD1 {Vt.16B - Vt4.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.16B → result.val[3]
Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

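uint16x4x4_t vld1_u16_x4 (uint16_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. For this intrinsic, four consecutive 64-bit vectors are read from ptr into result.val[0] through result.val[3], in memory order and without de-interleaving.
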
A64 Instruction

LD1 {Vt.4H - Vt4.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.4H → result.val[3]
Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

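uint16x8x4_t vld1q_u16_x4 (uint16_t const * ptr) Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers. For this intrinsic, four consecutive 128-bit vectors are read from ptr into result.val[0] through result.val[3], in memory order and without de-interleaving.
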
A64 Instruction

LD1 {Vt.8H - Vt4.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.8H → result.val[3]
Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
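
As a rough illustration of the register layout listed under Results, the following portable C sketch models what vld1q_u16_x4 returns: 32 consecutive uint16_t values are read starting at ptr, and val[k] receives elements 8*k to 8*k+7 with no de-interleaving. The names model_u16x8x4 and model_vld1q_u16_x4 are hypothetical and exist only for this example; real code would include <arm_neon.h> and call the intrinsic itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for uint16x8x4_t: four registers of eight lanes. */
typedef struct {
    uint16_t val[4][8];
} model_u16x8x4;

/* Scalar model of vld1q_u16_x4: memory is consumed in order, one register
   after another, so val[k][lane] == ptr[8 * k + lane]. */
static model_u16x8x4 model_vld1q_u16_x4(const uint16_t *ptr)
{
    model_u16x8x4 result;
    assert(ptr != NULL); /* the caller must supply 32 readable elements */
    for (size_t k = 0; k < 4; ++k) {
        for (size_t lane = 0; lane < 8; ++lane) {
            result.val[k][lane] = ptr[8 * k + lane];
        }
    }
    return result;
}
```

Under this model, a buffer holding 100, 101, ..., 131 ends up with 100..107 in val[0], 108..115 in val[1], and so on up to 124..131 in val[3].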

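uint32x2x4_t vld1_u32_x4 (uint32_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.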

A64 Instruction

LD1 {Vt.2S - Vt4.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.2S → result.val[3]
Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

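uint32x4x4_t vld1q_u32_x4 (uint32_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.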

A64 Instruction

LD1 {Vt.4S - Vt4.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.4S → result.val[3]
Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

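float16x4x4_t vld1_f16_x4 (float16_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.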

A64 Instruction

LD1 {Vt.4H - Vt4.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.4H → result.val[3]
Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

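float16x8x4_t vld1q_f16_x4 (float16_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.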

A64 Instruction

LD1 {Vt.8H - Vt4.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.8H → result.val[3]
Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

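float32x2x4_t vld1_f32_x4 (float32_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.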

A64 Instruction

LD1 {Vt.2S - Vt4.2S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.2S → result.val[3]
Vt3.2S → result.val[2]
Vt2.2S → result.val[1]
Vt.2S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

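float32x4x4_t vld1q_f32_x4 (float32_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.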

A64 Instruction

LD1 {Vt.4S - Vt4.4S},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.4S → result.val[3]
Vt3.4S → result.val[2]
Vt2.4S → result.val[1]
Vt.4S → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

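poly8x8x4_t vld1_p8_x4 (poly8_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.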

A64 Instruction

LD1 {Vt.8B - Vt4.8B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.8B → result.val[3]
Vt3.8B → result.val[2]
Vt2.8B → result.val[1]
Vt.8B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

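poly8x16x4_t vld1q_p8_x4 (poly8_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.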

A64 Instruction

LD1 {Vt.16B - Vt4.16B},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.16B → result.val[3]
Vt3.16B → result.val[2]
Vt2.16B → result.val[1]
Vt.16B → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

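poly16x4x4_t vld1_p16_x4 (poly16_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.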

A64 Instruction

LD1 {Vt.4H - Vt4.4H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.4H → result.val[3]
Vt3.4H → result.val[2]
Vt2.4H → result.val[1]
Vt.4H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

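poly16x8x4_t vld1q_p16_x4 (poly16_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.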

A64 Instruction

LD1 {Vt.8H - Vt4.8H},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.8H → result.val[3]
Vt3.8H → result.val[2]
Vt2.8H → result.val[1]
Vt.8H → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

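int64x1x4_t vld1_s64_x4 (int64_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.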

A64 Instruction

LD1 {Vt.1D - Vt4.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.1D → result.val[3]
Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64
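
The 1D arrangement means each of the four result registers holds a single 64-bit lane, so the load simply takes four consecutive int64_t values. The portable C sketch below models that mapping; model_s64x1x4 and model_vld1_s64_x4 are hypothetical names used only for illustration and are not part of <arm_neon.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for int64x1x4_t: four one-lane registers. */
typedef struct {
    int64_t val[4][1];
} model_s64x1x4;

/* Scalar model of vld1_s64_x4: val[k][0] == ptr[k] for k = 0..3. */
static model_s64x1x4 model_vld1_s64_x4(const int64_t *ptr)
{
    model_s64x1x4 result;
    assert(ptr != NULL); /* the caller must supply four readable elements */
    for (size_t k = 0; k < 4; ++k) {
        result.val[k][0] = ptr[k];
    }
    return result;
}
```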

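uint64x1x4_t vld1_u64_x4 (uint64_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.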

A64 Instruction

LD1 {Vt.1D - Vt4.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.1D → result.val[3]
Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

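poly64x1x4_t vld1_p64_x4 (poly64_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.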

A64 Instruction

LD1 {Vt.1D - Vt4.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.1D → result.val[3]
Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

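int64x2x4_t vld1q_s64_x4 (int64_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.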

A64 Instruction

LD1 {Vt.2D - Vt4.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.2D → result.val[3]
Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

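uint64x2x4_t vld1q_u64_x4 (uint64_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.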

A64 Instruction

LD1 {Vt.2D - Vt4.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.2D → result.val[3]
Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

v7/A32/A64

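poly64x2x4_t vld1q_p64_x4 (poly64_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.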

A64 Instruction

LD1 {Vt.2D - Vt4.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.2D → result.val[3]
Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A32/A64

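float64x1x4_t vld1_f64_x4 (float64_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.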

A64 Instruction

LD1 {Vt.1D - Vt4.1D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.1D → result.val[3]
Vt3.1D → result.val[2]
Vt2.1D → result.val[1]
Vt.1D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

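float64x2x4_t vld1q_f64_x4 (float64_t const * ptr)Load multiple single-element structures to one, two, three, or four registers

Description

Load multiple single-element structures to one, two, three, or four registers. This instruction loads multiple single-element structures from memory and writes the result to one, two, three, or four SIMD&FP registers.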

A64 Instruction

LD1 {Vt.2D - Vt4.2D},[Xn]

Argument Preparation

ptr → Xn

Results

Vt4.2D → result.val[3]
Vt3.2D → result.val[2]
Vt2.2D → result.val[1]
Vt.2D → result.val[0]

Operation

if HaveMTEExt() then
    SetNotTagCheckedInstruction(!wback && n == 31);

CheckFPAdvSIMDEnabled64();

bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;

if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;

Supported architectures

A64

int8x8_t vpadd_s8 (int8x8_t a, int8x8_t b)Add pairwise

Description

Add Pairwise (vector). This instruction creates a vector by concatenating the vector elements of the first source SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each pair of adjacent vector elements from the concatenated vector, adds each pair of values together, places the result into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

ADDP Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[concat, 2*e, esize];
    element2 = Elem[concat, (2*e)+1, esize];
    Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64
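
The Operation pseudocode above can be mirrored in portable C to see which lanes end up where. The sketch below is a scalar model only: model_s8x8, wrap_add_s8 and model_vpadd_s8 are hypothetical names for this example, and production code would call vpadd_s8 from <arm_neon.h> instead. It assumes, as the pseudocode states, that operand1 (a) forms the low half of the concatenated vector and that element addition keeps only the low 8 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for int8x8_t. */
typedef struct {
    int8_t lane[8];
} model_s8x8;

/* Add two int8_t values and keep the low 8 bits, matching the
   fixed-width element addition in the pseudocode. */
static int8_t wrap_add_s8(int8_t x, int8_t y)
{
    int sum = (int)x + (int)y; /* range -256..254, cannot overflow int */
    if (sum > INT8_MAX) {
        sum -= 256;
    } else if (sum < INT8_MIN) {
        sum += 256;
    }
    assert(sum >= INT8_MIN && sum <= INT8_MAX);
    return (int8_t)sum;
}

/* Scalar model of vpadd_s8: lanes 0..3 come from pairs of a,
   lanes 4..7 come from pairs of b. */
static model_s8x8 model_vpadd_s8(model_s8x8 a, model_s8x8 b)
{
    int8_t concat[16];
    model_s8x8 result;
    for (int i = 0; i < 8; ++i) {
        concat[i] = a.lane[i];     /* operand1 occupies the low half */
        concat[8 + i] = b.lane[i]; /* operand2 occupies the high half */
    }
    for (int e = 0; e < 8; ++e) {
        result.lane[e] = wrap_add_s8(concat[2 * e], concat[2 * e + 1]);
    }
    return result;
}
```

With a = {1, 2, 3, 4, 5, 6, 7, 8} and b = {10, 20, 30, 40, 50, 60, 70, 127}, this model yields {3, 7, 11, 15, 30, 70, 110, -59}; the last lane shows the wrap-around of 70 + 127.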

int16x4_t vpadd_s16 (int16x4_t a, int16x4_t b)Add pairwise

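Description

Add Pairwise (vector). This instruction creates a vector by concatenating the vector elements of the first source SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each pair of adjacent vector elements from the concatenated vector, adds each pair of values together, places the result into a vector, and writes the vector to the destination SIMD&FP register.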

A64 Instruction

ADDP Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[concat, 2*e, esize];
    element2 = Elem[concat, (2*e)+1, esize];
    Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vpadd_s32 (int32x2_t a, int32x2_t b)Add pairwise

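Description

Add Pairwise (vector). This instruction creates a vector by concatenating the vector elements of the first source SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each pair of adjacent vector elements from the concatenated vector, adds each pair of values together, places the result into a vector, and writes the vector to the destination SIMD&FP register.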

A64 Instruction

ADDP Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[concat, 2*e, esize];
    element2 = Elem[concat, (2*e)+1, esize];
    Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vpadd_u8 (uint8x8_t a, uint8x8_t b)Add pairwise

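Description

Add Pairwise (vector). This instruction creates a vector by concatenating the vector elements of the first source SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each pair of adjacent vector elements from the concatenated vector, adds each pair of values together, places the result into a vector, and writes the vector to the destination SIMD&FP register.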

A64 Instruction

ADDP Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[concat, 2*e, esize];
    element2 = Elem[concat, (2*e)+1, esize];
    Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vpadd_u16 (uint16x4_t a, uint16x4_t b)Add pairwise

Description

A64 Instruction

ADDP Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[concat, 2*e, esize];
    element2 = Elem[concat, (2*e)+1, esize];
    Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vpadd_u32 (uint32x2_t a, uint32x2_t b)Add pairwise

Description

A64 Instruction

ADDP Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[concat, 2*e, esize];
    element2 = Elem[concat, (2*e)+1, esize];
    Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vpadd_f32 (float32x2_t a, float32x2_t b)Floating-point add pairwise

Description

Floating-point Add Pairwise (vector). This instruction creates a vector by concatenating the vector elements of the first source SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each pair of adjacent vector elements from the concatenated vector, adds each pair of values together, places the result into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are floating-point values.

A64 Instruction

FADDP Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPAdd(element1, element2, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64
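
Example

As a rough illustration of the pairwise form of FADDP, the sketch below models the 2S arrangement in scalar C and then uses two pairwise steps to sum four floats. The names model_vpadd_f32 and model_hsum4_f32 are hypothetical. The model assumes default FPCR behaviour (round to nearest, no flush-to-zero) and does not reproduce NaN payload handling.

```c
/* Hypothetical scalar model of vpadd_f32 (FADDP Vd.2S, Vn.2S, Vm.2S).
   Assumes default FPCR settings; not an arm_neon.h function. */
void model_vpadd_f32(const float a[2], const float b[2], float out[2])
{
    float lo = a[0] + a[1];  /* pair taken from operand1 (a) */
    float hi = b[0] + b[1];  /* pair taken from operand2 (b) */

    /* Both sums are computed before writing, so out may alias a or b. */
    out[0] = lo;
    out[1] = hi;
}

/* Horizontal sum of four floats built from two pairwise steps. */
float model_hsum4_f32(const float v[4])
{
    float t[2];

    model_vpadd_f32(v, v + 2, t);  /* t = { v0 + v1, v2 + v3 }    */
    model_vpadd_f32(t, t, t);      /* t[0] = (v0 + v1) + (v2 + v3) */
    return t[0];
}
```

The association order is fixed by the pairing, so the result can differ in the last bit from a left-to-right scalar sum when the inputs are not exactly representable.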

int8x16_t vpaddq_s8 (int8x16_t a, int8x16_t b)Add pairwise

Description

A64 Instruction

ADDP Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[concat, 2*e, esize];
    element2 = Elem[concat, (2*e)+1, esize];
    Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

A64

int16x8_t vpaddq_s16 (int16x8_t a, int16x8_t b)Add pairwise

Description

A64 Instruction

ADDP Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[concat, 2*e, esize];
    element2 = Elem[concat, (2*e)+1, esize];
    Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

A64

int32x4_t vpaddq_s32 (int32x4_t a, int32x4_t b)Add pairwise

Description

A64 Instruction

ADDP Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[concat, 2*e, esize];
    element2 = Elem[concat, (2*e)+1, esize];
    Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

A64
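
Example

The 128-bit pairwise add is listed as A64 only. One common way to obtain the same lanes on a target that offers only the 64-bit form is to apply the 64-bit pairwise add to the halves of each operand. The sketch below checks that identity with scalar models. All function names are hypothetical, and the split into halves stands in for what vget_low_s32, vget_high_s32 and vcombine_s32 would do in real intrinsic code.

```c
#include <stdint.h>

/* Hypothetical model of the 64-bit form, vpadd_s32 (ADDP Vd.2S). */
static void model_vpadd_s32(const int32_t a[2], const int32_t b[2],
                            int32_t out[2])
{
    /* Unsigned arithmetic gives the modular wrap without signed overflow. */
    int32_t lo = (int32_t)((uint32_t)a[0] + (uint32_t)a[1]);
    int32_t hi = (int32_t)((uint32_t)b[0] + (uint32_t)b[1]);
    out[0] = lo;
    out[1] = hi;
}

/* Direct model of vpaddq_s32 (ADDP Vd.4S, Vn.4S, Vm.4S). */
void model_vpaddq_s32(const int32_t a[4], const int32_t b[4], int32_t out[4])
{
    int32_t r[4];
    r[0] = (int32_t)((uint32_t)a[0] + (uint32_t)a[1]);
    r[1] = (int32_t)((uint32_t)a[2] + (uint32_t)a[3]);
    r[2] = (int32_t)((uint32_t)b[0] + (uint32_t)b[1]);
    r[3] = (int32_t)((uint32_t)b[2] + (uint32_t)b[3]);
    for (int i = 0; i < 4; i++)
        out[i] = r[i];
}

/* Same lanes built from two 64-bit pairwise adds on the operand halves. */
void model_vpaddq_s32_from_halves(const int32_t a[4], const int32_t b[4],
                                  int32_t out[4])
{
    int32_t lo[2], hi[2];

    model_vpadd_s32(a, a + 2, lo);  /* low(a), high(a) -> result lanes 0..1 */
    model_vpadd_s32(b, b + 2, hi);  /* low(b), high(b) -> result lanes 2..3 */
    out[0] = lo[0];
    out[1] = lo[1];
    out[2] = hi[0];
    out[3] = hi[1];
}
```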

int64x2_t vpaddq_s64 (int64x2_t a, int64x2_t b)Add pairwise

Description

A64 Instruction

ADDP Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[concat, 2*e, esize];
    element2 = Elem[concat, (2*e)+1, esize];
    Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

A64

uint8x16_t vpaddq_u8 (uint8x16_t a, uint8x16_t b)Add pairwise

Description

A64 Instruction

ADDP Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[concat, 2*e, esize];
    element2 = Elem[concat, (2*e)+1, esize];
    Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

A64

uint16x8_t vpaddq_u16 (uint16x8_t a, uint16x8_t b)Add pairwise

Description

A64 Instruction

ADDP Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[concat, 2*e, esize];
    element2 = Elem[concat, (2*e)+1, esize];
    Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

A64

uint32x4_t vpaddq_u32 (uint32x4_t a, uint32x4_t b)Add pairwise

Description

A64 Instruction

ADDP Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[concat, 2*e, esize];
    element2 = Elem[concat, (2*e)+1, esize];
    Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

A64

uint64x2_t vpaddq_u64 (uint64x2_t a, uint64x2_t b)Add pairwise

Description

A64 Instruction

ADDP Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[concat, 2*e, esize];
    element2 = Elem[concat, (2*e)+1, esize];
    Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

A64

float32x4_t vpaddq_f32 (float32x4_t a, float32x4_t b)Floating-point add pairwise

Description

A64 Instruction

FADDP Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPAdd(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vpaddq_f64 (float64x2_t a, float64x2_t b)Floating-point add pairwise

Description

A64 Instruction

FADDP Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPAdd(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

int16x4_t vpaddl_s8 (int8x8_t a)Signed add long pairwise

Description

Signed Add Long Pairwise. This instruction adds pairs of adjacent signed integer values from the vector in the source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements.

A64 Instruction

SADDLP Vd.4H,Vn.8B

Argument Preparation

a → Vn.8B

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64
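
Example

To make the widening explicit, here is a small scalar sketch of the 8B to 4H case. The name model_vpaddl_s8 is hypothetical. Each source lane is sign-extended before the addition, which corresponds to the pseudocode with unsigned set to FALSE for SADDLP.

```c
#include <stdint.h>

/* Hypothetical scalar model of vpaddl_s8 (SADDLP Vd.4H, Vn.8B). */
void model_vpaddl_s8(const int8_t a[8], int16_t out[4])
{
    for (int e = 0; e < 4; e++) {
        /* Sign-extend both lanes first. The sum lies in [-256, 254],
           so it always fits in a 16-bit destination lane. */
        out[e] = (int16_t)((int16_t)a[2 * e] + (int16_t)a[2 * e + 1]);
    }
}
```

Because the destination lanes are twice as wide, no pair sum can overflow, unlike the same-width pairwise add.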

int16x8_t vpaddlq_s8 (int8x16_t a)Signed add long pairwise

Description

A64 Instruction

SADDLP Vd.8H,Vn.16B

Argument Preparation

a → Vn.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vpaddl_s16 (int16x4_t a)Signed add long pairwise

Description

A64 Instruction

SADDLP Vd.2S,Vn.4H

Argument Preparation

a → Vn.4H

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vpaddlq_s16 (int16x8_t a)Signed add long pairwise

Description

A64 Instruction

SADDLP Vd.4S,Vn.8H

Argument Preparation

a → Vn.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vpaddl_s32 (int32x2_t a)Signed add long pairwise

Description

A64 Instruction

SADDLP Vd.1D,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vpaddlq_s32 (int32x4_t a)Signed add long pairwise

Description

A64 Instruction

SADDLP Vd.2D,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vpaddl_u8 (uint8x8_t a)Unsigned add long pairwise

Description

Unsigned Add Long Pairwise. This instruction adds pairs of adjacent unsigned integer values from the vector in the source SIMD&FP register, places the result into a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements.

A64 Instruction

UADDLP Vd.4H,Vn.8B

Argument Preparation

a → Vn.8B

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64
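
Example

A typical use of the unsigned long pairwise add is to reduce a byte vector to a single sum by chaining the 8B, 4H and 2S forms. The sketch below models that chain in scalar C. Every function name is hypothetical, and the chain is one possible reduction rather than the only one.

```c
#include <stdint.h>

/* Hypothetical model of vpaddl_u8 (UADDLP Vd.4H, Vn.8B). */
void model_vpaddl_u8(const uint8_t a[8], uint16_t out[4])
{
    for (int e = 0; e < 4; e++)
        out[e] = (uint16_t)(a[2 * e] + a[2 * e + 1]);  /* at most 510 */
}

/* Hypothetical model of vpaddl_u16 (UADDLP Vd.2S, Vn.4H). */
void model_vpaddl_u16(const uint16_t a[4], uint32_t out[2])
{
    for (int e = 0; e < 2; e++)
        out[e] = (uint32_t)a[2 * e] + (uint32_t)a[2 * e + 1];
}

/* Hypothetical model of vpaddl_u32 (UADDLP Vd.1D, Vn.2S). */
uint64_t model_vpaddl_u32(const uint32_t a[2])
{
    return (uint64_t)a[0] + (uint64_t)a[1];
}

/* Sum of eight bytes via three widening pairwise steps. */
uint64_t model_sum_u8x8(const uint8_t v[8])
{
    uint16_t h[4];
    uint32_t s[2];

    model_vpaddl_u8(v, h);       /* 8 x u8  -> 4 x u16 */
    model_vpaddl_u16(h, s);      /* 4 x u16 -> 2 x u32 */
    return model_vpaddl_u32(s);  /* 2 x u32 -> 1 x u64 */
}
```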

uint16x8_t vpaddlq_u8 (uint8x16_t a)Unsigned add long pairwise

Description

A64 Instruction

UADDLP Vd.8H,Vn.16B

Argument Preparation

a → Vn.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vpaddl_u16 (uint16x4_t a)Unsigned add long pairwise

Description

A64 Instruction

UADDLP Vd.2S,Vn.4H

Argument Preparation

a → Vn.4H

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vpaddlq_u16 (uint16x8_t a)Unsigned add long pairwise

Description

A64 Instruction

UADDLP Vd.4S,Vn.8H

Argument Preparation

a → Vn.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vpaddl_u32 (uint32x2_t a)Unsigned add long pairwise

Description

A64 Instruction

UADDLP Vd.1D,Vn.2S

Argument Preparation

a → Vn.2S

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vpaddlq_u32 (uint32x4_t a)Unsigned add long pairwise

Description

A64 Instruction

UADDLP Vd.2D,Vn.4S

Argument Preparation

a → Vn.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vpadal_s8 (int16x4_t a, int8x8_t b)Signed add and accumulate long pairwise

Description

Signed Add and Accumulate Long Pairwise. This instruction adds pairs of adjacent signed integer values from the vector in the source SIMD&FP register and accumulates the results into the vector elements of the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements.

A64 Instruction

SADALP Vd.4H,Vn.8B

Argument Preparation

a → Vd.4H 

b → Vn.8B

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64
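
Example

The accumulate form differs from the plain long pairwise add only in that the widened pair sums are added to the existing destination lanes. The scalar sketch below shows this for the 8B to 4H case. The name model_vpadal_s8 is hypothetical, and unlike the intrinsic, which returns a new vector, the model updates the accumulator in place. It assumes two's-complement narrowing for the final cast.

```c
#include <stdint.h>

/* Hypothetical scalar model of vpadal_s8 (SADALP Vd.4H, Vn.8B).
   acc plays the role of argument a (and of Vd), b is the source vector. */
void model_vpadal_s8(int16_t acc[4], const int8_t b[8])
{
    for (int e = 0; e < 4; e++) {
        /* Widened pair sum, in [-256, 254]. */
        int32_t pair = (int32_t)b[2 * e] + (int32_t)b[2 * e + 1];

        /* Accumulate modulo 2^16, as the pseudocode does on bit vectors. */
        uint16_t wrapped = (uint16_t)((uint16_t)acc[e] + (uint16_t)pair);
        acc[e] = (int16_t)wrapped;
    }
}
```

The pair sum itself cannot overflow, but the accumulation can: a lane that already holds 32767 wraps to -32768 when 1 is added.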

int16x8_t vpadalq_s8 (int16x8_t a, int8x16_t b)Signed add and accumulate long pairwise

Description

A64 Instruction

SADALP Vd.8H,Vn.16B

Argument Preparation

a → Vd.8H 

b → Vn.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vpadal_s16 (int32x2_t a, int16x4_t b)Signed add and accumulate long pairwise

Description

A64 Instruction

SADALP Vd.2S,Vn.4H

Argument Preparation

a → Vd.2S 

b → Vn.4H

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vpadalq_s16 (int32x4_t a, int16x8_t b)Signed add and accumulate long pairwise

Description

A64 Instruction

SADALP Vd.4S,Vn.8H

Argument Preparation

a → Vd.4S 

b → Vn.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vpadal_s32 (int64x1_t a, int32x2_t b)Signed add and accumulate long pairwise

Description

A64 Instruction

SADALP Vd.1D,Vn.2S

Argument Preparation

a → Vd.1D 

b → Vn.2S

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vpadalq_s32 (int64x2_t a, int32x4_t b)Signed add and accumulate long pairwise

Description

A64 Instruction

SADALP Vd.2D,Vn.4S

Argument Preparation

a → Vd.2D 

b → Vn.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vpadal_u8 (uint16x4_t a, uint8x8_t b)Unsigned add and accumulate long pairwise

Description

Unsigned Add and Accumulate Long Pairwise. This instruction adds pairs of adjacent unsigned integer values from the vector in the source SIMD&FP register and accumulates the results into the vector elements of the destination SIMD&FP register. The destination vector elements are twice as long as the source vector elements.

A64 Instruction

UADALP Vd.4H,Vn.8B

Argument Preparation

a → Vd.4H 

b → Vn.8B

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64
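
Example

When the unsigned accumulate form is used to sum a long byte stream, the 16-bit accumulator lanes need to be drained before they can wrap. Each step adds at most 2 x 255 = 510 to a lane, and 128 x 510 = 65280 still fits in 16 bits, so draining every 128 steps is safe. The sketch below illustrates that bookkeeping with a scalar model. The names model_vpadal_u8 and sum_bytes_blocks, and the choice to process whole 8-byte blocks only, are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of vpadal_u8 (UADALP Vd.4H, Vn.8B). */
static void model_vpadal_u8(uint16_t acc[4], const uint8_t b[8])
{
    for (int e = 0; e < 4; e++)
        acc[e] = (uint16_t)(acc[e] + b[2 * e] + b[2 * e + 1]);
}

/* Sum nblocks * 8 bytes, draining the 16-bit lanes every 128 blocks
   so that no lane can exceed 128 * 510 = 65280. */
uint64_t sum_bytes_blocks(const uint8_t *data, size_t nblocks)
{
    uint64_t total = 0;
    uint16_t acc[4] = {0, 0, 0, 0};
    size_t pending = 0;  /* blocks accumulated since the last drain */

    for (size_t i = 0; i < nblocks; i++) {
        model_vpadal_u8(acc, data + 8 * i);
        if (++pending == 128) {
            for (int e = 0; e < 4; e++) {
                total += acc[e];
                acc[e] = 0;
            }
            pending = 0;
        }
    }

    /* Drain whatever is left in the lanes. */
    for (int e = 0; e < 4; e++)
        total += acc[e];
    return total;
}
```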

uint16x8_t vpadalq_u8 (uint16x8_t a, uint8x16_t b)Unsigned add and accumulate long pairwise

Description

A64 Instruction

UADALP Vd.8H,Vn.16B

Argument Preparation

a → Vd.8H 

b → Vn.16B

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vpadal_u16 (uint32x2_t a, uint16x4_t b)Unsigned add and accumulate long pairwise

Description

A64 Instruction

UADALP Vd.2S,Vn.4H

Argument Preparation

a → Vd.2S 

b → Vn.4H

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vpadalq_u16 (uint32x4_t a, uint16x8_t b)Unsigned add and accumulate long pairwise

Description

A64 Instruction

UADALP Vd.4S,Vn.8H

Argument Preparation

a → Vd.4S 

b → Vn.8H

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vpadal_u32 (uint64x1_t a, uint32x2_t b)Unsigned add and accumulate long pairwise

Description

A64 Instruction

UADALP Vd.1D,Vn.2S

Argument Preparation

a → Vd.1D 

b → Vn.2S

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vpadalq_u32 (uint64x2_t a, uint32x4_t b)Unsigned add and accumulate long pairwise

Description

A64 Instruction

UADALP Vd.2D,Vn.4S

Argument Preparation

a → Vd.2D 

b → Vn.4S

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

v7/A32/A64

int8x8_t vpmax_s8 (int8x8_t a, int8x8_t b)Signed maximum pairwise

Description

Signed Maximum Pairwise. This instruction creates a vector by concatenating the vector elements of the first source SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each pair of adjacent vector elements from the concatenated vector, writes the larger of each pair of signed integer values into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

SMAXP Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
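
Example

The following scalar sketch mirrors the pseudocode above for the signed 8B case, with the comparison done on sign-extended values. The name model_vpmax_s8 is hypothetical and not part of arm_neon.h.

```c
#include <stdint.h>

/* Hypothetical scalar model of vpmax_s8 (SMAXP Vd.8B, Vn.8B, Vm.8B). */
void model_vpmax_s8(const int8_t a[8], const int8_t b[8], int8_t out[8])
{
    int8_t concat[16];

    for (int i = 0; i < 8; i++) {
        concat[i] = a[i];      /* operand1 (a) fills the low half  */
        concat[8 + i] = b[i];  /* operand2 (b) fills the high half */
    }

    for (int e = 0; e < 8; e++) {
        int8_t x = concat[2 * e];
        int8_t y = concat[2 * e + 1];
        out[e] = (x > y) ? x : y;  /* signed comparison */
    }
}
```

As with the pairwise add, result lanes 0 to 3 come from pairs in a and lanes 4 to 7 from pairs in b.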

int16x4_t vpmax_s16 (int16x4_t a, int16x4_t b)Signed maximum pairwise

Description

A64 Instruction

SMAXP Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vpmax_s32 (int32x2_t a, int32x2_t b)Signed maximum pairwise

Description

A64 Instruction

SMAXP Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vpmax_u8 (uint8x8_t a, uint8x8_t b)Unsigned maximum pairwise

Description

Unsigned Maximum Pairwise. This instruction creates a vector by concatenating the vector elements of the first source SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each pair of adjacent vector elements from the concatenated vector, writes the larger of each pair of unsigned integer values into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

UMAXP Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
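
Example

Passing the same vector as both operands and repeating the pairwise maximum halves the number of distinct candidates each time, which gives a simple horizontal maximum. The sketch below models this for eight unsigned bytes in scalar C. The names model_vpmax_u8 and model_hmax_u8x8 are hypothetical, and the three-step reduction is one possible approach rather than a prescribed one.

```c
#include <stdint.h>

/* Hypothetical scalar model of vpmax_u8 (UMAXP Vd.8B, Vn.8B, Vm.8B).
   Inputs are copied first, so out may alias a or b. */
void model_vpmax_u8(const uint8_t a[8], const uint8_t b[8], uint8_t out[8])
{
    uint8_t concat[16];

    for (int i = 0; i < 8; i++) {
        concat[i] = a[i];      /* operand1 (a) fills the low half  */
        concat[8 + i] = b[i];  /* operand2 (b) fills the high half */
    }

    for (int e = 0; e < 8; e++) {
        uint8_t x = concat[2 * e];
        uint8_t y = concat[2 * e + 1];
        out[e] = (x > y) ? x : y;  /* unsigned comparison */
    }
}

/* Horizontal maximum of eight bytes using three pairwise steps. */
uint8_t model_hmax_u8x8(const uint8_t v[8])
{
    uint8_t t[8];

    model_vpmax_u8(v, v, t);  /* lanes 0..3: maxima of the four pairs   */
    model_vpmax_u8(t, t, t);  /* lanes 0..1: maxima of each group of 4  */
    model_vpmax_u8(t, t, t);  /* lane 0: maximum of all eight bytes     */
    return t[0];
}
```

The byte 0xFF compares as 255 here, whereas the signed form treats the same bit pattern as -1.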

uint16x4_t vpmax_u16 (uint16x4_t a, uint16x4_t b)Unsigned maximum pairwise

Description

A64 Instruction

UMAXP Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vpmax_u32 (uint32x2_t a, uint32x2_t b)Unsigned maximum pairwise

Description

A64 Instruction

UMAXP Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vpmax_f32 (float32x2_t a, float32x2_t b)Floating-point maximum pairwise

Description

Floating-point Maximum Pairwise (vector). This instruction creates a vector by concatenating the vector elements of the first source SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each pair of adjacent vector elements from the concatenated vector, writes the larger of each pair of values into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are floating-point values.

A64 Instruction

FMAXP Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMin(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMax(element1, element2, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64
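
The floating-point pairwise maximum described above can be sketched as a scalar C model. This is a simplified sketch under stated assumptions: fp_max_model and pmax_f32_model are hypothetical names, a NaN in either input of a pair is assumed to produce a NaN result, and signed-zero ordering and FPCR controls are not modelled.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Simplified stand-in for FPMax: a NaN input gives a NaN result.
 * Signed zeros and FPCR settings are deliberately ignored here. */
float fp_max_model(float x, float y)
{
    if (isnan(x) || isnan(y))
        return NAN;
    return (x > y) ? x : y;
}

/* Hypothetical scalar model of FMAXP Vd.2S (not an Arm API). */
void pmax_f32_model(const float a[2], const float b[2], float out[2])
{
    assert(a && b && out);

    /* concat = b:a, so the lanes of a come first. */
    float concat[4] = { a[0], a[1], b[0], b[1] };

    for (size_t e = 0; e < 2; e++)
        out[e] = fp_max_model(concat[2 * e], concat[2 * e + 1]);
}
```

Result lane 0 is therefore the maximum of the two lanes of a, and result lane 1 is the maximum of the two lanes of b.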

int8x16_t vpmaxq_s8 (int8x16_t a, int8x16_t b)Signed maximum pairwise

Description

A64 Instruction

SMAXP Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

A64

int16x8_t vpmaxq_s16 (int16x8_t a, int16x8_t b)Signed maximum pairwise

Description

A64 Instruction

SMAXP Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

A64

int32x4_t vpmaxq_s32 (int32x4_t a, int32x4_t b)Signed maximum pairwise

Description

A64 Instruction

SMAXP Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint8x16_t vpmaxq_u8 (uint8x16_t a, uint8x16_t b)Unsigned maximum pairwise

Description

A64 Instruction

UMAXP Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint16x8_t vpmaxq_u16 (uint16x8_t a, uint16x8_t b)Unsigned maximum pairwise

Description

A64 Instruction

UMAXP Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint32x4_t vpmaxq_u32 (uint32x4_t a, uint32x4_t b)Unsigned maximum pairwise

Description

A64 Instruction

UMAXP Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

A64

float32x4_t vpmaxq_f32 (float32x4_t a, float32x4_t b)Floating-point maximum pairwise

Description

A64 Instruction

FMAXP Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMin(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMax(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vpmaxq_f64 (float64x2_t a, float64x2_t b)Floating-point maximum pairwise

Description

A64 Instruction

FMAXP Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMin(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMax(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

int8x8_t vpmin_s8 (int8x8_t a, int8x8_t b)Signed minimum pairwise

Description

Signed Minimum Pairwise. This instruction creates a vector by concatenating the vector elements of the first source SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each pair of adjacent vector elements from the concatenated vector, writes the smaller of each pair of signed integer values into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

SMINP Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
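
For illustration, the signed pairwise minimum on eight 8-bit lanes can be modelled in plain C as follows. The function name pmin_s8_model is an assumption for this sketch and is not an Arm API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of SMINP Vd.8B (not an Arm API). */
void pmin_s8_model(const int8_t a[8], const int8_t b[8], int8_t out[8])
{
    assert(a && b && out);

    int8_t concat[16];
    for (size_t i = 0; i < 8; i++) {
        concat[i] = a[i];     /* first source fills the low half */
        concat[8 + i] = b[i]; /* second source fills the high half */
    }

    for (size_t e = 0; e < 8; e++) {
        int8_t x = concat[2 * e];
        int8_t y = concat[2 * e + 1];
        out[e] = (x < y) ? x : y; /* signed comparison */
    }
}
```

For a = {1, -1, 2, 3, -128, 127, 0, 0} and b = {5, 4, -3, -2, 10, 20, 7, -7}, the model yields {-1, 2, -128, 0, 4, -3, 10, -7}.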

int16x4_t vpmin_s16 (int16x4_t a, int16x4_t b)Signed minimum pairwise

Description

A64 Instruction

SMINP Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vpmin_s32 (int32x2_t a, int32x2_t b)Signed minimum pairwise

Description

A64 Instruction

SMINP Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vpmin_u8 (uint8x8_t a, uint8x8_t b)Unsigned minimum pairwise

Description

Unsigned Minimum Pairwise. This instruction creates a vector by concatenating the vector elements of the first source SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each pair of adjacent vector elements from the concatenated vector, writes the smaller of each pair of unsigned integer values into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

UMINP Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64
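
The only difference between the unsigned and signed pairwise minimum is how each lane's bits are interpreted before comparison. The sketch below shows this on a single pair of 8-bit lanes; the helper names as_signed8, umin_pair_u8 and smin_pair_bits are assumptions for illustration, not Arm APIs.

```c
#include <stdint.h>

/* Interpret an 8-bit pattern as a two's-complement signed value. */
int as_signed8(uint8_t bits)
{
    return (bits >= 0x80) ? (int)bits - 256 : (int)bits;
}

/* Unsigned view of one pair, as UMINP compares it. */
uint8_t umin_pair_u8(uint8_t x, uint8_t y)
{
    return (x < y) ? x : y;
}

/* Signed view of the same bit patterns, as SMINP compares them.
 * Returns the bit pattern of the smaller signed value. */
uint8_t smin_pair_bits(uint8_t x, uint8_t y)
{
    return (as_signed8(x) < as_signed8(y)) ? x : y;
}
```

For the pair (0xFF, 0x01), the unsigned view compares 255 with 1 and selects 0x01, while the signed view compares -1 with 1 and selects 0xFF.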

uint16x4_t vpmin_u16 (uint16x4_t a, uint16x4_t b)Unsigned minimum pairwise

Description

A64 Instruction

UMINP Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vpmin_u32 (uint32x2_t a, uint32x2_t b)Unsigned minimum pairwise

Description

A64 Instruction

UMINP Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vpmin_f32 (float32x2_t a, float32x2_t b)Floating-point minimum pairwise

Description

Floating-point Minimum Pairwise (vector). This instruction creates a vector by concatenating the vector elements of the first source SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each pair of adjacent vector elements from the concatenated vector, writes the smaller of each pair of values into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are floating-point values.

A64 Instruction

FMINP Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMin(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMax(element1, element2, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64

int8x16_t vpminq_s8 (int8x16_t a, int8x16_t b)Signed minimum pairwise

Description

A64 Instruction

SMINP Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

A64

int16x8_t vpminq_s16 (int16x8_t a, int16x8_t b)Signed minimum pairwise

Description

A64 Instruction

SMINP Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

A64

int32x4_t vpminq_s32 (int32x4_t a, int32x4_t b)Signed minimum pairwise

Description

A64 Instruction

SMINP Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint8x16_t vpminq_u8 (uint8x16_t a, uint8x16_t b)Unsigned minimum pairwise

Description

A64 Instruction

UMINP Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint16x8_t vpminq_u16 (uint16x8_t a, uint16x8_t b)Unsigned minimum pairwise

Description

A64 Instruction

UMINP Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint32x4_t vpminq_u32 (uint32x4_t a, uint32x4_t b)Unsigned minimum pairwise

Description

A64 Instruction

UMINP Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

A64

float32x4_t vpminq_f32 (float32x4_t a, float32x4_t b)Floating-point minimum pairwise

Description

A64 Instruction

FMINP Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMin(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMax(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64
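
One common use of a pairwise minimum is a horizontal reduction: passing the same vector as both operands and repeating the operation halves the number of distinct candidates each time. The sketch below models that idea in portable C for four lanes. The names fp_min_model, pminq_f32_model and hmin_f32x4_model are hypothetical, and the NaN handling is a simplifying assumption (any NaN input gives NaN; signed zeros and FPCR controls are not modelled).

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Simplified stand-in for FPMin: a NaN input gives a NaN result. */
float fp_min_model(float x, float y)
{
    if (isnan(x) || isnan(y))
        return NAN;
    return (x < y) ? x : y;
}

/* Hypothetical scalar model of FMINP Vd.4S (not an Arm API). */
void pminq_f32_model(const float a[4], const float b[4], float out[4])
{
    assert(a && b && out);

    float concat[8];
    for (size_t i = 0; i < 4; i++) {
        concat[i] = a[i];     /* first source: low half */
        concat[4 + i] = b[i]; /* second source: high half */
    }
    for (size_t e = 0; e < 4; e++)
        out[e] = fp_min_model(concat[2 * e], concat[2 * e + 1]);
}

/* Minimum of all four lanes using two pairwise passes. */
float hmin_f32x4_model(const float v[4])
{
    float t[4], u[4];
    pminq_f32_model(v, v, t); /* t = {min(v0,v1), min(v2,v3), min(v0,v1), min(v2,v3)} */
    pminq_f32_model(t, t, u); /* u[0] = min(v0, v1, v2, v3) */
    return u[0];
}
```

Two passes suffice for four lanes because each pass reduces every adjacent pair to one value.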

float64x2_t vpminq_f64 (float64x2_t a, float64x2_t b)Floating-point minimum pairwise

Description

A64 Instruction

FMINP Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMin(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMax(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32x2_t vpmaxnm_f32 (float32x2_t a, float32x2_t b)Floating-point maximum number pairwise

Description

Floating-point Maximum Number Pairwise (vector). This instruction creates a vector by concatenating the vector elements of the first source SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each pair of adjacent vector elements from the concatenated vector, writes the larger of each pair of values into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are floating-point values. If one element of a pair is a quiet NaN and the other is numeric, the result is the numeric value.

A64 Instruction

FMAXNMP Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64
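
The practical difference between the "maximum" and "maximum number" forms is how a quiet NaN in one element of a pair is treated. The following sketch contrasts the two on a single pair. The function names are hypothetical, and the models are simplified assumptions: signalling NaNs, signed zeros and FPCR controls are not modelled.

```c
#include <math.h>

/* FPMax-style selection: any NaN input yields a NaN result. */
float fmaxp_pair_model(float x, float y)
{
    if (isnan(x) || isnan(y))
        return NAN;
    return (x > y) ? x : y;
}

/* FPMaxNum-style selection: a quiet NaN paired with a number
 * yields the number; two NaNs still yield a NaN. */
float fmaxnmp_pair_model(float x, float y)
{
    if (isnan(x))
        return y;
    if (isnan(y))
        return x;
    return (x > y) ? x : y;
}
```

For purely numeric inputs both models agree, so the choice between the two forms only matters when NaNs can appear in the data.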

float32x4_t vpmaxnmq_f32 (float32x4_t a, float32x4_t b)Floating-point maximum number pairwise

Description

A64 Instruction

FMAXNMP Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vpmaxnmq_f64 (float64x2_t a, float64x2_t b)Floating-point maximum number pairwise

Description

A64 Instruction

FMAXNMP Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32x2_t vpminnm_f32 (float32x2_t a, float32x2_t b)Floating-point minimum number pairwise

Description

Floating-point Minimum Number Pairwise (vector). This instruction creates a vector by concatenating the vector elements of the first source SIMD&FP register after the vector elements of the second source SIMD&FP register, reads each pair of adjacent vector elements from the concatenated vector, writes the smaller of each pair of values into a vector, and writes the vector to the destination SIMD&FP register. All the values in this instruction are floating-point values. If one element of a pair is a quiet NaN and the other is numeric, the result is the numeric value.

A64 Instruction

FMINNMP Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32x4_t vpminnmq_f32 (float32x4_t a, float32x4_t b)Floating-point minimum number pairwise

Description

A64 Instruction

FMINNMP Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vpminnmq_f64 (float64x2_t a, float64x2_t b)Floating-point minimum number pairwise

Description

A64 Instruction

FMINNMP Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

int64_t vpaddd_s64 (int64x2_t a)Add pairwise

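Description

Add Pair of elements (scalar). This instruction adds the two vector elements in the source SIMD&FP register and writes the scalar result into the destination SIMD&FP register.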

A64 Instruction

ADDP Dd,Vn.2D

Argument Preparation

a → Vn.2D

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];

V[d] = Reduce(op, operand, esize);

Supported architectures

A64
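
As a sketch of what this scalar form computes, the two 64-bit lanes of the single source vector are added and the sum is returned as a scalar. The name paddd_s64_model is an assumption for illustration and is not an Arm API. The addition is done on unsigned values so that overflow wraps modulo 2^64 rather than invoking undefined behaviour in C.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of ADDP Dd, Vn.2D (not an Arm API). */
int64_t paddd_s64_model(const int64_t a[2])
{
    /* Unsigned arithmetic wraps modulo 2^64, like the hardware add. */
    uint64_t sum = (uint64_t)a[0] + (uint64_t)a[1];

    /* Copy the bit pattern back to a signed value without relying on
     * implementation-defined conversion. */
    int64_t result;
    memcpy(&result, &sum, sizeof result);
    return result;
}
```

For example, the lanes {5, -7} give -2, and {INT64_MAX, 1} wraps to INT64_MIN.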

uint64_t vpaddd_u64 (uint64x2_t a)Add pairwise

Description

A64 Instruction

ADDP Dd,Vn.2D

Argument Preparation

a → Vn.2D

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];

V[d] = Reduce(op, operand, esize);

Supported architectures

A64

float32_t vpadds_f32 (float32x2_t a)Floating-point add pairwise

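Description

Floating-point Add Pair of elements (scalar). This instruction adds the two floating-point vector elements in the source SIMD&FP register and writes the scalar result into the destination SIMD&FP register.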

A64 Instruction

FADDP Sd,Vn.2S

Argument Preparation

a → Vn.2S

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];

V[d] = Reduce(op, operand, esize);

Supported architectures

A64

float64_t vpaddd_f64 (float64x2_t a)Floating-point add pairwise

Description

A64 Instruction

FADDP Dd,Vn.2D

Argument Preparation

a → Vn.2D

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];

V[d] = Reduce(op, operand, esize);

Supported architectures

A64

float32_t vpmaxs_f32 (float32x2_t a)Floating-point maximum pairwise

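Description

Floating-point Maximum of Pair of elements (scalar). This instruction compares the two vector elements in the source SIMD&FP register and writes the larger of the floating-point values as a scalar to the destination SIMD&FP register.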

A64 Instruction

FMAXP Sd,Vn.2S

Argument Preparation

a → Vn.2S

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];

V[d] = Reduce(op, operand, esize);

Supported architectures

A64

float64_t vpmaxqd_f64 (float64x2_t a)Floating-point maximum pairwise

Description

A64 Instruction

FMAXP Dd,Vn.2D

Argument Preparation

a → Vn.2D

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];

V[d] = Reduce(op, operand, esize);

Supported architectures

A64

float32_t vpmins_f32 (float32x2_t a)Floating-point minimum pairwise

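Description

Floating-point Minimum of Pair of elements (scalar). This instruction compares the two vector elements in the source SIMD&FP register and writes the smaller of the floating-point values as a scalar to the destination SIMD&FP register.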

A64 Instruction

FMINP Sd,Vn.2S

Argument Preparation

a → Vn.2S

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];

V[d] = Reduce(op, operand, esize);

Supported architectures

A64

float64_t vpminqd_f64 (float64x2_t a)Floating-point minimum pairwise

Description

A64 Instruction

FMINP Dd,Vn.2D

Argument Preparation

a → Vn.2D

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];

V[d] = Reduce(op, operand, esize);

Supported architectures

A64

float32_t vpmaxnms_f32 (float32x2_t a)Floating-point maximum number pairwise

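Description

Floating-point Maximum Number of Pair of elements (scalar). This instruction compares the two vector elements in the source SIMD&FP register and writes the larger of the floating-point values as a scalar to the destination SIMD&FP register. If one element is numeric and the other is a quiet NaN, the result is the numeric value.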

A64 Instruction

FMAXNMP Sd,Vn.2S

Argument Preparation

a → Vn.2S

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64
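
The difference between FMAXNMP and FMAXP lies in NaN handling: FPMaxNum returns the numeric operand when the other operand is a quiet NaN. The sketch below illustrates this for two lanes. The names fpmaxnum_model and pairwise_maxnm_f32 are hypothetical, and the model covers quiet NaNs only; signalling NaNs and FPCR settings are left out as a simplifying assumption.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Simplified model of FPMaxNum for quiet NaNs only: a single quiet NaN
   operand is ignored and the numeric operand is returned. */
float fpmaxnum_model(float x, float y) {
    if (isnan(x)) return y;                 /* also covers both-NaN: result is NaN */
    if (isnan(y)) return x;
    if (x == 0.0f && y == 0.0f) return signbit(x) ? y : x;  /* +0.0 beats -0.0 */
    return x > y ? x : y;
}

/* Hypothetical wrapper: "maximum number" of the two lanes at lanes[0..1]. */
float pairwise_maxnm_f32(const float lanes[2]) {
    assert(lanes != NULL);
#if defined(__aarch64__)
    return vpmaxnms_f32(vld1_f32(lanes));   /* FMAXNMP Sd,Vn.2S */
#else
    return fpmaxnum_model(lanes[0], lanes[1]);
#endif
}
```

With one quiet NaN lane the numeric lane is returned; only when both lanes are NaN is the result NaN.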

float64_t vpmaxnmqd_f64 (float64x2_t a)Floating-point maximum number pairwise

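Description

Floating-point Maximum Number of Pair of elements (scalar). This instruction compares the two vector elements in the source SIMD&FP register and writes the larger of the floating-point values as a scalar to the destination SIMD&FP register. If one element is numeric and the other is a quiet NaN, the result is the numeric value.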

A64 Instruction

FMAXNMP Dd,Vn.2D

Argument Preparation

a → Vn.2D

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32_t vpminnms_f32 (float32x2_t a)Floating-point minimum number pairwise

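Description

Floating-point Minimum Number of Pair of elements (scalar). This instruction compares the two vector elements in the source SIMD&FP register and writes the smaller of the floating-point values as a scalar to the destination SIMD&FP register. If one element is numeric and the other is a quiet NaN, the result is the numeric value.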

A64 Instruction

FMINNMP Sd,Vn.2S

Argument Preparation

a → Vn.2S

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64_t vpminnmqd_f64 (float64x2_t a)Floating-point minimum number pairwise

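Description

Floating-point Minimum Number of Pair of elements (scalar). This instruction compares the two vector elements in the source SIMD&FP register and writes the smaller of the floating-point values as a scalar to the destination SIMD&FP register. If one element is numeric and the other is a quiet NaN, the result is the numeric value.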

A64 Instruction

FMINNMP Dd,Vn.2D

Argument Preparation

a → Vn.2D

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

int8_t vaddv_s8 (int8x8_t a)Add across vector

Description

Add across Vector. This instruction adds every vector element in the source SIMD&FP register together, and writes the scalar result to the destination SIMD&FP register.

A64 Instruction

ADDV Bd,Vn.8B

Argument Preparation

a → Vn.8B

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
V[d] = Reduce(ReduceOp_ADD, operand, esize);

Supported architectures

A64
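
One consequence of the operation above is that the result keeps the element width: the sum of eight 8-bit lanes is truncated to 8 bits. The following sketch models that wraparound in portable C. The names addv_s8_model and sum_lanes_s8 are hypothetical helpers for this example; on AArch64 builds the wrapper calls vaddv_s8 directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Model of Reduce(ReduceOp_ADD, operand, 8) over eight lanes: the sum is
   kept to esize = 8 bits, so it wraps modulo 256. */
int8_t addv_s8_model(const int8_t lanes[8]) {
    unsigned sum = 0;
    for (int e = 0; e < 8; e++)
        sum = (sum + (uint8_t)lanes[e]) & 0xFFu;
    /* Reinterpret the 8-bit pattern as signed without relying on
       implementation-defined narrowing conversions. */
    return (int8_t)(sum >= 128u ? (int)sum - 256 : (int)sum);
}

/* Hypothetical wrapper: add the eight lanes stored at lanes[0..7]. */
int8_t sum_lanes_s8(const int8_t lanes[8]) {
    assert(lanes != NULL);
#if defined(__aarch64__)
    return vaddv_s8(vld1_s8(lanes));   /* ADDV Bd,Vn.8B */
#else
    return addv_s8_model(lanes);
#endif
}
```

For example, three lanes of 100 and five lanes of 0 add up to 300, which is returned as 44 (300 modulo 256). When the full sum is needed, the widening vaddlv_s8 intrinsic is the more suitable choice.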

int8_t vaddvq_s8 (int8x16_t a)Add across vector

Description

Add across Vector. This instruction adds every vector element in the source SIMD&FP register together, and writes the scalar result to the destination SIMD&FP register.

A64 Instruction

ADDV Bd,Vn.16B

Argument Preparation

a → Vn.16B

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
V[d] = Reduce(ReduceOp_ADD, operand, esize);

Supported architectures

A64

int16_t vaddv_s16 (int16x4_t a)Add across vector

Description

Add across Vector. This instruction adds every vector element in the source SIMD&FP register together, and writes the scalar result to the destination SIMD&FP register.

A64 Instruction

ADDV Hd,Vn.4H

Argument Preparation

a → Vn.4H

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
V[d] = Reduce(ReduceOp_ADD, operand, esize);

Supported architectures

A64

int16_t vaddvq_s16 (int16x8_t a)Add across vector

Description

Add across Vector. This instruction adds every vector element in the source SIMD&FP register together, and writes the scalar result to the destination SIMD&FP register.

A64 Instruction

ADDV Hd,Vn.8H

Argument Preparation

a → Vn.8H

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
V[d] = Reduce(ReduceOp_ADD, operand, esize);

Supported architectures

A64

int32_t vaddv_s32 (int32x2_t a)Add pairwise

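Description

Add Pairwise (vector). This instruction concatenates the vector elements of the two source SIMD&FP registers, adds each pair of adjacent elements, places the results into a vector, and writes the vector to the destination SIMD&FP register. Because a is supplied as both source operands, element 0 of the destination holds the sum of the two elements of a.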

A64 Instruction

ADDP  Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

a → Vm.2S

Results

Vd.S[0] → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[concat, 2*e, esize];
    element2 = Elem[concat, (2*e)+1, esize];
    Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

A64
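
Because there is no ADDV form for two 32-bit lanes, this intrinsic maps to ADDP with a passed as both operands. The sketch below follows the pseudocode above to show why lane 0 of the result is the sum of both lanes. The names addp_2s_model and sum_pair_s32 are hypothetical; on AArch64 builds the wrapper simply calls vaddv_s32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Model of ADDP Vd.2S,Vn.2S,Vm.2S following the pseudocode: concat holds
   operand1 (Vn) in its low half and operand2 (Vm) in its high half. */
void addp_2s_model(const int32_t n[2], const int32_t m[2], int32_t d[2]) {
    uint32_t concat[4] = { (uint32_t)n[0], (uint32_t)n[1],
                           (uint32_t)m[0], (uint32_t)m[1] };
    for (int e = 0; e < 2; e++) {
        /* Element-width addition wraps modulo 2^32. */
        uint32_t sum = (uint32_t)(concat[2 * e] + concat[2 * e + 1]);
        memcpy(&d[e], &sum, sizeof sum);
    }
}

/* Hypothetical wrapper: add the two lanes stored at lanes[0..1]. */
int32_t sum_pair_s32(const int32_t lanes[2]) {
    assert(lanes != NULL);
#if defined(__aarch64__)
    return vaddv_s32(vld1_s32(lanes));
#else
    int32_t d[2];
    addp_2s_model(lanes, lanes, d);   /* a is used for both Vn and Vm */
    return d[0];                      /* Vd.S[0] */
#endif
}
```

In the model, d[1] holds the same sum as d[0] because both halves of the concatenation are a; only element 0 is returned.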

int32_t vaddvq_s32 (int32x4_t a)Add across vector

Description

Add across Vector. This instruction adds every vector element in the source SIMD&FP register together, and writes the scalar result to the destination SIMD&FP register.

A64 Instruction

ADDV Sd,Vn.4S

Argument Preparation

a → Vn.4S

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
V[d] = Reduce(ReduceOp_ADD, operand, esize);

Supported architectures

A64

int64_t vaddvq_s64 (int64x2_t a)Add pairwise

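Description

Add Pair of elements (scalar). This instruction adds the two vector elements in the source SIMD&FP register and writes the scalar result to the destination SIMD&FP register.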

A64 Instruction

ADDP Dd,Vn.2D

Argument Preparation

a → Vn.2D

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[concat, 2*e, esize];
    element2 = Elem[concat, (2*e)+1, esize];
    Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

A64

uint8_t vaddv_u8 (uint8x8_t a)Add across vector

Description

Add across Vector. This instruction adds every vector element in the source SIMD&FP register together, and writes the scalar result to the destination SIMD&FP register.

A64 Instruction

ADDV Bd,Vn.8B

Argument Preparation

a → Vn.8B

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
V[d] = Reduce(ReduceOp_ADD, operand, esize);

Supported architectures

A64

uint8_t vaddvq_u8 (uint8x16_t a)Add across vector

Description

Add across Vector. This instruction adds every vector element in the source SIMD&FP register together, and writes the scalar result to the destination SIMD&FP register.

A64 Instruction

ADDV Bd,Vn.16B

Argument Preparation

a → Vn.16B

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
V[d] = Reduce(ReduceOp_ADD, operand, esize);

Supported architectures

A64

uint16_t vaddv_u16 (uint16x4_t a)Add across vector

Description

Add across Vector. This instruction adds every vector element in the source SIMD&FP register together, and writes the scalar result to the destination SIMD&FP register.

A64 Instruction

ADDV Hd,Vn.4H

Argument Preparation

a → Vn.4H

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
V[d] = Reduce(ReduceOp_ADD, operand, esize);

Supported architectures

A64

uint16_t vaddvq_u16 (uint16x8_t a)Add across vector

Description

Add across Vector. This instruction adds every vector element in the source SIMD&FP register together, and writes the scalar result to the destination SIMD&FP register.

A64 Instruction

ADDV Hd,Vn.8H

Argument Preparation

a → Vn.8H

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
V[d] = Reduce(ReduceOp_ADD, operand, esize);

Supported architectures

A64

uint32_t vaddv_u32 (uint32x2_t a)Add pairwise

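Description

Add Pairwise (vector). This instruction concatenates the vector elements of the two source SIMD&FP registers, adds each pair of adjacent elements, places the results into a vector, and writes the vector to the destination SIMD&FP register. Because a is supplied as both source operands, element 0 of the destination holds the sum of the two elements of a.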

A64 Instruction

ADDP  Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

a → Vm.2S

Results

Vd.S[0] → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[concat, 2*e, esize];
    element2 = Elem[concat, (2*e)+1, esize];
    Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

A64

uint32_t vaddvq_u32 (uint32x4_t a)Add across vector

Description

Add across Vector. This instruction adds every vector element in the source SIMD&FP register together, and writes the scalar result to the destination SIMD&FP register.

A64 Instruction

ADDV Sd,Vn.4S

Argument Preparation

a → Vn.4S

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
V[d] = Reduce(ReduceOp_ADD, operand, esize);

Supported architectures

A64

uint64_t vaddvq_u64 (uint64x2_t a)Add pairwise

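Description

Add Pair of elements (scalar). This instruction adds the two vector elements in the source SIMD&FP register and writes the scalar result to the destination SIMD&FP register.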

A64 Instruction

ADDP Dd,Vn.2D

Argument Preparation

a → Vn.2D

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[concat, 2*e, esize];
    element2 = Elem[concat, (2*e)+1, esize];
    Elem[result, e, esize] = element1 + element2;

V[d] = result;

Supported architectures

A64

float32_t vaddv_f32 (float32x2_t a)Floating-point add pairwise

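Description

Floating-point Add Pair of elements (scalar). This instruction adds the two floating-point vector elements in the source SIMD&FP register and writes the scalar result to the destination SIMD&FP register.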

A64 Instruction

FADDP Sd,Vn.2S

Argument Preparation

a → Vn.2S

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPAdd(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32_t vaddvq_f32 (float32x4_t a)Floating-point add pairwise

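Description

Floating-point Add Pairwise, applied twice. The vector form of FADDP, with a supplied as both source operands, adds each pair of adjacent elements of a into the temporary register Vt. The scalar form of FADDP then adds the two low elements of Vt and writes the scalar result to the destination SIMD&FP register.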

A64 Instruction

FADDP Vt.4S,Vn.4S,Vm.4S
FADDP Sd,Vt.2S

Argument Preparation

a → Vn.4S 

a → Vm.4S

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPAdd(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64
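
The two-instruction sequence above groups the additions pairwise, as (a0 + a1) + (a2 + a3), rather than summing left to right. The sketch below models that grouping in portable C. The names addvq_f32_model and sum_lanes_f32 are hypothetical, and the model assumes float arithmetic is evaluated in single precision with default rounding.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Model of the sequence, with a supplied as both Vn and Vm:
     FADDP Vt.4S,Vn.4S,Vm.4S   gives Vt = {a0+a1, a2+a3, a0+a1, a2+a3}
     FADDP Sd,Vt.2S            gives Sd = Vt[0] + Vt[1]               */
float addvq_f32_model(const float a[4]) {
    float t0 = a[0] + a[1];
    float t1 = a[2] + a[3];
    return t0 + t1;
}

/* Hypothetical wrapper: add the four lanes stored at a[0..3]. */
float sum_lanes_f32(const float a[4]) {
    assert(a != NULL);
#if defined(__aarch64__)
    return vaddvq_f32(vld1q_f32(a));
#else
    return addvq_f32_model(a);
#endif
}
```

Since floating-point addition is not associative, the pairwise grouping can give a different result from a sequential loop. With the lanes {1e8f, 1.0f, -1e8f, 1.0f}, for instance, the model returns 0.0f, whereas a left-to-right single-precision sum gives 1.0f.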

float64_t vaddvq_f64 (float64x2_t a)Floating-point add pairwise

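Description

Floating-point Add Pair of elements (scalar). This instruction adds the two floating-point vector elements in the source SIMD&FP register and writes the scalar result to the destination SIMD&FP register.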

A64 Instruction

FADDP Dd,Vn.2D

Argument Preparation

a → Vn.2D

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];
    Elem[result, e, esize] = FPAdd(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

int16_t vaddlv_s8 (int8x8_t a)Signed add long across vector

Description

Signed Add Long across Vector. This instruction adds every vector element in the source SIMD&FP register together, and writes the scalar result to the destination SIMD&FP register. The destination scalar is twice as long as the source vector elements. All the values in this instruction are signed integer values.

A64 Instruction

SADDLV Hd,Vn.8B

Argument Preparation

a → Vn.8B

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer sum;

sum = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    sum = sum + Int(Elem[operand, e, esize], unsigned);

V[d] = sum<2*esize-1:0>;

Supported architectures

A64
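
To make the widening behaviour concrete, the sketch below models SADDLV for eight signed 8-bit lanes. Note that `unsigned` in the pseudocode above is a decode-time flag, which is FALSE for the signed form, so each element is read as a signed value. The names addlv_s8_model and sum_long_s8 are hypothetical helpers for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Model of SADDLV Hd,Vn.8B: accumulate in an unbounded integer, then keep
   2*esize = 16 bits. With eight 8-bit lanes the sum always fits. */
int16_t addlv_s8_model(const int8_t lanes[8]) {
    int sum = 0;                  /* "integer sum" in the pseudocode */
    for (int e = 0; e < 8; e++)
        sum += lanes[e];          /* signed read: the unsigned flag is FALSE */
    return (int16_t)sum;          /* |sum| <= 8 * 128, no truncation occurs */
}

/* Hypothetical wrapper: widening sum of the eight lanes at lanes[0..7]. */
int16_t sum_long_s8(const int8_t lanes[8]) {
    assert(lanes != NULL);
#if defined(__aarch64__)
    return vaddlv_s8(vld1_s8(lanes));   /* SADDLV Hd,Vn.8B */
#else
    return addlv_s8_model(lanes);
#endif
}
```

Eight lanes of 127 give 1016 here, a value that would not be representable in the 8-bit result of vaddv_s8.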

int16_t vaddlvq_s8 (int8x16_t a)Signed add long across vector

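Description

Signed Add Long across Vector. This instruction adds every vector element in the source SIMD&FP register together, and writes the scalar result to the destination SIMD&FP register. The destination scalar is twice as long as the source vector elements. All the values in this instruction are signed integer values.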

A64 Instruction

SADDLV Hd,Vn.16B

Argument Preparation

a → Vn.16B

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer sum;

sum = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    sum = sum + Int(Elem[operand, e, esize], unsigned);

V[d] = sum<2*esize-1:0>;

Supported architectures

A64

int32_t vaddlv_s16 (int16x4_t a)Signed add long across vector

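Description

Signed Add Long across Vector. This instruction adds every vector element in the source SIMD&FP register together, and writes the scalar result to the destination SIMD&FP register. The destination scalar is twice as long as the source vector elements. All the values in this instruction are signed integer values.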

A64 Instruction

SADDLV Sd,Vn.4H

Argument Preparation

a → Vn.4H

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer sum;

sum = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    sum = sum + Int(Elem[operand, e, esize], unsigned);

V[d] = sum<2*esize-1:0>;

Supported architectures

A64

int32_t vaddlvq_s16 (int16x8_t a)Signed add long across vector

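Description

Signed Add Long across Vector. This instruction adds every vector element in the source SIMD&FP register together, and writes the scalar result to the destination SIMD&FP register. The destination scalar is twice as long as the source vector elements. All the values in this instruction are signed integer values.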

A64 Instruction

SADDLV Sd,Vn.8H

Argument Preparation

a → Vn.8H

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer sum;

sum = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    sum = sum + Int(Elem[operand, e, esize], unsigned);

V[d] = sum<2*esize-1:0>;

Supported architectures

A64

int64_t vaddlv_s32 (int32x2_t a)Signed add long pairwise

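Description

Signed Add Long Pairwise. This instruction adds the pair of adjacent signed integer elements in the source SIMD&FP register and writes the result to the destination SIMD&FP register. The destination element is twice as long as the source vector elements.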

A64 Instruction

SADDLP Vd.1D,Vn.2S

Argument Preparation

a → Vn.2S

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

A64
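
For two 32-bit lanes the widening sum is a single SADDLP pair addition, and the 64-bit result cannot overflow. A minimal sketch, assuming the hypothetical helper names addlp_s32_model and sum_pair_long_s32:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Model of SADDLP Vd.1D,Vn.2S without accumulation: both elements are
   sign-extended to 64 bits before they are added. */
int64_t addlp_s32_model(const int32_t lanes[2]) {
    return (int64_t)lanes[0] + (int64_t)lanes[1];
}

/* Hypothetical wrapper: widening sum of the two lanes at lanes[0..1]. */
int64_t sum_pair_long_s32(const int32_t lanes[2]) {
    assert(lanes != NULL);
#if defined(__aarch64__)
    return vaddlv_s32(vld1_s32(lanes));   /* SADDLP Vd.1D,Vn.2S */
#else
    return addlp_s32_model(lanes);
#endif
}
```

Adding INT32_MAX to itself returns 4294967294, which shows that no 32-bit wraparound takes place.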

int64_t vaddlvq_s32 (int32x4_t a)Signed add long across vector

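Description

Signed Add Long across Vector. This instruction adds every vector element in the source SIMD&FP register together, and writes the scalar result to the destination SIMD&FP register. The destination scalar is twice as long as the source vector elements. All the values in this instruction are signed integer values.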

A64 Instruction

SADDLV Dd,Vn.4S

Argument Preparation

a → Vn.4S

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer sum;

sum = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    sum = sum + Int(Elem[operand, e, esize], unsigned);

V[d] = sum<2*esize-1:0>;

Supported architectures

A64

uint16_t vaddlv_u8 (uint8x8_t a)Unsigned sum long across vector

Description

Unsigned sum Long across Vector. This instruction adds every vector element in the source SIMD&FP register together, and writes the scalar result to the destination SIMD&FP register. The destination scalar is twice as long as the source vector elements. All the values in this instruction are unsigned integer values.

A64 Instruction

UADDLV Hd,Vn.8B

Argument Preparation

a → Vn.8B

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer sum;

sum = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    sum = sum + Int(Elem[operand, e, esize], unsigned);

V[d] = sum<2*esize-1:0>;

Supported architectures

A64

uint16_t vaddlvq_u8 (uint8x16_t a)Unsigned sum long across vector

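Description

Unsigned sum Long across Vector. This instruction adds every vector element in the source SIMD&FP register together, and writes the scalar result to the destination SIMD&FP register. The destination scalar is twice as long as the source vector elements. All the values in this instruction are unsigned integer values.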

A64 Instruction

UADDLV Hd,Vn.16B

Argument Preparation

a → Vn.16B

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer sum;

sum = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    sum = sum + Int(Elem[operand, e, esize], unsigned);

V[d] = sum<2*esize-1:0>;

Supported architectures

A64
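
A typical use of the 16-lane unsigned form is summing a 16-byte block, since the largest possible total (16 * 255 = 4080) fits in the 16-bit result. The following sketch illustrates this; the names addlvq_u8_model and sum_bytes16 are hypothetical helpers, and the portable model stands in for the intrinsic on non-AArch64 builds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Model of UADDLV Hd,Vn.16B: every byte is read as an unsigned value and
   added into an integer; the low 16 bits are the result. */
uint16_t addlvq_u8_model(const uint8_t bytes[16]) {
    unsigned sum = 0;
    for (int e = 0; e < 16; e++)
        sum += bytes[e];
    return (uint16_t)sum;   /* at most 4080, so nothing is lost */
}

/* Hypothetical wrapper: sum of the 16 bytes stored at bytes[0..15]. */
uint16_t sum_bytes16(const uint8_t bytes[16]) {
    assert(bytes != NULL);
#if defined(__aarch64__)
    return vaddlvq_u8(vld1q_u8(bytes));   /* UADDLV Hd,Vn.16B */
#else
    return addlvq_u8_model(bytes);
#endif
}
```

Longer buffers could be processed 16 bytes at a time, accumulating the per-block results in a wider scalar.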

uint32_t vaddlv_u16 (uint16x4_t a)Unsigned sum long across vector

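Description

Unsigned sum Long across Vector. This instruction adds every vector element in the source SIMD&FP register together, and writes the scalar result to the destination SIMD&FP register. The destination scalar is twice as long as the source vector elements. All the values in this instruction are unsigned integer values.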

A64 Instruction

UADDLV Sd,Vn.4H

Argument Preparation

a → Vn.4H

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer sum;

sum = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    sum = sum + Int(Elem[operand, e, esize], unsigned);

V[d] = sum<2*esize-1:0>;

Supported architectures

A64

uint32_t vaddlvq_u16 (uint16x8_t a)Unsigned sum long across vector

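Description

Unsigned sum Long across Vector. This instruction adds every vector element in the source SIMD&FP register together, and writes the scalar result to the destination SIMD&FP register. The destination scalar is twice as long as the source vector elements. All the values in this instruction are unsigned integer values.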

A64 Instruction

UADDLV Sd,Vn.8H

Argument Preparation

a → Vn.8H

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer sum;

sum = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    sum = sum + Int(Elem[operand, e, esize], unsigned);

V[d] = sum<2*esize-1:0>;

Supported architectures

A64

uint64_t vaddlv_u32 (uint32x2_t a)Unsigned add long pairwise

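Description

Unsigned Add Long Pairwise. This instruction adds the pair of adjacent unsigned integer elements in the source SIMD&FP register and writes the result to the destination SIMD&FP register. The destination element is twice as long as the source vector elements.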

A64 Instruction

UADDLP Vd.1D,Vn.2S

Argument Preparation

a → Vn.2S

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;

bits(2*esize) sum;
integer op1;
integer op2;

result = if acc then V[d] else Zeros();
for e = 0 to elements-1
    op1 = Int(Elem[operand, 2*e+0, esize], unsigned);
    op2 = Int(Elem[operand, 2*e+1, esize], unsigned);
    sum = (op1+op2)<2*esize-1:0>;
    Elem[result, e, 2*esize] = Elem[result, e, 2*esize] + sum;

V[d] = result;

Supported architectures

A64

uint64_t vaddlvq_u32 (uint32x4_t a)Unsigned sum long across vector

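Description

Unsigned sum Long across Vector. This instruction adds every vector element in the source SIMD&FP register together, and writes the scalar result to the destination SIMD&FP register. The destination scalar is twice as long as the source vector elements. All the values in this instruction are unsigned integer values.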

A64 Instruction

UADDLV Dd,Vn.4S

Argument Preparation

a → Vn.4S

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer sum;

sum = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    sum = sum + Int(Elem[operand, e, esize], unsigned);

V[d] = sum<2*esize-1:0>;

Supported architectures

A64

int8_t vmaxv_s8 (int8x8_t a)Signed maximum across vector

Description

Signed Maximum across Vector. This instruction compares all the vector elements in the source SIMD&FP register, and writes the largest of the values as a scalar to the destination SIMD&FP register. All the values in this instruction are signed integer values.

A64 Instruction

SMAXV Bd,Vn.8B

Argument Preparation

a → Vn.8B

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;

Supported architectures

A64
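
The reduction loop above can be sketched in C as follows, for eight signed 8-bit lanes. The names maxv_s8_model and max_lane_s8 are hypothetical helpers for this example; on AArch64 builds the wrapper calls vmaxv_s8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Model of SMAXV Bd,Vn.8B: start from element 0 and fold in the rest.
   Elements are read as signed values (the pseudocode's unsigned flag is
   FALSE for the signed form). */
int8_t maxv_s8_model(const int8_t lanes[8]) {
    int maxmin = lanes[0];
    for (int e = 1; e < 8; e++) {
        int element = lanes[e];
        if (element > maxmin)
            maxmin = element;
    }
    return (int8_t)maxmin;   /* the maximum is one of the inputs, so it fits */
}

/* Hypothetical wrapper: largest of the eight lanes at lanes[0..7]. */
int8_t max_lane_s8(const int8_t lanes[8]) {
    assert(lanes != NULL);
#if defined(__aarch64__)
    return vmaxv_s8(vld1_s8(lanes));   /* SMAXV Bd,Vn.8B */
#else
    return maxv_s8_model(lanes);
#endif
}
```

Unlike the add-across-vector intrinsics, the result here cannot overflow, because it is always equal to one of the input lanes.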

int8_t vmaxvq_s8 (int8x16_t a)Signed maximum across vector

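Description

Signed Maximum across Vector. This instruction compares all the vector elements in the source SIMD&FP register, and writes the largest of the values as a scalar to the destination SIMD&FP register. All the values in this instruction are signed integer values.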

A64 Instruction

SMAXV Bd,Vn.16B

Argument Preparation

a → Vn.16B

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;

Supported architectures

A64

int16_t vmaxv_s16 (int16x4_t a)Signed maximum across vector

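Description

Signed Maximum across Vector. This instruction compares all the vector elements in the source SIMD&FP register, and writes the largest of the values as a scalar to the destination SIMD&FP register. All the values in this instruction are signed integer values.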

A64 Instruction

SMAXV Hd,Vn.4H

Argument Preparation

a → Vn.4H

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;

Supported architectures

A64

int16_t vmaxvq_s16 (int16x8_t a)Signed maximum across vector

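Description

Signed Maximum across Vector. This instruction compares all the vector elements in the source SIMD&FP register, and writes the largest of the values as a scalar to the destination SIMD&FP register. All the values in this instruction are signed integer values.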

A64 Instruction

SMAXV Hd,Vn.8H

Argument Preparation

a → Vn.8H

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;

Supported architectures

A64

int32_t vmaxv_s32 (int32x2_t a)Signed maximum pairwise

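Description

Signed Maximum Pairwise. This instruction concatenates the vector elements of the two source SIMD&FP registers, compares each pair of adjacent elements, and writes the larger of each pair into a vector in the destination SIMD&FP register. All the values in this instruction are signed integer values. Because a is supplied as both source operands, element 0 of the destination holds the maximum of the two elements of a.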

A64 Instruction

SMAXP  Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

a → Vm.2S

Results

Vd.S[0] → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

A64

int32_t vmaxvq_s32 (int32x4_t a)Signed maximum across vector

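Description

Signed Maximum across Vector. This instruction compares all the vector elements in the source SIMD&FP register, and writes the largest of the values as a scalar to the destination SIMD&FP register. All the values in this instruction are signed integer values.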

A64 Instruction

SMAXV Sd,Vn.4S

Argument Preparation

a → Vn.4S

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;

Supported architectures

A64

uint8_t vmaxv_u8 (uint8x8_t a)Unsigned maximum across vector

Description

Unsigned Maximum across Vector. This instruction compares all the vector elements in the source SIMD&FP register, and writes the largest of the values as a scalar to the destination SIMD&FP register. All the values in this instruction are unsigned integer values.

A64 Instruction

UMAXV Bd,Vn.8B

Argument Preparation

a → Vn.8B

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;

Supported architectures

A64
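
The unsigned form differs from SMAXV only in how each element's bits are interpreted. In the sketch below, a lane holding 0xFF counts as 255 and therefore wins the comparison, whereas a signed comparison would treat the same bits as -1. The names maxv_u8_model and max_lane_u8 are hypothetical helpers for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Model of UMAXV Bd,Vn.8B: elements are read as unsigned values. */
uint8_t maxv_u8_model(const uint8_t lanes[8]) {
    uint8_t maxmin = lanes[0];
    for (int e = 1; e < 8; e++) {
        if (lanes[e] > maxmin)
            maxmin = lanes[e];
    }
    return maxmin;
}

/* Hypothetical wrapper: largest of the eight lanes at lanes[0..7]. */
uint8_t max_lane_u8(const uint8_t lanes[8]) {
    assert(lanes != NULL);
#if defined(__aarch64__)
    return vmaxv_u8(vld1_u8(lanes));   /* UMAXV Bd,Vn.8B */
#else
    return maxv_u8_model(lanes);
#endif
}
```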

uint8_t vmaxvq_u8 (uint8x16_t a)Unsigned maximum across vector

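Description

Unsigned Maximum across Vector. This instruction compares all the vector elements in the source SIMD&FP register, and writes the largest of the values as a scalar to the destination SIMD&FP register. All the values in this instruction are unsigned integer values.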

A64 Instruction

UMAXV Bd,Vn.16B

Argument Preparation

a → Vn.16B

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;

Supported architectures

A64

uint16_t vmaxv_u16 (uint16x4_t a)Unsigned maximum across vector

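Description

Unsigned Maximum across Vector. This instruction compares all the vector elements in the source SIMD&FP register, and writes the largest of the values as a scalar to the destination SIMD&FP register. All the values in this instruction are unsigned integer values.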

A64 Instruction

UMAXV Hd,Vn.4H

Argument Preparation

a → Vn.4H

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;

Supported architectures

A64

uint16_t vmaxvq_u16 (uint16x8_t a)Unsigned maximum across vector

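Description

Unsigned Maximum across Vector. This instruction compares all the vector elements in the source SIMD&FP register, and writes the largest of the values as a scalar to the destination SIMD&FP register. All the values in this instruction are unsigned integer values.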

A64 Instruction

UMAXV Hd,Vn.8H

Argument Preparation

a → Vn.8H

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;

Supported architectures

A64

uint32_t vmaxv_u32 (uint32x2_t a)Unsigned maximum pairwise

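Description

Unsigned Maximum Pairwise. This instruction concatenates the vector elements of the two source SIMD&FP registers, compares each pair of adjacent elements, and writes the larger of each pair into a vector in the destination SIMD&FP register. All the values in this instruction are unsigned integer values. Because a is supplied as both source operands, element 0 of the destination holds the maximum of the two elements of a.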

A64 Instruction

UMAXP  Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

a → Vm.2S

Results

Vd.S[0] → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint32_t vmaxvq_u32 (uint32x4_t a)Unsigned maximum across vector

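Description

Unsigned Maximum across Vector. This instruction compares all the vector elements in the source SIMD&FP register, and writes the largest of the values as a scalar to the destination SIMD&FP register. All the values in this instruction are unsigned integer values.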

A64 Instruction

UMAXV Sd,Vn.4S

Argument Preparation

a → Vn.4S

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;

Supported architectures

A64

float32_t vmaxv_f32 (float32x2_t a)Floating-point maximum pairwise

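Description

Floating-point Maximum of Pair of elements (scalar). This instruction compares the two vector elements in the source SIMD&FP register and writes the larger of the floating-point values as a scalar to the destination SIMD&FP register.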

A64 Instruction

FMAXP Sd,Vn.2S

Argument Preparation

a → Vn.2S

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMin(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMax(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32_t vmaxvq_f32 (float32x4_t a)Floating-point maximum across vector

Description

Floating-point Maximum across Vector. This instruction compares all the vector elements in the source SIMD&FP register, and writes the largest of the values as a scalar to the destination SIMD&FP register. All the values in this instruction are floating-point values.

A64 Instruction

FMAXV Sd,Vn.4S

Argument Preparation

a → Vn.4S

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
V[d] = Reduce(ReduceOp_FMAX, operand, esize);

Supported architectures

A64
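
The floating-point reduction can be pictured as a small tree of FPMax operations. The sketch below models it for four lanes under simplifying assumptions: a NaN in any lane produces a generic quiet NaN, and FPCR-controlled behaviour is not modelled. The names fpmax_model, maxvq_f32_model and max_lane_f32 are hypothetical helpers for this example.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Simplified model of FPMax: a NaN input gives a NaN result, and +0.0 is
   treated as larger than -0.0. */
float fpmax_model(float x, float y) {
    if (isnan(x) || isnan(y)) return NAN;
    if (x == 0.0f && y == 0.0f) return signbit(x) ? y : x;
    return x > y ? x : y;
}

/* Model of Reduce(ReduceOp_FMAX, operand, 32) over four lanes, written as
   a two-level tree: each half is reduced, then the halves are combined. */
float maxvq_f32_model(const float a[4]) {
    return fpmax_model(fpmax_model(a[0], a[1]), fpmax_model(a[2], a[3]));
}

/* Hypothetical wrapper: largest of the four lanes at a[0..3]. */
float max_lane_f32(const float a[4]) {
    assert(a != NULL);
#if defined(__aarch64__)
    return vmaxvq_f32(vld1q_f32(a));   /* FMAXV Sd,Vn.4S */
#else
    return maxvq_f32_model(a);
#endif
}
```

If NaN lanes should be ignored rather than propagated, the "maximum number" variants such as vmaxnmvq_f32 are the usual alternative.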

float64_t vmaxvq_f64 (float64x2_t a)Floating-point maximum pairwise

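Description

Floating-point Maximum of Pair of elements (scalar). This instruction compares the two vector elements in the source SIMD&FP register and writes the larger of the floating-point values as a scalar to the destination SIMD&FP register.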

A64 Instruction

FMAXP Dd,Vn.2D

Argument Preparation

a → Vn.2D

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMin(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMax(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

int8_t vminv_s8 (int8x8_t a)Signed minimum across vector

Description

Signed Minimum across Vector. This instruction compares all the vector elements in the source SIMD&FP register, and writes the smallest of the values as a scalar to the destination SIMD&FP register. All the values in this instruction are signed integer values.

A64 Instruction

SMINV Bd,Vn.8B

Argument Preparation

a → Vn.8B

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;

Supported architectures

A64

int8_t vminvq_s8 (int8x16_t a)Signed minimum across vector

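Description

Signed Minimum across Vector. This instruction compares all the vector elements in the source SIMD&FP register, and writes the smallest of the values as a scalar to the destination SIMD&FP register. All the values in this instruction are signed integer values.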

A64 Instruction

SMINV Bd,Vn.16B

Argument Preparation

a → Vn.16B

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;

Supported architectures

A64

int16_t vminv_s16 (int16x4_t a)Signed minimum across vector

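Description

Signed Minimum across Vector. This instruction compares all the vector elements in the source SIMD&FP register, and writes the smallest of the values as a scalar to the destination SIMD&FP register. All the values in this instruction are signed integer values.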

A64 Instruction

SMINV Hd,Vn.4H

Argument Preparation

a → Vn.4H

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;

Supported architectures

A64

int16_t vminvq_s16 (int16x8_t a)Signed minimum across vector

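Description

Signed Minimum across Vector. This instruction compares all the vector elements in the source SIMD&FP register, and writes the smallest of the values as a scalar to the destination SIMD&FP register. All the values in this instruction are signed integer values.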

A64 Instruction

SMINV Hd,Vn.8H

Argument Preparation

a → Vn.8H

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;

Supported architectures

A64

int32_t vminv_s32 (int32x2_t a)Signed minimum pairwise

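Description

Signed Minimum Pairwise. This instruction concatenates the vector elements of the two source SIMD&FP registers, compares each pair of adjacent elements, and writes the smaller of each pair into a vector in the destination SIMD&FP register. All the values in this instruction are signed integer values. Because a is supplied as both source operands, element 0 of the destination holds the minimum of the two elements of a.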

A64 Instruction

SMINP  Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

a → Vm.2S

Results

Vd.S[0] → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

A64

int32_t vminvq_s32 (int32x4_t a)Signed minimum across vector

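Description

Signed Minimum across Vector. This instruction compares all the vector elements in the source SIMD&FP register, and writes the smallest of the values as a scalar to the destination SIMD&FP register. All the values in this instruction are signed integer values.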

A64 Instruction

SMINV Sd,Vn.4S

Argument Preparation

a → Vn.4S

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;

Supported architectures

A64

uint8_t vminv_u8 (uint8x8_t a)Unsigned minimum across vector

Description

Unsigned Minimum across Vector. This instruction compares all the vector elements in the source SIMD&FP register, and writes the smallest of the values as a scalar to the destination SIMD&FP register. All the values in this instruction are unsigned integer values.

A64 Instruction

UMINV Bd,Vn.8B

Argument Preparation

a → Vn.8B

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;

Supported architectures

A64
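
Example (illustrative only): a minimal scalar C sketch of the across-vector reduction described for UMINV. The function name model_uminv_u8 is hypothetical; it follows the Operation pseudocode on a plain array of lanes and does not call the intrinsic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of UMINV Bd,Vn.8B (elements == 8)
 * or UMINV Bd,Vn.16B (elements == 16). */
static uint8_t model_uminv_u8(const uint8_t *lanes, size_t elements)
{
    assert(elements == 8 || elements == 16);

    /* Start from element 0, then fold in elements 1..elements-1. */
    unsigned maxmin = lanes[0];
    for (size_t e = 1; e < elements; ++e) {
        unsigned element = lanes[e];   /* unsigned interpretation of the lane */
        if (element < maxmin)
            maxmin = element;
    }
    return (uint8_t)maxmin;            /* low esize bits written to the scalar */
}
```

For the lanes {200, 17, 255, 3, 90, 3, 128, 64} the model returns 3. The comparison is unsigned, so 200 and 255 are treated as large values rather than as negative ones.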

uint8_t vminvq_u8 (uint8x16_t a)Unsigned minimum across vector

Description

A64 Instruction

UMINV Bd,Vn.16B

Argument Preparation

a → Vn.16B

Results

Bd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;

Supported architectures

A64

uint16_t vminv_u16 (uint16x4_t a)Unsigned minimum across vector

Description

A64 Instruction

UMINV Hd,Vn.4H

Argument Preparation

a → Vn.4H

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;

Supported architectures

A64

uint16_t vminvq_u16 (uint16x8_t a)Unsigned minimum across vector

Description

A64 Instruction

UMINV Hd,Vn.8H

Argument Preparation

a → Vn.8H

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;

Supported architectures

A64

uint32_t vminv_u32 (uint32x2_t a)Unsigned minimum pairwise

Description

A64 Instruction

UMINP Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

a → Vm.2S

Results

Vd.S[0] → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
integer element1;
integer element2;
integer maxmin;

for e = 0 to elements-1
    element1 = Int(Elem[concat, 2*e, esize], unsigned);
    element2 = Int(Elem[concat, (2*e)+1, esize], unsigned);
    maxmin = if minimum then Min(element1, element2) else Max(element1, element2);
    Elem[result, e, esize] = maxmin<esize-1:0>;

V[d] = result;

Supported architectures

A64

uint32_t vminvq_u32 (uint32x4_t a)Unsigned minimum across vector

Description

A64 Instruction

UMINV Sd,Vn.4S

Argument Preparation

a → Vn.4S

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
integer maxmin;
integer element;

maxmin = Int(Elem[operand, 0, esize], unsigned);
for e = 1 to elements-1
    element = Int(Elem[operand, e, esize], unsigned);
    maxmin = if min then Min(maxmin, element) else Max(maxmin, element);

V[d] = maxmin<esize-1:0>;

Supported architectures

A64

float32_t vminv_f32 (float32x2_t a)Floating-point minimum pairwise

Description

A64 Instruction

FMINP Sd,Vn.2S

Argument Preparation

a → Vn.2S

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMin(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMax(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32_t vminvq_f32 (float32x4_t a)Floating-point minimum across vector

Description

Floating-point Minimum across Vector. This instruction compares all the vector elements in the source SIMD&FP register, and writes the smallest of the values as a scalar to the destination SIMD&FP register. All the values in this instruction are floating-point values.

A64 Instruction

FMINV Sd,Vn.4S

Argument Preparation

a → Vn.4S

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
V[d] = Reduce(ReduceOp_FMIN, operand, esize);

Supported architectures

A64
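
Example (illustrative only): a scalar C sketch of the FMINV reduction over four single-precision lanes. The helpers model_fpmin and model_fminv_4s are hypothetical names. The sketch assumes the default floating-point environment: a NaN in any lane makes the result a NaN, and -0.0 is treated as smaller than +0.0. FPCR-controlled modes, NaN payloads, signalling NaNs and exception flags are not modelled, and the pairwise order of combination is an assumption that does not change the numeric result.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical two-operand minimum with NaN propagation (FPCR not modelled). */
static float model_fpmin(float x, float y)
{
    if (isnan(x)) return x;              /* any NaN operand gives a NaN result */
    if (isnan(y)) return y;
    if (x == 0.0f && y == 0.0f)
        return signbit(x) ? x : y;       /* -0.0 is considered smaller than +0.0 */
    return (x < y) ? x : y;
}

/* Hypothetical scalar model of FMINV Sd,Vn.4S. */
static float model_fminv_4s(const float v[4])
{
    assert(v != NULL);
    float lo = model_fpmin(v[0], v[1]);  /* lower half of the vector */
    float hi = model_fpmin(v[2], v[3]);  /* upper half of the vector */
    return model_fpmin(lo, hi);
}
```

For {1.5, -2.0, 8.0, 0.25} the model returns -2.0. Replacing any lane with a NaN makes the result a NaN.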

float64_t vminvq_f64 (float64x2_t a)Floating-point minimum pairwise

Description

A64 Instruction

FMINP Dd,Vn.2D

Argument Preparation

a → Vn.2D

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMin(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMax(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32_t vmaxnmv_f32 (float32x2_t a)Floating-point maximum number pairwise

Description

A64 Instruction

FMAXNMP Sd,Vn.2S

Argument Preparation

a → Vn.2S

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32_t vmaxnmvq_f32 (float32x4_t a)Floating-point maximum number across vector

Description

Floating-point Maximum Number across Vector. This instruction compares all the vector elements in the source SIMD&FP register, and writes the largest of the values as a scalar to the destination SIMD&FP register. All the values in this instruction are floating-point values.

A64 Instruction

FMAXNMV Sd,Vn.4S

Argument Preparation

a → Vn.4S

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
V[d] = Reduce(ReduceOp_FMAXNUM, operand, esize);

Supported architectures

A64
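
Example (illustrative only): a scalar C sketch of the "maximum number" reduction. It assumes the IEEE 754-2008 maxNum rule that the Number variants follow: when exactly one operand is a quiet NaN, the numeric operand is returned. The helpers model_fpmaxnum and model_fmaxnmv_4s are hypothetical names; signalling NaNs, FPCR modes and exception flags are not modelled.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical two-operand maxNum: a quiet NaN is treated as missing data. */
static float model_fpmaxnum(float x, float y)
{
    if (isnan(x)) return y;              /* if y is also a NaN, the result is a NaN */
    if (isnan(y)) return x;
    if (x == 0.0f && y == 0.0f)
        return signbit(x) ? y : x;       /* +0.0 is considered larger than -0.0 */
    return (x > y) ? x : y;
}

/* Hypothetical scalar model of FMAXNMV Sd,Vn.4S. */
static float model_fmaxnmv_4s(const float v[4])
{
    assert(v != NULL);
    float lo = model_fpmaxnum(v[0], v[1]);
    float hi = model_fpmaxnum(v[2], v[3]);
    return model_fpmaxnum(lo, hi);
}
```

For {1.0, NaN, 3.0, -4.0} the model returns 3.0, whereas a NaN-propagating maximum would return a NaN. The result is a NaN only when every lane is a NaN.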

float64_t vmaxnmvq_f64 (float64x2_t a)Floating-point maximum number pairwise

Description

A64 Instruction

FMAXNMP Dd,Vn.2D

Argument Preparation

a → Vn.2D

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32_t vminnmv_f32 (float32x2_t a)Floating-point minimum number pairwise

Description

A64 Instruction

FMINNMP Sd,Vn.2S

Argument Preparation

a → Vn.2S

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32_t vminnmvq_f32 (float32x4_t a)Floating-point minimum number across vector

Description

Floating-point Minimum Number across Vector. This instruction compares all the vector elements in the source SIMD&FP register, and writes the smallest of the values as a scalar to the destination SIMD&FP register. All the values in this instruction are floating-point values.

A64 Instruction

FMINNMV Sd,Vn.4S

Argument Preparation

a → Vn.4S

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
V[d] = Reduce(ReduceOp_FMINNUM, operand, esize);

Supported architectures

A64

float64_t vminnmvq_f64 (float64x2_t a)Floating-point minimum number pairwise

Description

A64 Instruction

FMINNMP Dd,Vn.2D

Argument Preparation

a → Vn.2D

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;
bits(2*datasize) concat = operand2:operand1;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    if pair then
        element1 = Elem[concat, 2*e, esize];
        element2 = Elem[concat, (2*e)+1, esize];
    else
        element1 = Elem[operand1, e, esize];
        element2 = Elem[operand2, e, esize];

    if minimum then
        Elem[result, e, esize] = FPMinNum(element1, element2, FPCR);
    else
        Elem[result, e, esize] = FPMaxNum(element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

int8x8_t vext_s8 (int8x8_t a, int8x8_t b, const int n)Extract vector from pair of vectors

Description

Extract vector from pair of vectors. This instruction extracts the lowest vector elements from the second source SIMD&FP register and the highest vector elements from the first source SIMD&FP register, concatenates the results into a vector, and writes the vector to the destination SIMD&FP register vector. The index value specifies the lowest vector element to extract from the first source register, and consecutive elements are extracted from the first, then second, source registers until the destination vector is filled.

A64 Instruction

EXT Vd.8B,Vn.8B,Vm.8B,#n

Argument Preparation

a → Vn.8B 

b → Vm.8B 

0 <= n <= 7

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

v7/A32/A64
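
Example (illustrative only): a scalar C sketch of the EXT operation for 8-byte vectors. The function name model_ext_8b is hypothetical. It builds the 16-byte concatenation hi:lo, with a in the low half and b in the high half, and copies 8 bytes starting at byte index n.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of EXT Vd.8B,Vn.8B,Vm.8B,#n (a -> Vn, b -> Vm). */
static void model_ext_8b(const int8_t a[8], const int8_t b[8],
                         unsigned n, int8_t out[8])
{
    assert(n <= 7);                 /* the index must name a lane of a */

    int8_t concat[16];
    memcpy(concat, a, 8);           /* lo = V[n]: lanes 0..7  */
    memcpy(concat + 8, b, 8);       /* hi = V[m]: lanes 8..15 */

    /* V[d] = concat<position+datasize-1:position>, with position = n bytes. */
    memcpy(out, concat + n, 8);
}
```

With a = {0, ..., 7}, b = {8, ..., 15} and n = 3, the model produces {3, 4, 5, 6, 7, 8, 9, 10}: the top five lanes of a followed by the bottom three lanes of b. With n = 0 the result is a unchanged.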

int8x16_t vextq_s8 (int8x16_t a, int8x16_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.16B,Vn.16B,Vm.16B,#n

Argument Preparation

a → Vn.16B 

b → Vm.16B 

0 <= n <= 15

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

v7/A32/A64

int16x4_t vext_s16 (int16x4_t a, int16x4_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.8B,Vn.8B,Vm.8B,#(n<<1)

Argument Preparation

a → Vn.8B 

b → Vm.8B 

0 <= n <= 3

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

v7/A32/A64
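
Example (illustrative only): the EXT immediate counts bytes, so the element index n of vext_s16 is scaled by the element size before it is used, which is the #(n<<1) in the instruction form above. The sketch below models this on plain arrays; model_vext_s16 is a hypothetical name, not the intrinsic.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of vext_s16: EXT Vd.8B,Vn.8B,Vm.8B,#(n<<1). */
static void model_vext_s16(const int16_t a[4], const int16_t b[4],
                           unsigned n, int16_t out[4])
{
    assert(n <= 3);                  /* n is an element index, not a byte index */

    unsigned imm = n << 1;           /* byte immediate: 2 bytes per 16-bit lane */

    uint8_t concat[16];
    memcpy(concat, a, 8);            /* low half:  a */
    memcpy(concat + 8, b, 8);        /* high half: b */
    memcpy(out, concat + imm, 8);    /* 8 bytes starting at the scaled offset */
}
```

With a = {10, 11, 12, 13}, b = {20, 21, 22, 23} and n = 1, the byte immediate is 2 and the model produces {11, 12, 13, 20}. The same scaling gives n<<2 for 32-bit lanes and n<<3 for 64-bit lanes.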

int16x8_t vextq_s16 (int16x8_t a, int16x8_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.16B,Vn.16B,Vm.16B,#(n<<1)

Argument Preparation

a → Vn.16B 

b → Vm.16B 

0 <= n <= 7

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

v7/A32/A64

int32x2_t vext_s32 (int32x2_t a, int32x2_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.8B,Vn.8B,Vm.8B,#(n<<2)

Argument Preparation

a → Vn.8B 

b → Vm.8B 

0 <= n <= 1

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

v7/A32/A64

int32x4_t vextq_s32 (int32x4_t a, int32x4_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.16B,Vn.16B,Vm.16B,#(n<<2)

Argument Preparation

a → Vn.16B 

b → Vm.16B 

0 <= n <= 3

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

v7/A32/A64

int64x1_t vext_s64 (int64x1_t a, int64x1_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.8B,Vn.8B,Vm.8B,#(n<<3)

Argument Preparation

a → Vn.8B 

b → Vm.8B 

0 <= n <= 0

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

v7/A32/A64

int64x2_t vextq_s64 (int64x2_t a, int64x2_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.16B,Vn.16B,Vm.16B,#(n<<3)

Argument Preparation

a → Vn.16B 

b → Vm.16B 

0 <= n <= 1

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

v7/A32/A64

uint8x8_t vext_u8 (uint8x8_t a, uint8x8_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.8B,Vn.8B,Vm.8B,#n

Argument Preparation

a → Vn.8B 

b → Vm.8B 

0 <= n <= 7

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

v7/A32/A64

uint8x16_t vextq_u8 (uint8x16_t a, uint8x16_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.16B,Vn.16B,Vm.16B,#n

Argument Preparation

a → Vn.16B 

b → Vm.16B 

0 <= n <= 15

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

v7/A32/A64

uint16x4_t vext_u16 (uint16x4_t a, uint16x4_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.8B,Vn.8B,Vm.8B,#(n<<1)

Argument Preparation

a → Vn.8B 

b → Vm.8B 

0 <= n <= 3

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

v7/A32/A64

uint16x8_t vextq_u16 (uint16x8_t a, uint16x8_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.16B,Vn.16B,Vm.16B,#(n<<1)

Argument Preparation

a → Vn.16B 

b → Vm.16B 

0 <= n <= 7

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

v7/A32/A64

uint32x2_t vext_u32 (uint32x2_t a, uint32x2_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.8B,Vn.8B,Vm.8B,#(n<<2)

Argument Preparation

a → Vn.8B 

b → Vm.8B 

0 <= n <= 1

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

v7/A32/A64

uint32x4_t vextq_u32 (uint32x4_t a, uint32x4_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.16B,Vn.16B,Vm.16B,#(n<<2)

Argument Preparation

a → Vn.16B 

b → Vm.16B 

0 <= n <= 3

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

v7/A32/A64

uint64x1_t vext_u64 (uint64x1_t a, uint64x1_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.8B,Vn.8B,Vm.8B,#(n<<3)

Argument Preparation

a → Vn.8B 

b → Vm.8B 

0 <= n <= 0

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

v7/A32/A64

uint64x2_t vextq_u64 (uint64x2_t a, uint64x2_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.16B,Vn.16B,Vm.16B,#(n<<3)

Argument Preparation

a → Vn.16B 

b → Vm.16B 

0 <= n <= 1

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

v7/A32/A64

poly64x1_t vext_p64 (poly64x1_t a, poly64x1_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.8B,Vn.8B,Vm.8B,#(n<<3)

Argument Preparation

a → Vn.8B 

b → Vm.8B 

0 <= n <= 0

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

A32/A64

poly64x2_t vextq_p64 (poly64x2_t a, poly64x2_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.16B,Vn.16B,Vm.16B,#(n<<3)

Argument Preparation

a → Vn.16B 

b → Vm.16B 

0 <= n <= 1

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

A32/A64

float32x2_t vext_f32 (float32x2_t a, float32x2_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.8B,Vn.8B,Vm.8B,#(n<<2)

Argument Preparation

a → Vn.8B 

b → Vm.8B 

0 <= n <= 1

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

v7/A32/A64

float32x4_t vextq_f32 (float32x4_t a, float32x4_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.16B,Vn.16B,Vm.16B,#(n<<2)

Argument Preparation

a → Vn.16B 

b → Vm.16B 

0 <= n <= 3

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

v7/A32/A64

float64x1_t vext_f64 (float64x1_t a, float64x1_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.8B,Vn.8B,Vm.8B,#(n<<3)

Argument Preparation

a → Vn.8B 

b → Vm.8B 

0 <= n <= 0

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

A64

float64x2_t vextq_f64 (float64x2_t a, float64x2_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.16B,Vn.16B,Vm.16B,#(n<<3)

Argument Preparation

a → Vn.16B 

b → Vm.16B 

0 <= n <= 1

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

A64

poly8x8_t vext_p8 (poly8x8_t a, poly8x8_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.8B,Vn.8B,Vm.8B,#n

Argument Preparation

a → Vn.8B 

b → Vm.8B 

0 <= n <= 7

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

v7/A32/A64

poly8x16_t vextq_p8 (poly8x16_t a, poly8x16_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.16B,Vn.16B,Vm.16B,#n

Argument Preparation

a → Vn.16B 

b → Vm.16B 

0 <= n <= 15

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

v7/A32/A64

poly16x4_t vext_p16 (poly16x4_t a, poly16x4_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.8B,Vn.8B,Vm.8B,#(n<<1)

Argument Preparation

a → Vn.8B 

b → Vm.8B 

0 <= n <= 3

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

v7/A32/A64

poly16x8_t vextq_p16 (poly16x8_t a, poly16x8_t b, const int n)Extract vector from pair of vectors

Description

A64 Instruction

EXT Vd.16B,Vn.16B,Vm.16B,#(n<<1)

Argument Preparation

a → Vn.16B 

b → Vm.16B 

0 <= n <= 7

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) hi = V[m];
bits(datasize) lo = V[n];
bits(datasize*2) concat = hi:lo;

V[d] = concat<position+datasize-1:position>;

Supported architectures

v7/A32/A64

int8x8_t vrev64_s8 (int8x8_t vec)Reverse elements in 64-bit doublewords

Description

Reverse elements in 64-bit doublewords (vector). This instruction reverses the order of 8-bit, 16-bit, or 32-bit elements in each doubleword of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

REV64 Vd.8B,Vn.8B

Argument Preparation

vec → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64
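
Example (illustrative only): a scalar C sketch of the REV64 operation, following the container and element loops of the Operation pseudocode. The function name model_rev64 is hypothetical. Sizes are given in bytes here rather than bits, and the source and destination buffers are assumed not to overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scalar model of REV64.
 * datasize: vector size in bytes (8 or 16); esize: element size in bytes (1, 2 or 4).
 * operand and result must be distinct, non-overlapping buffers. */
static void model_rev64(const void *operand, void *result,
                        size_t datasize, size_t esize)
{
    assert(datasize == 8 || datasize == 16);
    assert(esize == 1 || esize == 2 || esize == 4);

    const unsigned char *src = operand;
    unsigned char *dst = result;
    size_t containers = datasize / 8;          /* one container per doubleword */
    size_t elements_per_container = 8 / esize;

    for (size_t c = 0; c < containers; ++c) {
        size_t base = c * elements_per_container;
        for (size_t e = 0; e < elements_per_container; ++e) {
            size_t element = base + e;
            size_t rev_element = base + elements_per_container - 1 - e;
            memcpy(dst + rev_element * esize, src + element * esize, esize);
        }
    }
}
```

For the 8-byte vector {0, 1, 2, 3, 4, 5, 6, 7} with 1-byte elements, the model produces {7, 6, 5, 4, 3, 2, 1, 0}. For a 16-byte vector of 4-byte elements {1, 2, 3, 4}, each doubleword is reversed independently, giving {2, 1, 4, 3}.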

int8x16_t vrev64q_s8 (int8x16_t vec)Reverse elements in 64-bit doublewords

Description

A64 Instruction

REV64 Vd.16B,Vn.16B

Argument Preparation

vec → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vrev64_s16 (int16x4_t vec)Reverse elements in 64-bit doublewords

Description

A64 Instruction

REV64 Vd.4H,Vn.4H

Argument Preparation

vec → Vn.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vrev64q_s16 (int16x8_t vec)Reverse elements in 64-bit doublewords

Description

A64 Instruction

REV64 Vd.8H,Vn.8H

Argument Preparation

vec → Vn.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vrev64_s32 (int32x2_t vec)Reverse elements in 64-bit doublewords

Description

A64 Instruction

REV64 Vd.2S,Vn.2S

Argument Preparation

vec → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vrev64q_s32 (int32x4_t vec)Reverse elements in 64-bit doublewords

Description

A64 Instruction

REV64 Vd.4S,Vn.4S

Argument Preparation

vec → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vrev64_u8 (uint8x8_t vec)Reverse elements in 64-bit doublewords

Description

A64 Instruction

REV64 Vd.8B,Vn.8B

Argument Preparation

vec → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vrev64q_u8 (uint8x16_t vec)Reverse elements in 64-bit doublewords

Description

A64 Instruction

REV64 Vd.16B,Vn.16B

Argument Preparation

vec → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vrev64_u16 (uint16x4_t vec)Reverse elements in 64-bit doublewords

Description

A64 Instruction

REV64 Vd.4H,Vn.4H

Argument Preparation

vec → Vn.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vrev64q_u16 (uint16x8_t vec)Reverse elements in 64-bit doublewords

Description

A64 Instruction

REV64 Vd.8H,Vn.8H

Argument Preparation

vec → Vn.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vrev64_u32 (uint32x2_t vec)Reverse elements in 64-bit doublewords

Description

A64 Instruction

REV64 Vd.2S,Vn.2S

Argument Preparation

vec → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vrev64q_u32 (uint32x4_t vec)Reverse elements in 64-bit doublewords

Description

A64 Instruction

REV64 Vd.4S,Vn.4S

Argument Preparation

vec → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vrev64_f32 (float32x2_t vec)Reverse elements in 64-bit doublewords

Description

A64 Instruction

REV64 Vd.2S,Vn.2S

Argument Preparation

vec → Vn.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

float32x4_t vrev64q_f32 (float32x4_t vec)Reverse elements in 64-bit doublewords

Description

A64 Instruction

REV64 Vd.4S,Vn.4S

Argument Preparation

vec → Vn.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

poly8x8_t vrev64_p8 (poly8x8_t vec)Reverse elements in 64-bit doublewords

Description

A64 Instruction

REV64 Vd.8B,Vn.8B

Argument Preparation

vec → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

poly8x16_t vrev64q_p8 (poly8x16_t vec)Reverse elements in 64-bit doublewords

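Description

Reverse elements in 64-bit doublewords (vector). This instruction reverses the order of 8-bit, 16-bit, or 32-bit elements in each doubleword of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction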

REV64 Vd.16B,Vn.16B

Argument Preparation

vec → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

poly16x4_t vrev64_p16 (poly16x4_t vec)Reverse elements in 64-bit doublewords

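Description

Reverse elements in 64-bit doublewords (vector). This instruction reverses the order of 8-bit, 16-bit, or 32-bit elements in each doubleword of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction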

REV64 Vd.4H,Vn.4H

Argument Preparation

vec → Vn.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

poly16x8_t vrev64q_p16 (poly16x8_t vec)Reverse elements in 64-bit doublewords

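Description

Reverse elements in 64-bit doublewords (vector). This instruction reverses the order of 8-bit, 16-bit, or 32-bit elements in each doubleword of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction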

REV64 Vd.8H,Vn.8H

Argument Preparation

vec → Vn.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

int8x8_t vrev32_s8 (int8x8_t vec)Reverse elements in 32-bit words

Description

Reverse elements in 32-bit words (vector). This instruction reverses the order of 8-bit or 16-bit elements in each word of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

REV32 Vd.8B,Vn.8B

Argument Preparation

vec → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64
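
The container loop in the Operation pseudocode above can be sketched in portable C for the 8-bit case, where each 32-bit word holds four lanes. The name rev32_s8_model and the array-based interface are assumptions for illustration only and are not part of the NEON API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of REV32 on 8-bit lanes (not a NEON API).
   Each 32-bit word holds four bytes whose order is reversed.
   src and dst must not overlap. */
void rev32_s8_model(const int8_t *src, int8_t *dst, size_t lanes)
{
    const size_t per_container = 4; /* 32-bit word / 8-bit lanes */
    assert(lanes % per_container == 0);

    size_t element = 0;
    for (size_t c = 0; c < lanes / per_container; ++c) {
        for (size_t e = 0; e < per_container; ++e) {
            /* Mirror position of the current lane inside word c. */
            size_t rev_element = c * per_container + (per_container - 1 - e);
            dst[rev_element] = src[element];
            ++element;
        }
    }
}
```

With lanes = 8, matching the shape of vrev32_s8, the input {0, 1, 2, 3, 4, 5, 6, 7} becomes {3, 2, 1, 0, 7, 6, 5, 4}.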

int8x16_t vrev32q_s8 (int8x16_t vec)Reverse elements in 32-bit words

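Description

Reverse elements in 32-bit words (vector). This instruction reverses the order of 8-bit or 16-bit elements in each word of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction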

REV32 Vd.16B,Vn.16B

Argument Preparation

vec → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vrev32_s16 (int16x4_t vec)Reverse elements in 32-bit words

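Description

Reverse elements in 32-bit words (vector). This instruction reverses the order of 8-bit or 16-bit elements in each word of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction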

REV32 Vd.4H,Vn.4H

Argument Preparation

vec → Vn.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vrev32q_s16 (int16x8_t vec)Reverse elements in 32-bit words

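Description

Reverse elements in 32-bit words (vector). This instruction reverses the order of 8-bit or 16-bit elements in each word of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction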

REV32 Vd.8H,Vn.8H

Argument Preparation

vec → Vn.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vrev32_u8 (uint8x8_t vec)Reverse elements in 32-bit words

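Description

Reverse elements in 32-bit words (vector). This instruction reverses the order of 8-bit or 16-bit elements in each word of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction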

REV32 Vd.8B,Vn.8B

Argument Preparation

vec → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vrev32q_u8 (uint8x16_t vec)Reverse elements in 32-bit words

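Description

Reverse elements in 32-bit words (vector). This instruction reverses the order of 8-bit or 16-bit elements in each word of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction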

REV32 Vd.16B,Vn.16B

Argument Preparation

vec → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4_t vrev32_u16 (uint16x4_t vec)Reverse elements in 32-bit words

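Description

Reverse elements in 32-bit words (vector). This instruction reverses the order of 8-bit or 16-bit elements in each word of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction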

REV32 Vd.4H,Vn.4H

Argument Preparation

vec → Vn.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8_t vrev32q_u16 (uint16x8_t vec)Reverse elements in 32-bit words

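Description

Reverse elements in 32-bit words (vector). This instruction reverses the order of 8-bit or 16-bit elements in each word of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction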

REV32 Vd.8H,Vn.8H

Argument Preparation

vec → Vn.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

poly8x8_t vrev32_p8 (poly8x8_t vec)Reverse elements in 32-bit words

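Description

Reverse elements in 32-bit words (vector). This instruction reverses the order of 8-bit or 16-bit elements in each word of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction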

REV32 Vd.8B,Vn.8B

Argument Preparation

vec → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

poly8x16_t vrev32q_p8 (poly8x16_t vec)Reverse elements in 32-bit words

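Description

Reverse elements in 32-bit words (vector). This instruction reverses the order of 8-bit or 16-bit elements in each word of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction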

REV32 Vd.16B,Vn.16B

Argument Preparation

vec → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

poly16x4_t vrev32_p16 (poly16x4_t vec)Reverse elements in 32-bit words

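Description

Reverse elements in 32-bit words (vector). This instruction reverses the order of 8-bit or 16-bit elements in each word of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction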

REV32 Vd.4H,Vn.4H

Argument Preparation

vec → Vn.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

poly16x8_t vrev32q_p16 (poly16x8_t vec)Reverse elements in 32-bit words

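Description

Reverse elements in 32-bit words (vector). This instruction reverses the order of 8-bit or 16-bit elements in each word of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction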

REV32 Vd.8H,Vn.8H

Argument Preparation

vec → Vn.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

int8x8_t vrev16_s8 (int8x8_t vec)Reverse elements in 16-bit halfwords

Description

Reverse elements in 16-bit halfwords (vector). This instruction reverses the order of 8-bit elements in each halfword of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

REV16 Vd.8B,Vn.8B

Argument Preparation

vec → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64
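
For 8-bit lanes in 16-bit halfwords, the Operation pseudocode above reduces to swapping each adjacent pair of bytes. The portable C sketch below shows that special case; rev16_s8_model is a hypothetical name chosen for this example and is not a NEON intrinsic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of REV16 on 8-bit lanes (not a NEON API).
   Each 16-bit halfword holds two bytes, which swap places.
   src and dst must not overlap. */
void rev16_s8_model(const int8_t *src, int8_t *dst, size_t lanes)
{
    assert(lanes % 2 == 0);
    for (size_t i = 0; i < lanes; i += 2) {
        dst[i]     = src[i + 1];
        dst[i + 1] = src[i];
    }
}
```

With lanes = 8 the input {1, 2, 3, 4, 5, 6, 7, 8} becomes {2, 1, 4, 3, 6, 5, 8, 7}. This is the same byte movement as an endianness swap applied to every 16-bit value in the vector.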

int8x16_t vrev16q_s8 (int8x16_t vec)Reverse elements in 16-bit halfwords

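Description

Reverse elements in 16-bit halfwords (vector). This instruction reverses the order of 8-bit elements in each halfword of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction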

REV16 Vd.16B,Vn.16B

Argument Preparation

vec → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vrev16_u8 (uint8x8_t vec)Reverse elements in 16-bit halfwords

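Description

Reverse elements in 16-bit halfwords (vector). This instruction reverses the order of 8-bit elements in each halfword of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction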

REV16 Vd.8B,Vn.8B

Argument Preparation

vec → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16_t vrev16q_u8 (uint8x16_t vec)Reverse elements in 16-bit halfwords

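Description

Reverse elements in 16-bit halfwords (vector). This instruction reverses the order of 8-bit elements in each halfword of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction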

REV16 Vd.16B,Vn.16B

Argument Preparation

vec → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

poly8x8_t vrev16_p8 (poly8x8_t vec)Reverse elements in 16-bit halfwords

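Description

Reverse elements in 16-bit halfwords (vector). This instruction reverses the order of 8-bit elements in each halfword of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction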

REV16 Vd.8B,Vn.8B

Argument Preparation

vec → Vn.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

poly8x16_t vrev16q_p8 (poly8x16_t vec)Reverse elements in 16-bit halfwords

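Description

Reverse elements in 16-bit halfwords (vector). This instruction reverses the order of 8-bit elements in each halfword of the vector in the source SIMD&FP register, places the results into a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction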

REV16 Vd.16B,Vn.16B

Argument Preparation

vec → Vn.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
integer element = 0;
integer rev_element;
for c = 0 to containers-1
    rev_element = element + elements_per_container - 1;
    for e = 0 to elements_per_container-1
        Elem[result, rev_element, esize] = Elem[operand, element, esize];
        element = element + 1;
        rev_element = rev_element - 1;

V[d] = result;

Supported architectures

v7/A32/A64

int8x8_t vzip1_s8 (int8x8_t a, int8x8_t b)Zip vectors

Description

Zip vectors (primary). This instruction reads the vector elements in the lower half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction

ZIP1 Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64
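
The Operation pseudocode above can be modelled in portable C as a rough sketch. The name zip_s8_model and its parameters are hypothetical and not part of the NEON API. The sketch also assumes that part is 0 for ZIP1 and 1 for ZIP2, which the pseudocode itself does not state, so with part = 0 the loop reads lanes 0 to pairs-1 of each source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of the ZIP pseudocode (not a NEON API).
   Assumption: part = 0 models ZIP1 and part = 1 models ZIP2.
   lanes is the number of elements per vector.
   dst must not overlap a or b. */
void zip_s8_model(const int8_t *a, const int8_t *b, int8_t *dst,
                  size_t lanes, unsigned part)
{
    assert(lanes % 2 == 0 && part <= 1);
    const size_t pairs = lanes / 2;
    const size_t base = part * pairs;

    for (size_t p = 0; p < pairs; ++p) {
        dst[2 * p + 0] = a[base + p]; /* even result lanes come from a */
        dst[2 * p + 1] = b[base + p]; /* odd result lanes come from b */
    }
}
```

With a = {0, 1, ..., 7}, b = {10, 11, ..., 17}, lanes = 8 and part = 0, the sketch produces {0, 10, 1, 11, 2, 12, 3, 13}.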

int8x16_t vzip1q_s8 (int8x16_t a, int8x16_t b)Zip vectors

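Description

Zip vectors (primary). This instruction reads the vector elements in the lower half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction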

ZIP1 Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

int16x4_t vzip1_s16 (int16x4_t a, int16x4_t b)Zip vectors

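Description

Zip vectors (primary). This instruction reads the vector elements in the lower half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction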

ZIP1 Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

int16x8_t vzip1q_s16 (int16x8_t a, int16x8_t b)Zip vectors

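Description

Zip vectors (primary). This instruction reads the vector elements in the lower half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction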

ZIP1 Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

int32x2_t vzip1_s32 (int32x2_t a, int32x2_t b)Zip vectors

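Description

Zip vectors (primary). This instruction reads the vector elements in the lower half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction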

ZIP1 Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

int32x4_t vzip1q_s32 (int32x4_t a, int32x4_t b)Zip vectors

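Description

Zip vectors (primary). This instruction reads the vector elements in the lower half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction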

ZIP1 Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

int64x2_t vzip1q_s64 (int64x2_t a, int64x2_t b)Zip vectors

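Description

Zip vectors (primary). This instruction reads the vector elements in the lower half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction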

ZIP1 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64
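
With 64-bit elements a 128-bit vector has only two lanes, so pairs is 1 and the loop in the Operation pseudocode above runs once. The portable C sketch below spells out that case. The type i64x2_model and the function zip1q_s64_model are hypothetical stand-ins for illustration, not NEON types or intrinsics, and the sketch assumes base is 0 for ZIP1.

```c
#include <stdint.h>

/* Hypothetical two-lane stand-in for a 128-bit vector of 64-bit elements. */
typedef struct { int64_t lane[2]; } i64x2_model;

/* Hypothetical model of ZIP1 on 64-bit lanes (not a NEON API).
   Assumption: base = 0 for ZIP1, so lane 0 of each source is read. */
i64x2_model zip1q_s64_model(i64x2_model a, i64x2_model b)
{
    /* pairs = 1: result lane 0 comes from a, result lane 1 from b. */
    i64x2_model r = { { a.lane[0], b.lane[0] } };
    return r;
}
```

In other words, for this element size the operation simply combines the low 64-bit lane of each source into one vector.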

uint8x8_t vzip1_u8 (uint8x8_t a, uint8x8_t b)Zip vectors

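Description

Zip vectors (primary). This instruction reads the vector elements in the lower half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction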

ZIP1 Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

uint8x16_t vzip1q_u8 (uint8x16_t a, uint8x16_t b)Zip vectors

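Description

Zip vectors (primary). This instruction reads the vector elements in the lower half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction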

ZIP1 Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

uint16x4_t vzip1_u16 (uint16x4_t a, uint16x4_t b)Zip vectors

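Description

Zip vectors (primary). This instruction reads the vector elements in the lower half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction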

ZIP1 Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

uint16x8_t vzip1q_u16 (uint16x8_t a, uint16x8_t b)Zip vectors

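Description

Zip vectors (primary). This instruction reads the vector elements in the lower half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction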

ZIP1 Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

uint32x2_t vzip1_u32 (uint32x2_t a, uint32x2_t b)Zip vectors

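Description

Zip vectors (primary). This instruction reads the vector elements in the lower half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction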

ZIP1 Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

uint32x4_t vzip1q_u32 (uint32x4_t a, uint32x4_t b)Zip vectors

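Description

Zip vectors (primary). This instruction reads the vector elements in the lower half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction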

ZIP1 Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

uint64x2_t vzip1q_u64 (uint64x2_t a, uint64x2_t b)Zip vectors

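Description

Zip vectors (primary). This instruction reads the vector elements in the lower half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction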

ZIP1 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

poly64x2_t vzip1q_p64 (poly64x2_t a, poly64x2_t b)Zip vectors

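Description

Zip vectors (primary). This instruction reads the vector elements in the lower half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction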

ZIP1 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

float32x2_t vzip1_f32 (float32x2_t a, float32x2_t b)Zip vectors

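Description

Zip vectors (primary). This instruction reads the vector elements in the lower half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction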

ZIP1 Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

float32x4_t vzip1q_f32 (float32x4_t a, float32x4_t b)Zip vectors

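Description

Zip vectors (primary). This instruction reads the vector elements in the lower half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction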

ZIP1 Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

float64x2_t vzip1q_f64 (float64x2_t a, float64x2_t b)Zip vectors

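Description

Zip vectors (primary). This instruction reads the vector elements in the lower half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction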

ZIP1 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

poly8x8_t vzip1_p8 (poly8x8_t a, poly8x8_t b)Zip vectors

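Description

Zip vectors (primary). This instruction reads the vector elements in the lower half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction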

ZIP1 Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

poly8x16_t vzip1q_p8 (poly8x16_t a, poly8x16_t b)Zip vectors

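Description

Zip vectors (primary). This instruction reads the vector elements in the lower half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction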

ZIP1 Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

poly16x4_t vzip1_p16 (poly16x4_t a, poly16x4_t b)Zip vectors

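Description

Zip vectors (primary). This instruction reads the vector elements in the lower half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction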

ZIP1 Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

poly16x8_t vzip1q_p16 (poly16x8_t a, poly16x8_t b)Zip vectors

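Description

Zip vectors (primary). This instruction reads the vector elements in the lower half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction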

ZIP1 Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

int8x8_t vzip2_s8 (int8x8_t a, int8x8_t b)Zip vectors

Description

Zip vectors (secondary). This instruction reads the vector elements in the upper half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction

ZIP2 Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64
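
A portable C sketch of the Operation pseudocode above for the ZIP2 case follows. The name zip2_s8_model is hypothetical and not part of the NEON API, and the sketch assumes part is 1 for ZIP2 (the pseudocode does not state the value), so the loop reads lanes pairs to lanes-1 of each source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of ZIP2 on 8-bit lanes (not a NEON API).
   dst must not overlap a or b. */
void zip2_s8_model(const int8_t *a, const int8_t *b, int8_t *dst, size_t lanes)
{
    assert(lanes % 2 == 0);
    const size_t pairs = lanes / 2;
    const size_t base = 1 * pairs; /* assumption: part = 1 for ZIP2 */

    for (size_t p = 0; p < pairs; ++p) {
        dst[2 * p + 0] = a[base + p]; /* even result lanes come from a */
        dst[2 * p + 1] = b[base + p]; /* odd result lanes come from b */
    }
}
```

With a = {0, 1, ..., 7}, b = {10, 11, ..., 17} and lanes = 8, the sketch produces {4, 14, 5, 15, 6, 16, 7, 17}. Placed after the ZIP1 result for the same sources, this completes the full interleave of a and b.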

int8x16_t vzip2q_s8 (int8x16_t a, int8x16_t b)Zip vectors

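Description

Zip vectors (secondary). This instruction reads the vector elements in the upper half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction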

ZIP2 Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

int16x4_t vzip2_s16 (int16x4_t a, int16x4_t b)Zip vectors

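Description

Zip vectors (secondary). This instruction reads the vector elements in the upper half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction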

ZIP2 Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

int16x8_t vzip2q_s16 (int16x8_t a, int16x8_t b)Zip vectors

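Description

Zip vectors (secondary). This instruction reads the vector elements in the upper half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction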

ZIP2 Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

int32x2_t vzip2_s32 (int32x2_t a, int32x2_t b)Zip vectors

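Description

Zip vectors (secondary). This instruction reads the vector elements in the upper half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction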

ZIP2 Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

int32x4_t vzip2q_s32 (int32x4_t a, int32x4_t b)Zip vectors

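Description

Zip vectors (secondary). This instruction reads the vector elements in the upper half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction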

ZIP2 Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

int64x2_t vzip2q_s64 (int64x2_t a, int64x2_t b)Zip vectors

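Description

Zip vectors (secondary). This instruction reads the vector elements in the upper half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction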

ZIP2 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

uint8x8_t vzip2_u8 (uint8x8_t a, uint8x8_t b)Zip vectors

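Description

Zip vectors (secondary). This instruction reads the vector elements in the upper half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction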

ZIP2 Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

uint8x16_t vzip2q_u8 (uint8x16_t a, uint8x16_t b)Zip vectors

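Description

Zip vectors (secondary). This instruction reads the vector elements in the upper half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction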

ZIP2 Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

uint16x4_t vzip2_u16 (uint16x4_t a, uint16x4_t b)Zip vectors

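Description

Zip vectors (secondary). This instruction reads the vector elements in the upper half of the two source SIMD&FP registers, interleaves them as pairs (one element from each source register), places the pairs into a vector, and writes the vector to the destination SIMD&FP register. Each pair holds the element from the first source register in the lower position and the corresponding element from the second source register in the upper position, and pairs are stored in ascending element order.

A64 Instruction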

ZIP2 Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

uint16x8_t vzip2q_u16 (uint16x8_t a, uint16x8_t b)Zip vectors

Description

A64 Instruction

ZIP2 Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

uint32x2_t vzip2_u32 (uint32x2_t a, uint32x2_t b)Zip vectors

Description

A64 Instruction

ZIP2 Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

uint32x4_t vzip2q_u32 (uint32x4_t a, uint32x4_t b)Zip vectors

Description

A64 Instruction

ZIP2 Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

uint64x2_t vzip2q_u64 (uint64x2_t a, uint64x2_t b)Zip vectors

Description

A64 Instruction

ZIP2 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

poly64x2_t vzip2q_p64 (poly64x2_t a, poly64x2_t b)Zip vectors

Description

A64 Instruction

ZIP2 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

float32x2_t vzip2_f32 (float32x2_t a, float32x2_t b)Zip vectors

Description

A64 Instruction

ZIP2 Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

float32x4_t vzip2q_f32 (float32x4_t a, float32x4_t b)Zip vectors

Description

A64 Instruction

ZIP2 Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

float64x2_t vzip2q_f64 (float64x2_t a, float64x2_t b)Zip vectors

Description

A64 Instruction

ZIP2 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

poly8x8_t vzip2_p8 (poly8x8_t a, poly8x8_t b)Zip vectors

Description

A64 Instruction

ZIP2 Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

poly8x16_t vzip2q_p8 (poly8x16_t a, poly8x16_t b)Zip vectors

Description

A64 Instruction

ZIP2 Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

poly16x4_t vzip2_p16 (poly16x4_t a, poly16x4_t b)Zip vectors

Description

A64 Instruction

ZIP2 Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

poly16x8_t vzip2q_p16 (poly16x8_t a, poly16x8_t b)Zip vectors

Description

A64 Instruction

ZIP2 Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

A64

int8x8_t vuzp1_s8 (int8x8_t a, int8x8_t b)Unzip vectors

Description

Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

UZP1 Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64
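
To make the indexing in the Operation pseudocode concrete, here is a minimal scalar sketch in plain C for 8-bit elements. The concatenation operandh:operandl is modelled by reading from the first source while the index is below the lane count and from the second source afterwards. The description says UZP1 reads even-numbered elements, so part is taken to be 0 here. The name uzp_model_u8 and its parameters are hypothetical and are not part of the intrinsics API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of the UZP pseudocode for 8-bit elements.
 * operandl is the first source (Vn), operandh the second (Vm).
 * part = 0 picks even-numbered elements (UZP1), part = 1 odd-numbered ones.
 * result must not overlap operandl or operandh. */
static void uzp_model_u8(const uint8_t *operandl, const uint8_t *operandh,
                         uint8_t *result, size_t elements, unsigned part)
{
    assert(part <= 1);

    for (size_t e = 0; e < elements; e++) {
        /* Index into the double-width value operandh:operandl. */
        size_t idx = 2 * e + part;
        result[e] = (idx < elements) ? operandl[idx]
                                     : operandh[idx - elements];
    }
}
```

With a = {0, 1, 2, 3, 4, 5, 6, 7} and b = {10, 11, 12, 13, 14, 15, 16, 17}, uzp_model_u8(a, b, r, 8, 0) leaves r = {0, 2, 4, 6, 10, 12, 14, 16}: the even-numbered elements of a fill the lower half and those of b fill the upper half.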

int8x16_t vuzp1q_s8 (int8x16_t a, int8x16_t b)Unzip vectors

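Description

Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction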

UZP1 Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

int16x4_t vuzp1_s16 (int16x4_t a, int16x4_t b)Unzip vectors

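Description

Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction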

UZP1 Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

int16x8_t vuzp1q_s16 (int16x8_t a, int16x8_t b)Unzip vectors

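Description

Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction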

UZP1 Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

int32x2_t vuzp1_s32 (int32x2_t a, int32x2_t b)Unzip vectors

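Description

Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction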

UZP1 Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

int32x4_t vuzp1q_s32 (int32x4_t a, int32x4_t b)Unzip vectors

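Description

Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction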

UZP1 Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

int64x2_t vuzp1q_s64 (int64x2_t a, int64x2_t b)Unzip vectors

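Description

Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction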

UZP1 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

uint8x8_t vuzp1_u8 (uint8x8_t a, uint8x8_t b)Unzip vectors

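Description

Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction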

UZP1 Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

uint8x16_t vuzp1q_u8 (uint8x16_t a, uint8x16_t b)Unzip vectors

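Description

Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction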

UZP1 Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

uint16x4_t vuzp1_u16 (uint16x4_t a, uint16x4_t b)Unzip vectors

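Description

Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction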

UZP1 Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

uint16x8_t vuzp1q_u16 (uint16x8_t a, uint16x8_t b)Unzip vectors

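Description

Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction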

UZP1 Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

uint32x2_t vuzp1_u32 (uint32x2_t a, uint32x2_t b)Unzip vectors

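Description

Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction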

UZP1 Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

uint32x4_t vuzp1q_u32 (uint32x4_t a, uint32x4_t b)Unzip vectors

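Description

Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction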

UZP1 Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

uint64x2_t vuzp1q_u64 (uint64x2_t a, uint64x2_t b)Unzip vectors

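Description

Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction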

UZP1 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

poly64x2_t vuzp1q_p64 (poly64x2_t a, poly64x2_t b)Unzip vectors

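Description

Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction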

UZP1 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

float32x2_t vuzp1_f32 (float32x2_t a, float32x2_t b)Unzip vectors

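Description

Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction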

UZP1 Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

float32x4_t vuzp1q_f32 (float32x4_t a, float32x4_t b)Unzip vectors

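Description

Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction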

UZP1 Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

float64x2_t vuzp1q_f64 (float64x2_t a, float64x2_t b)Unzip vectors

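Description

Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction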

UZP1 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

poly8x8_t vuzp1_p8 (poly8x8_t a, poly8x8_t b)Unzip vectors

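Description

Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction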

UZP1 Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

poly8x16_t vuzp1q_p8 (poly8x16_t a, poly8x16_t b)Unzip vectors

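Description

Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction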

UZP1 Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

poly16x4_t vuzp1_p16 (poly16x4_t a, poly16x4_t b)Unzip vectors

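Description

Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction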

UZP1 Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

poly16x8_t vuzp1q_p16 (poly16x8_t a, poly16x8_t b)Unzip vectors

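Description

Unzip vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction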

UZP1 Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

int8x8_t vuzp2_s8 (int8x8_t a, int8x8_t b)Unzip vectors

Description

Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

UZP2 Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64
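
A typical reason to reach for the secondary unzip is splitting interleaved data: if two vectors hold alternating (x, y) values, the odd-numbered elements are the y values. The plain C sketch below models that for four 16-bit lanes, following the Operation pseudocode with part taken to be 1, as the "odd-numbered" wording in the description suggests. The names uzp2_model_s16x4 and UZP2_LANES are hypothetical and are not part of the intrinsics API.

```c
#include <stddef.h>
#include <stdint.h>

enum { UZP2_LANES = 4 };   /* lane count of an int16x4_t */

/* Scalar model of UZP2 on two 4-lane vectors of 16-bit elements.
 * a is the first source (low half of the concatenation), b the second.
 * result must not overlap a or b. */
static void uzp2_model_s16x4(const int16_t a[UZP2_LANES],
                             const int16_t b[UZP2_LANES],
                             int16_t result[UZP2_LANES])
{
    for (size_t e = 0; e < UZP2_LANES; e++) {
        size_t idx = 2 * e + 1;   /* odd-numbered element of b:a */
        result[e] = (idx < UZP2_LANES) ? a[idx] : b[idx - UZP2_LANES];
    }
}
```

For a = {1, -2, 3, -4} and b = {5, -6, 7, -8}, the model leaves result = {-2, -4, -6, -8}, that is, every second value starting from element 1 of a, followed by the same selection from b.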

int8x16_t vuzp2q_s8 (int8x16_t a, int8x16_t b)Unzip vectors

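Description

Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction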

UZP2 Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

int16x4_t vuzp2_s16 (int16x4_t a, int16x4_t b)Unzip vectors

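Description

Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction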

UZP2 Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

int16x8_t vuzp2q_s16 (int16x8_t a, int16x8_t b)Unzip vectors

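Description

Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction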

UZP2 Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

int32x2_t vuzp2_s32 (int32x2_t a, int32x2_t b)Unzip vectors

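Description

Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction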

UZP2 Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

int32x4_t vuzp2q_s32 (int32x4_t a, int32x4_t b)Unzip vectors

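Description

Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction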

UZP2 Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

int64x2_t vuzp2q_s64 (int64x2_t a, int64x2_t b)Unzip vectors

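Description

Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction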

UZP2 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

uint8x8_t vuzp2_u8 (uint8x8_t a, uint8x8_t b)Unzip vectors

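Description

Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction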

UZP2 Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

uint8x16_t vuzp2q_u8 (uint8x16_t a, uint8x16_t b)Unzip vectors

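Description

Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction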

UZP2 Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

uint16x4_t vuzp2_u16 (uint16x4_t a, uint16x4_t b)Unzip vectors

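Description

Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction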

UZP2 Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

uint16x8_t vuzp2q_u16 (uint16x8_t a, uint16x8_t b)Unzip vectors

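Description

Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction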

UZP2 Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

uint32x2_t vuzp2_u32 (uint32x2_t a, uint32x2_t b)Unzip vectors

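Description

Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction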

UZP2 Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

uint32x4_t vuzp2q_u32 (uint32x4_t a, uint32x4_t b)Unzip vectors

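Description

Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction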

UZP2 Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

uint64x2_t vuzp2q_u64 (uint64x2_t a, uint64x2_t b)Unzip vectors

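Description

Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction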

UZP2 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

poly64x2_t vuzp2q_p64 (poly64x2_t a, poly64x2_t b)Unzip vectors

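Description

Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction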

UZP2 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

float32x2_t vuzp2_f32 (float32x2_t a, float32x2_t b)Unzip vectors

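Description

Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction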

UZP2 Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

float32x4_t vuzp2q_f32 (float32x4_t a, float32x4_t b)Unzip vectors

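Description

Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction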

UZP2 Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

float64x2_t vuzp2q_f64 (float64x2_t a, float64x2_t b)Unzip vectors

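Description

Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction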

UZP2 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

poly8x8_t vuzp2_p8 (poly8x8_t a, poly8x8_t b)Unzip vectors

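Description

Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction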

UZP2 Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

poly8x16_t vuzp2q_p8 (poly8x16_t a, poly8x16_t b)Unzip vectors

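Description

Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction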

UZP2 Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

poly16x4_t vuzp2_p16 (poly16x4_t a, poly16x4_t b)Unzip vectors

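Description

Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction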

UZP2 Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

poly16x8_t vuzp2q_p16 (poly16x8_t a, poly16x8_t b)Unzip vectors

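Description

Unzip vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places the result from the first source register into consecutive elements in the lower half of a vector, and the result from the second source register into consecutive elements in the upper half of a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction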

UZP2 Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

A64

int8x8_t vtrn1_s8 (int8x8_t a, int8x8_t b)Transpose vectors

Description

Transpose vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places each result into consecutive elements of a vector, and writes the vector to the destination SIMD&FP register. Vector elements from the first source register are placed into even-numbered elements of the destination vector, starting at zero, while vector elements from the second source register are placed into odd-numbered elements of the destination vector.

A64 Instruction

TRN1 Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64
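
The transpose is easiest to see on a small example. As an illustration only, the plain C sketch below follows the Operation pseudocode for 8-bit elements. The description says TRN1 reads even-numbered elements, so part is taken to be 0 for it. pairs is assumed to be half the lane count, a value that comes from instruction decode and is not shown in this entry. The name trn_model_u8 and its parameters are hypothetical and are not part of the intrinsics API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of the TRN pseudocode for 8-bit elements.
 * part = 0 reads even-numbered source elements (TRN1),
 * part = 1 reads odd-numbered ones.
 * elements is the lane count of one operand and must be even.
 * result must not overlap operand1 or operand2. */
static void trn_model_u8(const uint8_t *operand1, const uint8_t *operand2,
                         uint8_t *result, size_t elements, unsigned part)
{
    assert(elements % 2 == 0 && part <= 1);

    size_t pairs = elements / 2;

    for (size_t p = 0; p < pairs; p++) {
        result[2 * p + 0] = operand1[2 * p + part];  /* even result lane */
        result[2 * p + 1] = operand2[2 * p + part];  /* odd result lane  */
    }
}
```

With a = {0, 1, 2, 3, 4, 5, 6, 7} and b = {10, 11, 12, 13, 14, 15, 16, 17}, trn_model_u8(a, b, r, 8, 0) leaves r = {0, 10, 2, 12, 4, 14, 6, 16}. Treating each adjacent pair of lanes in a and b as the rows of a 2x2 matrix, every pair in r is the first column of that matrix, which is where the name "transpose" comes from.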

int8x16_t vtrn1q_s8 (int8x16_t a, int8x16_t b)Transpose vectors

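Description

Transpose vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places each result into consecutive elements of a vector, and writes the vector to the destination SIMD&FP register. Vector elements from the first source register are placed into even-numbered elements of the destination vector, starting at zero, while vector elements from the second source register are placed into odd-numbered elements of the destination vector.

A64 Instruction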

TRN1 Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

int16x4_t vtrn1_s16 (int16x4_t a, int16x4_t b)Transpose vectors

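Description

Transpose vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places each result into consecutive elements of a vector, and writes the vector to the destination SIMD&FP register. Vector elements from the first source register are placed into even-numbered elements of the destination vector, starting at zero, while vector elements from the second source register are placed into odd-numbered elements of the destination vector.

A64 Instruction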

TRN1 Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

int16x8_t vtrn1q_s16 (int16x8_t a, int16x8_t b)Transpose vectors

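Description

Transpose vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places each result into consecutive elements of a vector, and writes the vector to the destination SIMD&FP register. Vector elements from the first source register are placed into even-numbered elements of the destination vector, starting at zero, while vector elements from the second source register are placed into odd-numbered elements of the destination vector.

A64 Instruction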

TRN1 Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

int32x2_t vtrn1_s32 (int32x2_t a, int32x2_t b)Transpose vectors

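Description

Transpose vectors (primary). This instruction reads corresponding even-numbered vector elements from the two source SIMD&FP registers, starting at zero, places each result into consecutive elements of a vector, and writes the vector to the destination SIMD&FP register. Vector elements from the first source register are placed into even-numbered elements of the destination vector, starting at zero, while vector elements from the second source register are placed into odd-numbered elements of the destination vector.

A64 Instruction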

TRN1 Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

int32x4_t vtrn1q_s32 (int32x4_t a, int32x4_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

int64x2_t vtrn1q_s64 (int64x2_t a, int64x2_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

uint8x8_t vtrn1_u8 (uint8x8_t a, uint8x8_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

uint8x16_t vtrn1q_u8 (uint8x16_t a, uint8x16_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

uint16x4_t vtrn1_u16 (uint16x4_t a, uint16x4_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

uint16x8_t vtrn1q_u16 (uint16x8_t a, uint16x8_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

uint32x2_t vtrn1_u32 (uint32x2_t a, uint32x2_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

uint32x4_t vtrn1q_u32 (uint32x4_t a, uint32x4_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

uint64x2_t vtrn1q_u64 (uint64x2_t a, uint64x2_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

poly64x2_t vtrn1q_p64 (poly64x2_t a, poly64x2_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

float32x2_t vtrn1_f32 (float32x2_t a, float32x2_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

float32x4_t vtrn1q_f32 (float32x4_t a, float32x4_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

float64x2_t vtrn1q_f64 (float64x2_t a, float64x2_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

poly8x8_t vtrn1_p8 (poly8x8_t a, poly8x8_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

poly8x16_t vtrn1q_p8 (poly8x16_t a, poly8x16_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

poly16x4_t vtrn1_p16 (poly16x4_t a, poly16x4_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

poly16x8_t vtrn1q_p16 (poly16x8_t a, poly16x8_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

int8x8_t vtrn2_s8 (int8x8_t a, int8x8_t b)Transpose vectors

Description

Transpose vectors (secondary). This instruction reads corresponding odd-numbered vector elements from the two source SIMD&FP registers, places each result into consecutive elements of a vector, and writes the vector to the destination SIMD&FP register. Vector elements from the first source register are placed into even-numbered elements of the destination vector, starting at zero, while vector elements from the second source register are placed into odd-numbered elements of the destination vector.
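
To make the lane movement concrete, the following is a small scalar sketch of TRN2 alongside TRN1 on four 16-bit lanes (the 4H arrangement). The helper names are hypothetical and exist only for illustration; they are plain C models, not the `vtrn1_s16` and `vtrn2_s16` intrinsics.

```c
#include <stdint.h>

/* Model of TRN1 on four 16-bit lanes: reads even-numbered source lanes. */
static void trn1_s16x4_model(int16_t dst[4], const int16_t a[4], const int16_t b[4])
{
    for (int p = 0; p < 2; ++p) {
        dst[2 * p + 0] = a[2 * p];
        dst[2 * p + 1] = b[2 * p];
    }
}

/* Model of TRN2 on four 16-bit lanes: reads odd-numbered source lanes. */
static void trn2_s16x4_model(int16_t dst[4], const int16_t a[4], const int16_t b[4])
{
    for (int p = 0; p < 2; ++p) {
        dst[2 * p + 0] = a[2 * p + 1]; /* from the first source, into an even lane */
        dst[2 * p + 1] = b[2 * p + 1]; /* from the second source, into an odd lane */
    }
}
```

For `a = {a0, a1, a2, a3}` and `b = {b0, b1, b2, b3}`, the TRN1 model yields `{a0, b0, a2, b2}` and the TRN2 model yields `{a1, b1, a3, b3}`. Reading `a` and `b` as two rows of 2x2 blocks, the two results together are the rows of the transposed blocks, which is why the two instructions are usually issued as a pair.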

A64 Instruction

TRN2 Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

int8x16_t vtrn2q_s8 (int8x16_t a, int8x16_t b)Transpose vectors

Description

A64 Instruction

TRN2 Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

int16x4_t vtrn2_s16 (int16x4_t a, int16x4_t b)Transpose vectors

Description

A64 Instruction

TRN2 Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

int16x8_t vtrn2q_s16 (int16x8_t a, int16x8_t b)Transpose vectors

Description

A64 Instruction

TRN2 Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

int32x2_t vtrn2_s32 (int32x2_t a, int32x2_t b)Transpose vectors

Description

A64 Instruction

TRN2 Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

int32x4_t vtrn2q_s32 (int32x4_t a, int32x4_t b)Transpose vectors

Description

A64 Instruction

TRN2 Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

int64x2_t vtrn2q_s64 (int64x2_t a, int64x2_t b)Transpose vectors

Description

A64 Instruction

TRN2 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

uint8x8_t vtrn2_u8 (uint8x8_t a, uint8x8_t b)Transpose vectors

Description

A64 Instruction

TRN2 Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

uint8x16_t vtrn2q_u8 (uint8x16_t a, uint8x16_t b)Transpose vectors

Description

A64 Instruction

TRN2 Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

uint16x4_t vtrn2_u16 (uint16x4_t a, uint16x4_t b)Transpose vectors

Description

A64 Instruction

TRN2 Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

uint16x8_t vtrn2q_u16 (uint16x8_t a, uint16x8_t b)Transpose vectors

Description

A64 Instruction

TRN2 Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

uint32x2_t vtrn2_u32 (uint32x2_t a, uint32x2_t b)Transpose vectors

Description

A64 Instruction

TRN2 Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

uint32x4_t vtrn2q_u32 (uint32x4_t a, uint32x4_t b)Transpose vectors

Description

A64 Instruction

TRN2 Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

uint64x2_t vtrn2q_u64 (uint64x2_t a, uint64x2_t b)Transpose vectors

Description

A64 Instruction

TRN2 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

poly64x2_t vtrn2q_p64 (poly64x2_t a, poly64x2_t b)Transpose vectors

Description

A64 Instruction

TRN2 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

float32x2_t vtrn2_f32 (float32x2_t a, float32x2_t b)Transpose vectors

Description

A64 Instruction

TRN2 Vd.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

float32x4_t vtrn2q_f32 (float32x4_t a, float32x4_t b)Transpose vectors

Description

A64 Instruction

TRN2 Vd.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

float64x2_t vtrn2q_f64 (float64x2_t a, float64x2_t b)Transpose vectors

Description

A64 Instruction

TRN2 Vd.2D,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

poly8x8_t vtrn2_p8 (poly8x8_t a, poly8x8_t b)Transpose vectors

Description

A64 Instruction

TRN2 Vd.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

poly8x16_t vtrn2q_p8 (poly8x16_t a, poly8x16_t b)Transpose vectors

Description

A64 Instruction

TRN2 Vd.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

poly16x4_t vtrn2_p16 (poly16x4_t a, poly16x4_t b)Transpose vectors

Description

A64 Instruction

TRN2 Vd.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

poly16x8_t vtrn2q_p16 (poly16x8_t a, poly16x8_t b)Transpose vectors

Description

A64 Instruction

TRN2 Vd.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

A64

int8x8_t vtbl1_s8 (int8x8_t a, int8x8_t b)Table vector lookup

Description

Table vector Lookup. This instruction reads each value from the vector elements in the index source SIMD&FP register, uses each result as an index to perform a lookup in a table of bytes that is described by one to four source table SIMD&FP registers, places the lookup result in a vector, and writes the vector to the destination SIMD&FP register. If an index is out of range for the table, the result for that lookup is 0. If more than one source register is used to describe the table, the first source register describes the lowest bytes of the table.
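
As an illustration of the out-of-range rule, here is a minimal scalar sketch of the single-table 64-bit form (`vtbl1_u8`). The function name is hypothetical. It assumes the mapping shown in this entry, where the 8-byte table is zero-extended to 16 bytes before the lookup, so indices 8 to 15 read the zero padding and indices of 16 or more are out of range; both cases give 0.

```c
#include <stdint.h>

/* Scalar model of vtbl1_u8: 8-byte table, 8 byte-sized indices.
 * Any index >= 8 produces 0 in the corresponding result lane. */
static void vtbl1_u8_model(uint8_t dst[8], const uint8_t table[8],
                           const uint8_t idx[8])
{
    for (int i = 0; i < 8; ++i) {
        dst[i] = (idx[i] < 8) ? table[idx[i]] : 0;
    }
}
```

A model like this is handy as a test oracle when porting lookup-based code, because it states the zeroing behaviour in one line.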

A64 Instruction

TBL Vd.8B,{Vn.16B},Vm.8B

Argument Preparation

Vn → Zeros(64):a 

b → Vm

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vtbl1_u8 (uint8x8_t a, uint8x8_t b)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B},Vm.8B

Argument Preparation

Vn → Zeros(64):a 

b → Vm

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

v7/A32/A64

poly8x8_t vtbl1_p8 (poly8x8_t a, uint8x8_t b)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B},Vm.8B

Argument Preparation

Vn → Zeros(64):a 

b → Vm

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

v7/A32/A64

int8x8_t vtbx1_s8 (int8x8_t a, int8x8_t b, int8x8_t c)Bitwise insert if false

Description

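Bitwise Insert if False. This instruction inserts each bit from the first source SIMD&FP register into the destination SIMD&FP register if the corresponding bit of the second source SIMD&FP register is 0, otherwise leaves the bit in the destination register unchanged. This text describes BIF, the final instruction of the sequence below. Taken as a whole, the sequence performs a table vector lookup extension on an 8-byte table: a destination element is replaced by the table entry when its index is less than 8 and is left unchanged otherwise.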

A64 Instruction

MOVI Vtmp.8B,#8
CMHS Vtmp.8B,Vm.8B,Vtmp.8B
TBL Vtmp1.8B,{Vn.16B},Vm.8B
BIF Vd.8B,Vtmp1.8B,Vtmp.8B

Argument Preparation

a → Vd 

Vn → Zeros(64):b 

c → Vm
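
The four-instruction sequence and the argument mapping above can be traced one lane at a time in scalar C. This is a sketch under stated assumptions: the function name is hypothetical, and each step is a per-byte model of the named instruction rather than the instruction itself.

```c
#include <stdint.h>

/* Per-lane model of the MOVI / CMHS / TBL / BIF sequence used for vtbx1.
 * a: destination input (Vd), b: 8-byte table, c: indices (Vm). */
static void vtbx1_u8_sequence_model(uint8_t dst[8], const uint8_t a[8],
                                    const uint8_t b[8], const uint8_t c[8])
{
    for (int i = 0; i < 8; ++i) {
        /* MOVI #8 then CMHS: all-ones mask where the index is >= 8. */
        uint8_t mask = (c[i] >= 8) ? 0xFF : 0x00;
        /* TBL on Zeros(64):b. Indices 8..15 read the zero padding and
         * indices >= 16 are out of range, so both give 0. */
        uint8_t looked_up = (c[i] < 8) ? b[c[i]] : 0;
        /* BIF: take bits from looked_up where the mask bit is 0,
         * keep the bits of a where the mask bit is 1. */
        dst[i] = (uint8_t)(a[i] ^ ((a[i] ^ looked_up) & (uint8_t)~mask));
    }
}
```

The net effect per lane is `c[i] < 8 ? b[c[i]] : a[i]`, which is the table lookup extension behaviour on an 8-byte table.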

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[d];
operand3 = NOT(V[m]);

V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

uint8x8_t vtbx1_u8 (uint8x8_t a, uint8x8_t b, uint8x8_t c)Bitwise insert if false

Description

A64 Instruction

MOVI Vtmp.8B,#8
CMHS Vtmp.8B,Vm.8B,Vtmp.8B
TBL Vtmp1.8B,{Vn.16B},Vm.8B
BIF Vd.8B,Vtmp1.8B,Vtmp.8B

Argument Preparation

a → Vd 

Vn → Zeros(64):b 

c → Vm

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[d];
operand3 = NOT(V[m]);

V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

poly8x8_t vtbx1_p8 (poly8x8_t a, poly8x8_t b, uint8x8_t c)Bitwise insert if false

Description

A64 Instruction

MOVI Vtmp.8B,#8
CMHS Vtmp.8B,Vm.8B,Vtmp.8B
TBL Vtmp1.8B,{Vn.16B},Vm.8B
BIF Vd.8B,Vtmp1.8B,Vtmp.8B

Argument Preparation

a → Vd 

Vn → Zeros(64):b 

c → Vm

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[d];
operand3 = NOT(V[m]);

V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

int8x8_t vtbl2_s8 (int8x8x2_t a, int8x8_t b)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B},Vm.8B

Argument Preparation

Vn → a.val[1]:a.val[0] 

b → Vm

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vtbl2_u8 (uint8x8x2_t a, uint8x8_t b)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B},Vm.8B

Argument Preparation

Vn → a.val[1]:a.val[0] 

b → Vm

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

v7/A32/A64

poly8x8_t vtbl2_p8 (poly8x8x2_t a, uint8x8_t b)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B},Vm.8B

Argument Preparation

Vn → a.val[1]:a.val[0] 

b → Vm

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

v7/A32/A64

int8x8_t vtbl3_s8 (int8x8x3_t a, int8x8_t b)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B,Vn+1.16B},Vm.8B

Argument Preparation

Vn → a.val[1]:a.val[0] 

Vn+1 → Zeros(64):a.val[2] 

b → Vm
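
The register packing above can be sketched in scalar C as follows. The function name and the three separate table pointers are hypothetical conveniences; the sketch assumes the concatenation notation places the right-hand operand in the low bytes, so `a.val[0]` holds table bytes 0 to 7, `a.val[1]` bytes 8 to 15, `a.val[2]` bytes 16 to 23, and bytes 24 to 31 are zero padding.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of vtbl3_u8 following the two-register A64 mapping:
 * Vn = a.val[1]:a.val[0], Vn+1 = Zeros(64):a.val[2]. */
static void vtbl3_u8_model(uint8_t dst[8], const uint8_t val0[8],
                           const uint8_t val1[8], const uint8_t val2[8],
                           const uint8_t idx[8])
{
    uint8_t table[32] = {0};       /* bytes 24..31 stay zero (the padding) */
    memcpy(table + 0, val0, 8);
    memcpy(table + 8, val1, 8);
    memcpy(table + 16, val2, 8);
    for (int i = 0; i < 8; ++i) {
        /* Indices 24..31 read the padding; indices >= 32 are out of range. */
        dst[i] = (idx[i] < 32) ? table[idx[i]] : 0;
    }
}
```

Because the padding is zero, any index of 24 or more yields 0, which matches the result expected from a 24-byte table.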

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vtbl3_u8 (uint8x8x3_t a, uint8x8_t b)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B,Vn+1.16B},Vm.8B

Argument Preparation

Vn → a.val[1]:a.val[0] 

Vn+1 → Zeros(64):a.val[2] 

b → Vm

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

v7/A32/A64

poly8x8_t vtbl3_p8 (poly8x8x3_t a, uint8x8_t b)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B,Vn+1.16B},Vm.8B

Argument Preparation

Vn → a.val[1]:a.val[0] 

Vn+1 → Zeros(64):a.val[2] 

b → Vm

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

v7/A32/A64

int8x8_t vtbl4_s8 (int8x8x4_t a, int8x8_t b)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B,Vn+1.16B},Vm.8B

Argument Preparation

Vn → a.val[1]:a.val[0] 

Vn+1 → a.val[3]:a.val[2] 

b → Vm

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vtbl4_u8 (uint8x8x4_t a, uint8x8_t b)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B,Vn+1.16B},Vm.8B

Argument Preparation

Vn → a.val[1]:a.val[0] 

Vn+1 → a.val[3]:a.val[2] 

b → Vm

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

v7/A32/A64

poly8x8_t vtbl4_p8 (poly8x8x4_t a, uint8x8_t b)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B,Vn+1.16B},Vm.8B

Argument Preparation

Vn → a.val[1]:a.val[0] 

Vn+1 → a.val[3]:a.val[2] 

b → Vm

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

v7/A32/A64

int8x8_t vtbx2_s8 (int8x8_t a, int8x8x2_t b, int8x8_t c)Table vector lookup extension

Description

Table vector lookup extension. This instruction reads each value from the vector elements in the index source SIMD&FP register, uses each result as an index to perform a lookup in a table of bytes that is described by one to four source table SIMD&FP registers, places the lookup result in a vector, and writes the vector to the destination SIMD&FP register. If an index is out of range for the table, the existing value in the vector element of the destination register is left unchanged. If more than one source register is used to describe the table, the first source register describes the lowest bytes of the table.
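
As an illustration of the out-of-range rule described above, here is a small scalar sketch of vtbx2_s8 in portable C. It is a model under stated assumptions, not the intrinsic: model_s8x8x2 and model_vtbx2_s8 are hypothetical names standing in for int8x8x2_t and vtbx2_s8. Each index byte is treated as unsigned, so a negative value in c counts as out of range.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for int8x8x2_t: two 8-byte table vectors. */
typedef struct { int8_t val[2][8]; } model_s8x8x2;

/* Scalar model (an assumption, not the real intrinsic) of vtbx2_s8. */
static void model_vtbx2_s8(const int8_t a[8], const model_s8x8x2 *b,
                           const int8_t c[8], int8_t result[8])
{
    int8_t table[16];

    assert(a != NULL && b != NULL && c != NULL && result != NULL);

    /* Vn = b.val[1]:b.val[0], so b.val[0] holds the lowest table bytes. */
    memcpy(table, b->val[0], 8);
    memcpy(table + 8, b->val[1], 8);

    for (int i = 0; i < 8; ++i) {
        uint8_t index = (uint8_t)c[i];   /* indices are read as unsigned */
        /* TBX: out-of-range indices leave the destination element as is. */
        result[i] = (index < 16) ? table[index] : a[i];
    }
}
```

With a 16-byte table, indices 0 to 15 select table bytes, and every other index returns the corresponding element of a.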

A64 Instruction

TBX Vd.8B,{Vn.16B},Vm.8B

Argument Preparation

a → Vd 

Vn → b.val[1]:b.val[0] 

c → Vm

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vtbx2_u8 (uint8x8_t a, uint8x8x2_t b, uint8x8_t c)Table vector lookup extension

Description

A64 Instruction

TBX Vd.8B,{Vn.16B},Vm.8B

Argument Preparation

a → Vd 

Vn → b.val[1]:b.val[0] 

c → Vm

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

v7/A32/A64

poly8x8_t vtbx2_p8 (poly8x8_t a, poly8x8x2_t b, uint8x8_t c)Table vector lookup extension

Description

A64 Instruction

TBX Vd.8B,{Vn.16B},Vm.8B

Argument Preparation

a → Vd 

Vn → b.val[1]:b.val[0] 

c → Vm

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

v7/A32/A64

int8x8_t vtbx3_s8 (int8x8_t a, int8x8x3_t b, int8x8_t c)Bitwise insert if false

Description

A64 Instruction

MOVI Vtmp.8B,#24
CMHS Vtmp.8B,Vm.8B,Vtmp.8B
TBL Vtmp1.8B,{Vn.16B,Vn+1.16B},Vm.8B
BIF Vd.8B,Vtmp1.8B,Vtmp.8B

Argument Preparation

a → Vd 

Vn → b.val[1]:b.val[0] 

Vn+1 → Zeros(64):b.val[2] 

c → Vm

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[d];
operand3 = NOT(V[m]);

V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64
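
The Operation shown for this entry covers only the final BIF step. The sketch below models the whole four-instruction sequence in scalar C and compares it with a direct 24-byte table lookup extension. It is an illustrative model, not the intrinsic: the uint8_t element type and the names model_u8x8x3, model_vtbx3_sequence, model_vtbx3_direct and model_vtbx3_paths_agree are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an 8x8x3 table argument (three 8-byte vectors). */
typedef struct { uint8_t val[3][8]; } model_u8x8x3;

/* Step-by-step scalar model of the MOVI/CMHS/TBL/BIF sequence. */
static void model_vtbx3_sequence(const uint8_t a[8], const model_u8x8x3 *b,
                                 const uint8_t c[8], uint8_t result[8])
{
    uint8_t table[32] = {0};            /* upper half of Vn+1 is Zeros(64) */

    assert(a != NULL && b != NULL && c != NULL && result != NULL);
    memcpy(table, b->val[0], 8);
    memcpy(table + 8, b->val[1], 8);
    memcpy(table + 16, b->val[2], 8);

    for (int i = 0; i < 8; ++i) {
        uint8_t limit = 24;                                   /* MOVI #24 */
        uint8_t mask = (c[i] >= limit) ? 0xFF : 0x00;         /* CMHS     */
        uint8_t looked_up = (c[i] < 32) ? table[c[i]] : 0;    /* TBL      */
        uint8_t d = a[i];
        /* BIF: take looked_up where the mask bit is 0, keep d otherwise. */
        result[i] = (uint8_t)(d ^ ((d ^ looked_up) & (uint8_t)~mask));
    }
}

/* Direct model: a TBX over a 24-byte table. */
static void model_vtbx3_direct(const uint8_t a[8], const model_u8x8x3 *b,
                               const uint8_t c[8], uint8_t result[8])
{
    for (int i = 0; i < 8; ++i)
        result[i] = (c[i] < 24) ? b->val[c[i] / 8][c[i] % 8] : a[i];
}

/* Returns 1 if both models agree for every possible index byte. */
static int model_vtbx3_paths_agree(void)
{
    model_u8x8x3 b;
    uint8_t a[8], c[8], seq[8], direct[8];

    for (int r = 0; r < 3; ++r)
        for (int i = 0; i < 8; ++i)
            b.val[r][i] = (uint8_t)(0x40 + 8 * r + i);
    for (int i = 0; i < 8; ++i)
        a[i] = (uint8_t)(0xE0 + i);

    for (int idx = 0; idx < 256; ++idx) {
        for (int i = 0; i < 8; ++i)
            c[i] = (uint8_t)idx;
        model_vtbx3_sequence(a, &b, c, seq);
        model_vtbx3_direct(a, &b, c, direct);
        if (memcmp(seq, direct, 8) != 0)
            return 0;
    }
    return 1;
}
```

The compare against 24 is what makes indices 24 to 31 behave as out of range, even though the zero-padded table register would otherwise return 0 for them.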

uint8x8_t vtbx3_u8 (uint8x8_t a, uint8x8x3_t b, uint8x8_t c)Bitwise insert if false

Description

A64 Instruction

MOVI Vtmp.8B,#24
CMHS Vtmp.8B,Vm.8B,Vtmp.8B
TBL Vtmp1.8B,{Vn.16B,Vn+1.16B},Vm.8B
BIF Vd.8B,Vtmp1.8B,Vtmp.8B

Argument Preparation

a → Vd 

Vn → b.val[1]:b.val[0] 

Vn+1 → Zeros(64):b.val[2] 

c → Vm

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[d];
operand3 = NOT(V[m]);

V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

poly8x8_t vtbx3_p8 (poly8x8_t a, poly8x8x3_t b, uint8x8_t c)Bitwise insert if false

Description

A64 Instruction

MOVI Vtmp.8B,#24
CMHS Vtmp.8B,Vm.8B,Vtmp.8B
TBL Vtmp1.8B,{Vn.16B,Vn+1.16B},Vm.8B
BIF Vd.8B,Vtmp1.8B,Vtmp.8B

Argument Preparation

a → Vd 

Vn → b.val[1]:b.val[0] 

Vn+1 → Zeros(64):b.val[2] 

c → Vm

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1;
bits(datasize) operand3;
bits(datasize) operand4 = V[n];

operand1 = V[d];
operand3 = NOT(V[m]);

V[d] = operand1 EOR ((operand1 EOR operand4) AND operand3);

Supported architectures

v7/A32/A64

int8x8_t vtbx4_s8 (int8x8_t a, int8x8x4_t b, int8x8_t c)Table vector lookup extension

Description

A64 Instruction

TBX Vd.8B,{Vn.16B,Vn+1.16B},Vm.8B

Argument Preparation

a → Vd 

Vn → b.val[1]:b.val[0] 

Vn+1 → b.val[3]:b.val[2] 

c → Vm

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8_t vtbx4_u8 (uint8x8_t a, uint8x8x4_t b, uint8x8_t c)Table vector lookup extension

Description

A64 Instruction

TBX Vd.8B,{Vn.16B,Vn+1.16B},Vm.8B

Argument Preparation

a → Vd 

Vn → b.val[1]:b.val[0] 

Vn+1 → b.val[3]:b.val[2] 

c → Vm

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

v7/A32/A64

poly8x8_t vtbx4_p8 (poly8x8_t a, poly8x8x4_t b, uint8x8_t c)Table vector lookup extension

Description

A64 Instruction

TBX Vd.8B,{Vn.16B,Vn+1.16B},Vm.8B

Argument Preparation

a → Vd 

Vn → b.val[1]:b.val[0] 

Vn+1 → b.val[3]:b.val[2] 

c → Vm

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

v7/A32/A64

int8x8_t vqtbl1_s8 (int8x16_t t, uint8x8_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B},Vm.8B

Argument Preparation

t → Vn.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

int8x16_t vqtbl1q_s8 (int8x16_t t, uint8x16_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.16B,{Vn.16B},Vm.16B

Argument Preparation

t → Vn.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

uint8x8_t vqtbl1_u8 (uint8x16_t t, uint8x8_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B},Vm.8B

Argument Preparation

t → Vn.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

uint8x16_t vqtbl1q_u8 (uint8x16_t t, uint8x16_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.16B,{Vn.16B},Vm.16B

Argument Preparation

t → Vn.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

poly8x8_t vqtbl1_p8 (poly8x16_t t, uint8x8_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B},Vm.8B

Argument Preparation

t → Vn.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

poly8x16_t vqtbl1q_p8 (poly8x16_t t, uint8x16_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.16B,{Vn.16B},Vm.16B

Argument Preparation

t → Vn.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

int8x8_t vqtbx1_s8 (int8x8_t a, int8x16_t t, uint8x8_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.8B,{Vn.16B},Vm.8B

Argument Preparation

a → Vd.8B 

t → Vn.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

int8x16_t vqtbx1q_s8 (int8x16_t a, int8x16_t t, uint8x16_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.16B,{Vn.16B},Vm.16B

Argument Preparation

a → Vd.16B 

t → Vn.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

uint8x8_t vqtbx1_u8 (uint8x8_t a, uint8x16_t t, uint8x8_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.8B,{Vn.16B},Vm.8B

Argument Preparation

a → Vd.8B 

t → Vn.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

uint8x16_t vqtbx1q_u8 (uint8x16_t a, uint8x16_t t, uint8x16_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.16B,{Vn.16B},Vm.16B

Argument Preparation

a → Vd.16B 

t → Vn.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

poly8x8_t vqtbx1_p8 (poly8x8_t a, poly8x16_t t, uint8x8_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.8B,{Vn.16B},Vm.8B

Argument Preparation

a → Vd.8B 

t → Vn.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

poly8x16_t vqtbx1q_p8 (poly8x16_t a, poly8x16_t t, uint8x16_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.16B,{Vn.16B},Vm.16B

Argument Preparation

a → Vd.16B 

t → Vn.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

int8x8_t vqtbl2_s8 (int8x16x2_t t, uint8x8_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B - Vn+1.16B},Vm.8B

Argument Preparation

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

int8x16_t vqtbl2q_s8 (int8x16x2_t t, uint8x16_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.16B,{Vn.16B - Vn+1.16B},Vm.16B

Argument Preparation

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

uint8x8_t vqtbl2_u8 (uint8x16x2_t t, uint8x8_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B - Vn+1.16B},Vm.8B

Argument Preparation

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

uint8x16_t vqtbl2q_u8 (uint8x16x2_t t, uint8x16_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.16B,{Vn.16B - Vn+1.16B},Vm.16B

Argument Preparation

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

poly8x8_t vqtbl2_p8 (poly8x16x2_t t, uint8x8_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B - Vn+1.16B},Vm.8B

Argument Preparation

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

poly8x16_t vqtbl2q_p8 (poly8x16x2_t t, uint8x16_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.16B,{Vn.16B - Vn+1.16B},Vm.16B

Argument Preparation

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

int8x8_t vqtbl3_s8 (int8x16x3_t t, uint8x8_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B - Vn+2.16B},Vm.8B

Argument Preparation

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

int8x16_t vqtbl3q_s8 (int8x16x3_t t, uint8x16_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.16B,{Vn.16B - Vn+2.16B},Vm.16B

Argument Preparation

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64
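
The Operation pseudocode shared by the TBL and TBX entries can be sketched as one scalar C function, parameterised the same way (regs, elements, is_tbl). This is a hedged translation for illustration, not Arm's reference implementation. The name model_tbl_tbx and its flat byte-array table layout are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Scalar translation (an assumption-laden sketch) of the TBL/TBX Operation.
 *   vd       : destination, read first when is_tbl is false (TBX)
 *   table    : 16 * regs bytes, first register at the lowest offsets
 *   regs     : number of table registers, 1..4
 *   indices  : one index byte per element
 *   elements : 8 for the .8B form, 16 for the .16B form
 */
static void model_tbl_tbx(uint8_t *vd, const uint8_t *table, int regs,
                          const uint8_t *indices, int elements, bool is_tbl)
{
    uint8_t result[16];

    assert(vd != NULL && table != NULL && indices != NULL);
    assert(regs >= 1 && regs <= 4);
    assert(elements == 8 || elements == 16);

    /* result = if is_tbl then Zeros() else V[d]; */
    for (int i = 0; i < elements; ++i)
        result[i] = is_tbl ? 0 : vd[i];

    /* Only in-range indices overwrite the initial result. */
    for (int i = 0; i < elements; ++i) {
        unsigned index = indices[i];
        if (index < 16u * (unsigned)regs)
            result[i] = table[index];
    }

    /* V[d] = result; */
    memcpy(vd, result, (size_t)elements);
}
```

With regs set to 3 the table holds 48 bytes, so index 47 is the last valid entry and index 48 is the first out-of-range one.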

uint8x8_t vqtbl3_u8 (uint8x16x3_t t, uint8x8_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B - Vn+2.16B},Vm.8B

Argument Preparation

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

uint8x16_t vqtbl3q_u8 (uint8x16x3_t t, uint8x16_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.16B,{Vn.16B - Vn+2.16B},Vm.16B

Argument Preparation

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

poly8x8_t vqtbl3_p8 (poly8x16x3_t t, uint8x8_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B - Vn+2.16B},Vm.8B

Argument Preparation

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

poly8x16_t vqtbl3q_p8 (poly8x16x3_t t, uint8x16_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.16B,{Vn.16B - Vn+2.16B},Vm.16B

Argument Preparation

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

int8x8_t vqtbl4_s8 (int8x16x4_t t, uint8x8_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B - Vn+3.16B},Vm.8B

Argument Preparation

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

t.val[3] → Vn+3.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

int8x16_t vqtbl4q_s8 (int8x16x4_t t, uint8x16_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.16B,{Vn.16B - Vn+3.16B},Vm.16B

Argument Preparation

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

t.val[3] → Vn+3.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

uint8x8_t vqtbl4_u8 (uint8x16x4_t t, uint8x8_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B - Vn+3.16B},Vm.8B

Argument Preparation

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

t.val[3] → Vn+3.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

uint8x16_t vqtbl4q_u8 (uint8x16x4_t t, uint8x16_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.16B,{Vn.16B - Vn+3.16B},Vm.16B

Argument Preparation

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

t.val[3] → Vn+3.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

poly8x8_t vqtbl4_p8 (poly8x16x4_t t, uint8x8_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.8B,{Vn.16B - Vn+3.16B},Vm.8B

Argument Preparation

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

t.val[3] → Vn+3.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

poly8x16_t vqtbl4q_p8 (poly8x16x4_t t, uint8x16_t idx)Table vector lookup

Description

A64 Instruction

TBL Vd.16B,{Vn.16B - Vn+3.16B},Vm.16B

Argument Preparation

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

t.val[3] → Vn+3.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

int8x8_t vqtbx2_s8 (int8x8_t a, int8x16x2_t t, uint8x8_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.8B,{Vn.16B - Vn+1.16B},Vm.8B

Argument Preparation

a → Vd.8B 

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

int8x16_t vqtbx2q_s8 (int8x16_t a, int8x16x2_t t, uint8x16_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.16B,{Vn.16B - Vn+1.16B},Vm.16B

Argument Preparation

a → Vd.16B 

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

uint8x8_t vqtbx2_u8 (uint8x8_t a, uint8x16x2_t t, uint8x8_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.8B,{Vn.16B - Vn+1.16B},Vm.8B

Argument Preparation

a → Vd.8B 

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

uint8x16_t vqtbx2q_u8 (uint8x16_t a, uint8x16x2_t t, uint8x16_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.16B,{Vn.16B - Vn+1.16B},Vm.16B

Argument Preparation

a → Vd.16B 

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

poly8x8_t vqtbx2_p8 (poly8x8_t a, poly8x16x2_t t, uint8x8_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.8B,{Vn.16B - Vn+1.16B},Vm.8B

Argument Preparation

a → Vd.8B 

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

poly8x16_t vqtbx2q_p8 (poly8x16_t a, poly8x16x2_t t, uint8x16_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.16B,{Vn.16B - Vn+1.16B},Vm.16B

Argument Preparation

a → Vd.16B 

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

int8x8_t vqtbx3_s8 (int8x8_t a, int8x16x3_t t, uint8x8_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.8B,{Vn.16B - Vn+2.16B},Vm.8B

Argument Preparation

a → Vd.8B 

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

int8x16_t vqtbx3q_s8 (int8x16_t a, int8x16x3_t t, uint8x16_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.16B,{Vn.16B - Vn+2.16B},Vm.16B

Argument Preparation

a → Vd.16B 

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

uint8x8_t vqtbx3_u8 (uint8x8_t a, uint8x16x3_t t, uint8x8_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.8B,{Vn.16B - Vn+2.16B},Vm.8B

Argument Preparation

a → Vd.8B 

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

uint8x16_t vqtbx3q_u8 (uint8x16_t a, uint8x16x3_t t, uint8x16_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.16B,{Vn.16B - Vn+2.16B},Vm.16B

Argument Preparation

a → Vd.16B 

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

poly8x8_t vqtbx3_p8 (poly8x8_t a, poly8x16x3_t t, uint8x8_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.8B,{Vn.16B - Vn+2.16B},Vm.8B

Argument Preparation

a → Vd.8B 

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

poly8x16_t vqtbx3q_p8 (poly8x16_t a, poly8x16x3_t t, uint8x16_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.16B,{Vn.16B - Vn+2.16B},Vm.16B

Argument Preparation

a → Vd.16B 

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

int8x8_t vqtbx4_s8 (int8x8_t a, int8x16x4_t t, uint8x8_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.8B,{Vn.16B - Vn+3.16B},Vm.8B

Argument Preparation

a → Vd.8B 

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

t.val[3] → Vn+3.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

int8x16_t vqtbx4q_s8 (int8x16_t a, int8x16x4_t t, uint8x16_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.16B,{Vn.16B - Vn+3.16B},Vm.16B

Argument Preparation

a → Vd.16B 

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

t.val[3] → Vn+3.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

uint8x8_t vqtbx4_u8 (uint8x8_t a, uint8x16x4_t t, uint8x8_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.8B,{Vn.16B - Vn+3.16B},Vm.8B

Argument Preparation

a → Vd.8B 

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

t.val[3] → Vn+3.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

uint8x16_t vqtbx4q_u8 (uint8x16_t a, uint8x16x4_t t, uint8x16_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.16B,{Vn.16B - Vn+3.16B},Vm.16B

Argument Preparation

a → Vd.16B 

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

t.val[3] → Vn+3.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

poly8x8_t vqtbx4_p8 (poly8x8_t a, poly8x16x4_t t, uint8x8_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.8B,{Vn.16B - Vn+3.16B},Vm.8B

Argument Preparation

a → Vd.8B 

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

t.val[3] → Vn+3.16B 

idx → Vm.8B

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64

poly8x16_t vqtbx4q_p8 (poly8x16_t a, poly8x16x4_t t, uint8x16_t idx)Table vector lookup extension

Description

A64 Instruction

TBX Vd.16B,{Vn.16B - Vn+3.16B},Vm.16B

Argument Preparation

a → Vd.16B 

t.val[0] → Vn.16B 

t.val[1] → Vn+1.16B 

t.val[2] → Vn+2.16B 

t.val[3] → Vn+3.16B 

idx → Vm.16B

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) indices = V[m];
bits(128*regs) table = Zeros();
bits(datasize) result;
integer index;

// Create table from registers
for i = 0 to regs-1
    table<128*i+127:128*i> = V[n];
    n = (n + 1) MOD 32;

result = if is_tbl then Zeros() else V[d];
for i = 0 to elements-1
    index = UInt(Elem[indices, i, 8]);
    if index < 16 * regs then
        Elem[result, i, 8] = Elem[table, index, 8];

V[d] = result;

Supported architectures

A64
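
The table lookup entries above share one operation and differ only in the number of table registers (regs) and the number of result lanes (elements). A minimal scalar sketch of that shared pseudocode, assuming a flat byte array for the table and a hypothetical helper name table_lookup_model, might look like this. The is_tbl flag mirrors the pseudocode line "result = if is_tbl then Zeros() else V[d]".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of the TBL/TBX pseudocode. Hypothetical helper, not an
 * arm_neon.h function.
 *   dst      : on entry the old destination, on exit the result
 *   elements : 8 for the 64-bit forms, 16 for the q forms
 *   table    : regs * 16 bytes, lowest-numbered register first
 *   regs     : number of table registers, 1 to 4
 *   is_tbl   : true  -> out-of-range lanes become 0 (TBL)
 *              false -> out-of-range lanes keep dst (TBX) */
static void table_lookup_model(uint8_t *dst, size_t elements,
                               const uint8_t *table, unsigned regs,
                               const uint8_t *indices, bool is_tbl)
{
    assert(regs >= 1 && regs <= 4);
    assert(elements == 8 || elements == 16);

    for (size_t i = 0; i < elements; ++i) {
        unsigned index = indices[i];
        if (index < 16u * regs) {
            dst[i] = table[index];
        } else if (is_tbl) {
            dst[i] = 0;
        }
    }
}
```

With regs set to 4 the valid index range is 0 to 63; with regs set to 3 it is 0 to 47, so the same index vector can select a table byte in one form and fall through in the other.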

uint8_t vget_lane_u8 (uint8x8_t v, const int lane)Unsigned move vector element to general-purpose register

Description

Unsigned Move vector element to general-purpose register. This instruction reads the unsigned integer from the source SIMD&FP register, zero-extends it to form a 32-bit or 64-bit value, and writes the result to the destination general-purpose register.

A64 Instruction

UMOV Rd,Vn.B[lane]

Argument Preparation

v → Vn.8B 

0 <= lane <= 7

Results

Rd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = ZeroExtend(Elem[operand, index, esize], datasize);

Supported architectures

v7/A32/A64
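
To make the Elem[operand, index, esize] and ZeroExtend steps above concrete, here is a small scalar sketch in portable C. It assumes the 128-bit register is stored as a little-endian byte array; the helper name umov_model is hypothetical and is not part of arm_neon.h.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of X[d] = ZeroExtend(Elem[operand, index, esize], datasize).
 * Hypothetical helper for illustration only.
 *   reg   : 128-bit register contents, least significant byte first
 *   esize : element size in bits (8, 16, 32 or 64)
 *   index : lane number; this model accepts the full 128-bit range,
 *           the 64-bit intrinsic forms only use the lower half */
static uint64_t umov_model(const uint8_t reg[16], unsigned esize,
                           unsigned index)
{
    assert(esize == 8 || esize == 16 || esize == 32 || esize == 64);
    assert(index < 128u / esize);

    unsigned bytes = esize / 8;
    uint64_t value = 0;  /* upper bits stay zero: this is the zero extension */
    for (unsigned b = 0; b < bytes; ++b) {
        value |= (uint64_t)reg[index * bytes + b] << (8 * b);
    }
    return value;
}
```

In the real intrinsics the lane argument must be an integer constant expression within the documented range; the run-time assert here only stands in for that compile-time check.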

uint16_t vget_lane_u16 (uint16x4_t v, const int lane)Unsigned move vector element to general-purpose register

Description

A64 Instruction

UMOV Rd,Vn.H[lane]

Argument Preparation

v → Vn.4H 

0 <= lane <= 3

Results

Rd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = ZeroExtend(Elem[operand, index, esize], datasize);

Supported architectures

v7/A32/A64

uint32_t vget_lane_u32 (uint32x2_t v, const int lane)Unsigned move vector element to general-purpose register

Description

A64 Instruction

UMOV Rd,Vn.S[lane]

Argument Preparation

v → Vn.2S 

0 <= lane <= 1

Results

Rd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = ZeroExtend(Elem[operand, index, esize], datasize);

Supported architectures

v7/A32/A64

uint64_t vget_lane_u64 (uint64x1_t v, const int lane)Unsigned move vector element to general-purpose register

Description

A64 Instruction

UMOV Rd,Vn.D[lane]

Argument Preparation

v → Vn.1D 

0 <= lane <= 0

Results

Rd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = ZeroExtend(Elem[operand, index, esize], datasize);

Supported architectures

v7/A32/A64

poly64_t vget_lane_p64 (poly64x1_t v, const int lane)Unsigned move vector element to general-purpose register

Description

A64 Instruction

UMOV Rd,Vn.D[lane]

Argument Preparation

v → Vn.1D 

0 <= lane <= 0

Results

Rd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = ZeroExtend(Elem[operand, index, esize], datasize);

Supported architectures

A32/A64

int8_t vget_lane_s8 (int8x8_t v, const int lane)Signed move vector element to general-purpose register

Description

Signed Move vector element to general-purpose register. This instruction reads the signed integer from the source SIMD&FP register, sign-extends it to form a 32-bit or 64-bit value, and writes the result to the destination general-purpose register.

A64 Instruction

SMOV Rd,Vn.B[lane]

Argument Preparation

v → Vn.8B 

0 <= lane <= 7

Results

Rd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = SignExtend(Elem[operand, index, esize], datasize);

Supported architectures

v7/A32/A64
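
The only difference between the signed and unsigned moves is the extension applied to the extracted lane. The sketch below, in portable C with hypothetical helper names, shows both extensions on the same 8-bit pattern; it models the pseudocode and does not call the intrinsics.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend an 8-bit lane to 32 bits (models the ZeroExtend step). */
static uint32_t zero_extend8(uint8_t bits)
{
    return (uint32_t)bits;
}

/* Sign-extend an 8-bit lane to 32 bits (models the SignExtend step).
 * Done arithmetically to avoid relying on implementation-defined casts. */
static int32_t sign_extend8(uint8_t bits)
{
    int32_t value = bits;
    if (value & 0x80) {
        value -= 0x100;  /* top bit set: the lane holds a negative value */
    }
    return value;
}

/* Model of a signed byte-lane read; v holds the raw lane bytes. */
static int32_t get_lane_s8_model(const uint8_t v[8], int lane)
{
    assert(lane >= 0 && lane <= 7);
    return sign_extend8(v[lane]);
}
```

For the bit pattern 0x80 the signed read yields -128 while the unsigned read yields 128, which is why the signed intrinsics map to SMOV and the unsigned ones to UMOV.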

int16_t vget_lane_s16 (int16x4_t v, const int lane)Signed move vector element to general-purpose register

Description

A64 Instruction

SMOV Rd,Vn.H[lane]

Argument Preparation

v → Vn.4H 

0 <= lane <= 3

Results

Rd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = SignExtend(Elem[operand, index, esize], datasize);

Supported architectures

v7/A32/A64

int32_t vget_lane_s32 (int32x2_t v, const int lane)Signed move vector element to general-purpose register

Description

A64 Instruction

SMOV Rd,Vn.S[lane]

Argument Preparation

v → Vn.2S 

0 <= lane <= 1

Results

Rd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = SignExtend(Elem[operand, index, esize], datasize);

Supported architectures

v7/A32/A64

int64_t vget_lane_s64 (int64x1_t v, const int lane)Unsigned move vector element to general-purpose register

Description

A64 Instruction

UMOV Rd,Vn.D[lane]

Argument Preparation

v → Vn.1D 

0 <= lane <= 0

Results

Rd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = ZeroExtend(Elem[operand, index, esize], datasize);

Supported architectures

v7/A32/A64

poly8_t vget_lane_p8 (poly8x8_t v, const int lane)Unsigned move vector element to general-purpose register

Description

A64 Instruction

UMOV Rd,Vn.B[lane]

Argument Preparation

v → Vn.8B 

0 <= lane <= 7

Results

Rd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = ZeroExtend(Elem[operand, index, esize], datasize);

Supported architectures

v7/A32/A64

poly16_t vget_lane_p16 (poly16x4_t v, const int lane)Unsigned move vector element to general-purpose register

Description

A64 Instruction

UMOV Rd,Vn.H[lane]

Argument Preparation

v → Vn.4H 

0 <= lane <= 3

Results

Rd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = ZeroExtend(Elem[operand, index, esize], datasize);

Supported architectures

v7/A32/A64

float32_t vget_lane_f32 (float32x2_t v, const int lane)Duplicate vector element to vector or scalar

Description

A64 Instruction

DUP Sd,Vn.S[lane]

Argument Preparation

v → Vn.2S 

0 <= lane <= 1

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(datasize) result;
bits(esize) element;

element = Elem[operand, index, esize];
for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

float64_t vget_lane_f64 (float64x1_t v, const int lane)Duplicate vector element to vector or scalar

Description

A64 Instruction

DUP Dd,Vn.D[lane]

Argument Preparation

v → Vn.1D 

0 <= lane <= 0

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(datasize) result;
bits(esize) element;

element = Elem[operand, index, esize];
for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

uint8_t vgetq_lane_u8 (uint8x16_t v, const int lane)Unsigned move vector element to general-purpose register

Description

A64 Instruction

UMOV Rd,Vn.B[lane]

Argument Preparation

v → Vn.16B 

0 <= lane <= 15

Results

Rd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = ZeroExtend(Elem[operand, index, esize], datasize);

Supported architectures

v7/A32/A64

uint16_t vgetq_lane_u16 (uint16x8_t v, const int lane)Unsigned move vector element to general-purpose register

Description

A64 Instruction

UMOV Rd,Vn.H[lane]

Argument Preparation

v → Vn.8H 

0 <= lane <= 7

Results

Rd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = ZeroExtend(Elem[operand, index, esize], datasize);

Supported architectures

v7/A32/A64

uint32_t vgetq_lane_u32 (uint32x4_t v, const int lane)Unsigned move vector element to general-purpose register

Description

A64 Instruction

UMOV Rd,Vn.S[lane]

Argument Preparation

v → Vn.4S 

0 <= lane <= 3

Results

Rd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = ZeroExtend(Elem[operand, index, esize], datasize);

Supported architectures

v7/A32/A64

uint64_t vgetq_lane_u64 (uint64x2_t v, const int lane)Unsigned move vector element to general-purpose register

Description

A64 Instruction

UMOV Rd,Vn.D[lane]

Argument Preparation

v → Vn.2D 

0 <= lane <= 1

Results

Rd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = ZeroExtend(Elem[operand, index, esize], datasize);

Supported architectures

v7/A32/A64

poly64_t vgetq_lane_p64 (poly64x2_t v, const int lane)Unsigned move vector element to general-purpose register

Description

A64 Instruction

UMOV Rd,Vn.D[lane]

Argument Preparation

v → Vn.2D 

0 <= lane <= 1

Results

Rd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = ZeroExtend(Elem[operand, index, esize], datasize);

Supported architectures

A32/A64

int8_t vgetq_lane_s8 (int8x16_t v, const int lane)Signed move vector element to general-purpose register

Description

A64 Instruction

SMOV Rd,Vn.B[lane]

Argument Preparation

v → Vn.16B 

0 <= lane <= 15

Results

Rd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = SignExtend(Elem[operand, index, esize], datasize);

Supported architectures

v7/A32/A64

int16_t vgetq_lane_s16 (int16x8_t v, const int lane)Signed move vector element to general-purpose register

Description

A64 Instruction

SMOV Rd,Vn.H[lane]

Argument Preparation

v → Vn.8H 

0 <= lane <= 7

Results

Rd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = SignExtend(Elem[operand, index, esize], datasize);

Supported architectures

v7/A32/A64

int32_t vgetq_lane_s32 (int32x4_t v, const int lane)Signed move vector element to general-purpose register

Description

A64 Instruction

SMOV Rd,Vn.S[lane]

Argument Preparation

v → Vn.4S 

0 <= lane <= 3

Results

Rd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = SignExtend(Elem[operand, index, esize], datasize);

Supported architectures

v7/A32/A64

int64_t vgetq_lane_s64 (int64x2_t v, const int lane)Unsigned move vector element to general-purpose register

Description

A64 Instruction

UMOV Rd,Vn.D[lane]

Argument Preparation

v → Vn.2D 

0 <= lane <= 1

Results

Rd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = ZeroExtend(Elem[operand, index, esize], datasize);

Supported architectures

v7/A32/A64

poly8_t vgetq_lane_p8 (poly8x16_t v, const int lane)Unsigned move vector element to general-purpose register

Description

A64 Instruction

UMOV Rd,Vn.B[lane]

Argument Preparation

v → Vn.16B 

0 <= lane <= 15

Results

Rd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = ZeroExtend(Elem[operand, index, esize], datasize);

Supported architectures

v7/A32/A64

poly16_t vgetq_lane_p16 (poly16x8_t v, const int lane)Unsigned move vector element to general-purpose register

Description

A64 Instruction

UMOV Rd,Vn.H[lane]

Argument Preparation

v → Vn.8H 

0 <= lane <= 7

Results

Rd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];

X[d] = ZeroExtend(Elem[operand, index, esize], datasize);

Supported architectures

v7/A32/A64

float16_t vget_lane_f16 (float16x4_t v, const int lane)Duplicate vector element to vector or scalar

Description

A64 Instruction

DUP Hd,Vn.H[lane]

Argument Preparation

v → Vn.4H 

0 <= lane <= 3

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(datasize) result;
bits(esize) element;

element = Elem[operand, index, esize];
for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

float16_t vgetq_lane_f16 (float16x8_t v, const int lane)Duplicate vector element to vector or scalar

Description

A64 Instruction

DUP Hd,Vn.H[lane]

Argument Preparation

v → Vn.8H 

0 <= lane <= 7

Results

Hd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(datasize) result;
bits(esize) element;

element = Elem[operand, index, esize];
for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

float32_t vgetq_lane_f32 (float32x4_t v, const int lane)Duplicate vector element to vector or scalar

Description

A64 Instruction

DUP Sd,Vn.S[lane]

Argument Preparation

v → Vn.4S 

0 <= lane <= 3

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(datasize) result;
bits(esize) element;

element = Elem[operand, index, esize];
for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

float64_t vgetq_lane_f64 (float64x2_t v, const int lane)Duplicate vector element to vector or scalar

Description

A64 Instruction

DUP Dd,Vn.D[lane]

Argument Preparation

v → Vn.2D 

0 <= lane <= 1

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(datasize) result;
bits(esize) element;

element = Elem[operand, index, esize];
for e = 0 to elements-1
    Elem[result, e, esize] = element;
V[d] = result;

Supported architectures

A64

uint8x8_t vset_lane_u8 (uint8_t a, uint8x8_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.B[lane],Rn

Argument Preparation

a → Rn 

v → Vd.8B 

0 <= lane <= 7

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64
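
The insert operation above reads the whole destination, overwrites one element, and writes the vector back. A minimal scalar sketch in portable C follows; the u8x8_model type and the set_lane_u8_model helper are hypothetical names introduced for this example and are not part of arm_neon.h.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for an 8-lane byte vector, passed and returned by value
 * in the same way the intrinsic's vector argument and result are. */
typedef struct {
    uint8_t lane[8];
} u8x8_model;

/* Scalar model of:
 *   result = V[d];
 *   Elem[result, index, esize] = element;
 *   V[d] = result;
 * Argument order follows the intrinsic: value, vector, lane. */
static u8x8_model set_lane_u8_model(uint8_t a, u8x8_model v, int lane)
{
    assert(lane >= 0 && lane <= 7);
    v.lane[lane] = a;  /* only the selected lane changes */
    return v;          /* v is a local copy, so the caller's vector is untouched */
}
```

Because the vector is taken by value, the caller's original vector is left as it was and the updated vector is the return value, so a call is normally written as an assignment of the result.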

uint16x4_t vset_lane_u16 (uint16_t a, uint16x4_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.H[lane],Rn

Argument Preparation

a → Rn 

v → Vd.4H 

0 <= lane <= 3

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x2_t vset_lane_u32 (uint32_t a, uint32x2_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.S[lane],Rn

Argument Preparation

a → Rn 

v → Vd.2S 

0 <= lane <= 1

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint64x1_t vset_lane_u64 (uint64_t a, uint64x1_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[lane],Rn

Argument Preparation

a → Rn 

v → Vd.1D 

0 <= lane <= 0

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly64x1_t vset_lane_p64 (poly64_t a, poly64x1_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[lane],Rn

Argument Preparation

a → Rn 

v → Vd.1D 

0 <= lane <= 0

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A32/A64

int8x8_t vset_lane_s8 (int8_t a, int8x8_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.B[lane],Rn

Argument Preparation

a → Rn 

v → Vd.8B 

0 <= lane <= 7

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int16x4_t vset_lane_s16 (int16_t a, int16x4_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.H[lane],Rn

Argument Preparation

a → Rn 

v → Vd.4H 

0 <= lane <= 3

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int32x2_t vset_lane_s32 (int32_t a, int32x2_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.S[lane],Rn

Argument Preparation

a → Rn 

v → Vd.2S 

0 <= lane <= 1

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int64x1_t vset_lane_s64 (int64_t a, int64x1_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[lane],Rn

Argument Preparation

a → Rn 

v → Vd.1D 

0 <= lane <= 0

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly8x8_t vset_lane_p8 (poly8_t a, poly8x8_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.B[lane],Rn

Argument Preparation

a → Rn 

v → Vd.8B 

0 <= lane <= 7

Results

Vd.8B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly16x4_t vset_lane_p16 (poly16_t a, poly16x4_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.H[lane],Rn

Argument Preparation

a → Rn 

v → Vd.4H 

0 <= lane <= 3

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

float16x4_t vset_lane_f16 (float16_t a, float16x4_t v, const int lane)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.H[lane],Vn.H[0]

Argument Preparation

a → VnH 

v → Vd.4H 

0 <= lane <= 3

Results

Vd.4H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

v7/A32/A64

float16x8_t vsetq_lane_f16 (float16_t a, float16x8_t v, const int lane)Insert vector element from another vector element

Description

Insert vector element from another vector element. This instruction copies the vector element of the source SIMD&FP register to the specified vector element of the destination SIMD&FP register.

A64 Instruction

INS Vd.H[lane],Vn.H[0]

Argument Preparation

a → VnH 

v → Vd.8H 

0 <= lane <= 7

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(idxdsize) operand = V[n];
bits(128) result;

result = V[d];
Elem[result, dst_index, esize] = Elem[operand, src_index, esize];
V[d] = result;

Supported architectures

v7/A32/A64

float32x2_t vset_lane_f32 (float32_t a, float32x2_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.S[lane],Rn

Argument Preparation

a → Rn 

v → Vd.2S 

0 <= lane <= 1

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

float64x1_t vset_lane_f64 (float64_t a, float64x1_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[lane],Rn

Argument Preparation

a → Rn 

v → Vd.1D 

0 <= lane <= 0

Results

Vd.1D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

uint8x16_t vsetq_lane_u8 (uint8_t a, uint8x16_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.B[lane],Rn

Argument Preparation

a → Rn 

v → Vd.16B 

0 <= lane <= 15

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64
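
The statement Elem[result, index, esize] = element in the Operation section writes only the low esize bits of the source register and leaves every other element unchanged. The sketch below illustrates this for any element size, assuming the 128-bit register is modelled as 16 bytes with element 0 in the lowest-numbered bytes. The type vreg128_model and the function elem_set_model are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit SIMD&FP register as 16 little-endian bytes. */
typedef struct { uint8_t bytes[16]; } vreg128_model;

/* Model of Elem[result, index, esize] = element: store the low esize bits of
   `element` at element position `index`; all other bytes are preserved. */
void elem_set_model(vreg128_model *reg, unsigned index, unsigned esize, uint64_t element)
{
    unsigned nbytes = esize / 8;
    assert(esize == 8 || esize == 16 || esize == 32 || esize == 64);
    assert(index < 16 / nbytes);      /* e.g. 0 <= lane <= 15 for bytes */
    for (unsigned i = 0; i < nbytes; i++) {
        reg->bytes[index * nbytes + i] = (uint8_t)(element >> (8 * i));
    }
}
```

Calling elem_set_model(&r, 15, 8, 0x1234AB) stores only 0xAB in the last byte, which corresponds to the truncation implied by bits(esize) element = X[n].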

uint16x8_t vsetq_lane_u16 (uint16_t a, uint16x8_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.H[lane],Rn

Argument Preparation

a → Rn 

v → Vd.8H 

0 <= lane <= 7

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint32x4_t vsetq_lane_u32 (uint32_t a, uint32x4_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.S[lane],Rn

Argument Preparation

a → Rn 

v → Vd.4S 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

uint64x2_t vsetq_lane_u64 (uint64_t a, uint64x2_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[lane],Rn

Argument Preparation

a → Rn 

v → Vd.2D 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly64x2_t vsetq_lane_p64 (poly64_t a, poly64x2_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[lane],Rn

Argument Preparation

a → Rn 

v → Vd.2D 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A32/A64

int8x16_t vsetq_lane_s8 (int8_t a, int8x16_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.B[lane],Rn

Argument Preparation

a → Rn 

v → Vd.16B 

0 <= lane <= 15

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int16x8_t vsetq_lane_s16 (int16_t a, int16x8_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.H[lane],Rn

Argument Preparation

a → Rn 

v → Vd.8H 

0 <= lane <= 7

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int32x4_t vsetq_lane_s32 (int32_t a, int32x4_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.S[lane],Rn

Argument Preparation

a → Rn 

v → Vd.4S 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

int64x2_t vsetq_lane_s64 (int64_t a, int64x2_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[lane],Rn

Argument Preparation

a → Rn 

v → Vd.2D 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly8x16_t vsetq_lane_p8 (poly8_t a, poly8x16_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.B[lane],Rn

Argument Preparation

a → Rn 

v → Vd.16B 

0 <= lane <= 15

Results

Vd.16B → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

poly16x8_t vsetq_lane_p16 (poly16_t a, poly16x8_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.H[lane],Rn

Argument Preparation

a → Rn 

v → Vd.8H 

0 <= lane <= 7

Results

Vd.8H → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

float32x4_t vsetq_lane_f32 (float32_t a, float32x4_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.S[lane],Rn

Argument Preparation

a → Rn 

v → Vd.4S 

0 <= lane <= 3

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

v7/A32/A64

float64x2_t vsetq_lane_f64 (float64_t a, float64x2_t v, const int lane)Insert vector element from general-purpose register

Description

Insert vector element from general-purpose register. This instruction copies the contents of the source general-purpose register to the specified vector element in the destination SIMD&FP register.

A64 Instruction

INS Vd.D[lane],Rn

Argument Preparation

a → Rn 

v → Vd.2D 

0 <= lane <= 1

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(esize) element = X[n];
bits(128) result;

result = V[d];
Elem[result, index, esize] = element;
V[d] = result;

Supported architectures

A64

float32_t vrecpxs_f32 (float32_t a)Floating-point reciprocal exponent

Description

Floating-point Reciprocal exponent (scalar). This instruction finds an approximate reciprocal exponent for each vector element in the source SIMD&FP register, places the result in a vector, and writes the vector to the destination SIMD&FP register.

A64 Instruction

FRECPX Sd,Sn

Argument Preparation

a → Sn

Results

Sd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRecpX(element, FPCR);

V[d] = result;

Supported architectures

A64
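
The Operation section delegates the arithmetic to FPRecpX, which works on the exponent field only: the sign is kept, the fraction is cleared, and the exponent bits are inverted. The sketch below models that for single precision under stated simplifications. NaN inputs are simply returned as quiet NaNs, and FPCR effects such as flush-to-zero are not modelled. The function name frecpx_f32_model is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical single-precision model of FRECPX (simplified NaN handling,
   no FPCR effects). */
float frecpx_f32_model(float a)
{
    uint32_t bits;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&bits, &a, sizeof bits);

    uint32_t sign = bits & 0x80000000u;
    uint32_t exp  = (bits >> 23) & 0xFFu;
    uint32_t frac = bits & 0x007FFFFFu;
    uint32_t out;

    if (exp == 0xFFu && frac != 0) {
        out = bits | 0x00400000u;             /* NaN: return a quiet NaN */
    } else if (exp == 0) {
        out = sign | (0xFEu << 23);           /* zero or denormal: largest finite exponent */
    } else {
        out = sign | ((~exp & 0xFFu) << 23);  /* invert exponent, clear fraction */
    }

    float result;
    memcpy(&result, &out, sizeof result);
    return result;
}
```

For example, the model maps 1.0 to 2.0, 4.0 to 0.5, and -8.0 to -0.25. The result is always a power of two, so multiplying by it rescales a value without introducing rounding error in the fraction.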

float64_t vrecpxd_f64 (float64_t a)Floating-point reciprocal exponent

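Description

Floating-point Reciprocal exponent (scalar). This instruction finds an approximate reciprocal exponent for each vector element in the source SIMD&FP register, places the result in a vector, and writes the vector to the destination SIMD&FP register.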

A64 Instruction

FRECPX Dd,Dn

Argument Preparation

a → Dn

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand = V[n];
bits(datasize) result;
bits(esize) element;

for e = 0 to elements-1
    element = Elem[operand, e, esize];
    Elem[result, e, esize] = FPRecpX(element, FPCR);

V[d] = result;

Supported architectures

A64

float32x2_t vfma_n_f32 (float32x2_t a, float32x2_t b, float32_t n)Floating-point fused multiply-add to accumulator

Description

A64 Instruction

FMLA Vd.2S,Vn.2S,Vm.S[0]

Argument Preparation

a → Vd.2S 

b → Vn.2S 

n → Vm.S[0]

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64

float32x4_t vfmaq_n_f32 (float32x4_t a, float32x4_t b, float32_t n)Floating-point fused multiply-add to accumulator

Description

A64 Instruction

FMLA Vd.4S,Vn.4S,Vm.S[0]

Argument Preparation

a → Vd.4S 

b → Vn.4S 

n → Vm.S[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

v7/A32/A64
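
In the Operation pseudocode above, FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR) adds the product of b and the broadcast scalar n to the accumulator a. The sketch below shows the operand roles with plain arrays. It is an approximation: the product is formed in double, where it is exact for float inputs, but the final conversion to float can differ from a true fused result in rare double-rounding cases. The names fmaq_n_f32_model and fmaq_n_f32_neon are hypothetical, and the guarded function assumes an AArch64 toolchain that provides arm_neon.h.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical model of vfmaq_n_f32: out[e] = a[e] + b[e] * n. */
void fmaq_n_f32_model(const float a[4], const float b[4], float n, float out[4])
{
    assert(a && b && out);
    for (size_t e = 0; e < 4; e++) {
        double product = (double)b[e] * (double)n;  /* exact for float inputs */
        out[e] = (float)((double)a[e] + product);   /* a is the accumulator */
    }
}

#if defined(__aarch64__)
/* The same computation using the intrinsic documented above. */
void fmaq_n_f32_neon(const float a[4], const float b[4], float n, float out[4])
{
    vst1q_f32(out, vfmaq_n_f32(vld1q_f32(a), vld1q_f32(b), n));
}
#endif
```

With a = {1, 2, 3, 4}, b = {10, 20, 30, 40} and n = 0.5, the model produces {6, 12, 18, 24}.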

float32x2_t vfms_n_f32 (float32x2_t a, float32x2_t b, float32_t n)Floating-point fused multiply-subtract from accumulator

Description

A64 Instruction

FMLS Vd.2S,Vn.2S,Vm.S[0]

Argument Preparation

a → Vd.2S 

b → Vn.2S 

n → Vm.S[0]

Results

Vd.2S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float32x4_t vfmsq_n_f32 (float32x4_t a, float32x4_t b, float32_t n)Floating-point fused multiply-subtract from accumulator

Description

A64 Instruction

FMLS Vd.4S,Vn.4S,Vm.S[0]

Argument Preparation

a → Vd.4S 

b → Vn.4S 

n → Vm.S[0]

Results

Vd.4S → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64
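
For the multiply-subtract form, the pseudocode above negates the element taken from b before calling FPMulAdd, so each result element is a[e] - b[e] * n. The sketch below models that and shows one possible use, an in-place update r[i] -= alpha * x[i] processed four elements at a time. Both function names are hypothetical, and the arithmetic uses a double intermediate as an approximation of the fused operation rather than a bit-exact reproduction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of vfmsq_n_f32: out[e] = a[e] - b[e] * n. */
void fmsq_n_f32_model(const float a[4], const float b[4], float n, float out[4])
{
    for (size_t e = 0; e < 4; e++) {
        double product = (double)b[e] * (double)n;  /* exact for float inputs */
        out[e] = (float)((double)a[e] - product);
    }
}

/* Hypothetical use: r[i] -= alpha * x[i], four elements per step.
   count is assumed to be a multiple of four. */
void residual_update_model(float *r, const float *x, float alpha, size_t count)
{
    assert(count % 4 == 0);
    for (size_t i = 0; i < count; i += 4) {
        fmsq_n_f32_model(r + i, x + i, alpha, r + i);
    }
}
```

A real implementation would normally handle a tail of fewer than four elements separately; that case is left out here to keep the sketch short.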

float64x1_t vfma_n_f64 (float64x1_t a, float64x1_t b, float64_t n)Floating-point fused multiply-add

Description

A64 Instruction

FMADD Dd,Dn,Dm,Da

Argument Preparation

a → Da 

b → Dn 

n → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) result;
bits(datasize) operanda = V[a];
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];

result = FPMulAdd(operanda, operand1, operand2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vfmaq_n_f64 (float64x2_t a, float64x2_t b, float64_t n)Floating-point fused multiply-add to accumulator

Description

A64 Instruction

FMLA Vd.2D,Vn.2D,Vm.D[0]

Argument Preparation

a → Vd.2D 

b → Vn.2D 

n → Vm.D[0]

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

float64x1_t vfms_n_f64 (float64x1_t a, float64x1_t b, float64_t n)Floating-point fused multiply-subtract

Description

A64 Instruction

FMSUB Dd,Dn,Dm,Da

Argument Preparation

a → Da 

b → Dn 

n → Dm

Results

Dd → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) result;
bits(datasize) operanda = V[a];
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];

operand1 = FPNeg(operand1);
result = FPMulAdd(operanda, operand1, operand2, FPCR);

V[d] = result;

Supported architectures

A64

float64x2_t vfmsq_n_f64 (float64x2_t a, float64x2_t b, float64_t n)Floating-point fused multiply-subtract from accumulator

Description

A64 Instruction

FMLS Vd.2D,Vn.2D,Vm.D[0]

Argument Preparation

a → Vd.2D 

b → Vn.2D 

n → Vm.D[0]

Results

Vd.2D → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) operand3 = V[d];
bits(datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    if sub_op then element1 = FPNeg(element1);
    Elem[result, e, esize] = FPMulAdd(Elem[operand3, e, esize], element1, element2, FPCR);

V[d] = result;

Supported architectures

A64

int8x8x2_t vtrn_s8 (int8x8_t a, int8x8_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd1.8B,Vn.8B,Vm.8B
TRN2 Vd2.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd1.8B → result.val[0]
Vd2.8B → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64
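
The loop in the Operation section runs once per instruction, with part = 0 for TRN1 and part = 1 for TRN2, and the intrinsic returns both results as a pair. The sketch below follows the pseudocode directly, assuming arrays of eight int8_t stand in for int8x8_t. The names int8x8x2_model, trn_s8_model, vtrn_s8_model and vtrn_s8_neon are hypothetical; the guarded function shows how the real intrinsic might be called on a NEON target.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical stand-in for int8x8x2_t. */
typedef struct { int8_t val[2][8]; } int8x8x2_model;

/* part = 0 models TRN1, part = 1 models TRN2. */
void trn_s8_model(const int8_t a[8], const int8_t b[8], int part, int8_t out[8])
{
    const int pairs = 8 / 2;
    assert(part == 0 || part == 1);
    for (int p = 0; p < pairs; p++) {
        out[2 * p + 0] = a[2 * p + part];
        out[2 * p + 1] = b[2 * p + part];
    }
}

/* Model of vtrn_s8: val[0] is the TRN1 result, val[1] the TRN2 result. */
int8x8x2_model vtrn_s8_model(const int8_t a[8], const int8_t b[8])
{
    int8x8x2_model r;
    trn_s8_model(a, b, 0, r.val[0]);
    trn_s8_model(a, b, 1, r.val[1]);
    return r;
}

#if defined(__ARM_NEON)
/* The same operation using the intrinsic documented above. */
void vtrn_s8_neon(const int8_t a[8], const int8_t b[8], int8_t out0[8], int8_t out1[8])
{
    int8x8x2_t r = vtrn_s8(vld1_s8(a), vld1_s8(b));
    vst1_s8(out0, r.val[0]);
    vst1_s8(out1, r.val[1]);
}
#endif
```

With a = {0, 1, ..., 7} and b = {10, 11, ..., 17}, the model gives val[0] = {0, 10, 2, 12, 4, 14, 6, 16} and val[1] = {1, 11, 3, 13, 5, 15, 7, 17}.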

int16x4x2_t vtrn_s16 (int16x4_t a, int16x4_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd1.4H,Vn.4H,Vm.4H
TRN2 Vd2.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd1.4H → result.val[0]
Vd2.4H → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8x2_t vtrn_u8 (uint8x8_t a, uint8x8_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd1.8B,Vn.8B,Vm.8B
TRN2 Vd2.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd1.8B → result.val[0]
Vd2.8B → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4x2_t vtrn_u16 (uint16x4_t a, uint16x4_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd1.4H,Vn.4H,Vm.4H
TRN2 Vd2.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd1.4H → result.val[0]
Vd2.4H → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

poly8x8x2_t vtrn_p8 (poly8x8_t a, poly8x8_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd1.8B,Vn.8B,Vm.8B
TRN2 Vd2.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd1.8B → result.val[0]
Vd2.8B → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

poly16x4x2_t vtrn_p16 (poly16x4_t a, poly16x4_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd1.4H,Vn.4H,Vm.4H
TRN2 Vd2.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd1.4H → result.val[0]
Vd2.4H → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

int32x2x2_t vtrn_s32 (int32x2_t a, int32x2_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd1.2S,Vn.2S,Vm.2S
TRN2 Vd2.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd1.2S → result.val[0]
Vd2.2S → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

float32x2x2_t vtrn_f32 (float32x2_t a, float32x2_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd1.2S,Vn.2S,Vm.2S
TRN2 Vd2.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd1.2S → result.val[0]
Vd2.2S → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64
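
For two-element vectors the loop above runs for a single pair, so the two results are exactly the columns of the 2x2 matrix whose rows are a and b. A minimal sketch of that special case follows, with plain arrays in place of float32x2_t and a hypothetical function name.

```c
#include <assert.h>

/* Hypothetical model: row0 and row1 are the rows of a 2x2 matrix,
   col0 and col1 receive result.val[0] and result.val[1]. */
void transpose2x2_f32_model(const float row0[2], const float row1[2],
                            float col0[2], float col1[2])
{
    assert(row0 && row1 && col0 && col1);
    /* TRN1 (part = 0): { a[0], b[0] } */
    col0[0] = row0[0];
    col0[1] = row1[0];
    /* TRN2 (part = 1): { a[1], b[1] } */
    col1[0] = row0[1];
    col1[1] = row1[1];
}
```

Rows {1, 2} and {3, 4} become columns {1, 3} and {2, 4}.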

uint32x2x2_t vtrn_u32 (uint32x2_t a, uint32x2_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd1.2S,Vn.2S,Vm.2S
TRN2 Vd2.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd1.2S → result.val[0]
Vd2.2S → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

int8x16x2_t vtrnq_s8 (int8x16_t a, int8x16_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd1.16B,Vn.16B,Vm.16B
TRN2 Vd2.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd1.16B → result.val[0]
Vd2.16B → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

int16x8x2_t vtrnq_s16 (int16x8_t a, int16x8_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd1.8H,Vn.8H,Vm.8H
TRN2 Vd2.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd1.8H → result.val[0]
Vd2.8H → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

int32x4x2_t vtrnq_s32 (int32x4_t a, int32x4_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd1.4S,Vn.4S,Vm.4S
TRN2 Vd2.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd1.4S → result.val[0]
Vd2.4S → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64
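
One common use of the four-lane form is as the first stage of a 4x4 transpose: TRN1 and TRN2 are applied to row pairs (0, 1) and (2, 3), and the 64-bit halves of the results are then regrouped. The sketch below models that sequence in portable C under the assumption that the regrouping step is done with plain copies; on a NEON target it would typically use separate half-vector operations that are not covered in this entry. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of TRN1 (part = 0) and TRN2 (part = 1) on four 32-bit lanes. */
void trnq_s32_model(const int32_t a[4], const int32_t b[4], int part, int32_t out[4])
{
    assert(part == 0 || part == 1);
    for (int p = 0; p < 2; p++) {
        out[2 * p + 0] = a[2 * p + part];
        out[2 * p + 1] = b[2 * p + part];
    }
}

/* Hypothetical 4x4 transpose: two TRN stages on row pairs, then a
   regrouping of the low and high halves. */
void transpose4x4_s32_model(const int32_t m[4][4], int32_t t[4][4])
{
    int32_t t01[2][4];
    int32_t t23[2][4];
    for (int part = 0; part < 2; part++) {
        trnq_s32_model(m[0], m[1], part, t01[part]);
        trnq_s32_model(m[2], m[3], part, t23[part]);
    }
    for (int part = 0; part < 2; part++) {
        for (int i = 0; i < 2; i++) {
            t[part][i]         = t01[part][i];      /* low halves  */
            t[part][2 + i]     = t23[part][i];
            t[2 + part][i]     = t01[part][2 + i];  /* high halves */
            t[2 + part][2 + i] = t23[part][2 + i];
        }
    }
}
```

After the call, t[r][c] equals m[c][r] for every row r and column c.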

float32x4x2_t vtrnq_f32 (float32x4_t a, float32x4_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd1.4S,Vn.4S,Vm.4S
TRN2 Vd2.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd1.4S → result.val[0]
Vd2.4S → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16x2_t vtrnq_u8 (uint8x16_t a, uint8x16_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd1.16B,Vn.16B,Vm.16B
TRN2 Vd2.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd1.16B → result.val[0]
Vd2.16B → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8x2_t vtrnq_u16 (uint16x8_t a, uint16x8_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd1.8H,Vn.8H,Vm.8H
TRN2 Vd2.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd1.8H → result.val[0]
Vd2.8H → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4x2_t vtrnq_u32 (uint32x4_t a, uint32x4_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd1.4S,Vn.4S,Vm.4S
TRN2 Vd2.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd1.4S → result.val[0]
Vd2.4S → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

poly8x16x2_t vtrnq_p8 (poly8x16_t a, poly8x16_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd1.16B,Vn.16B,Vm.16B
TRN2 Vd2.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd1.16B → result.val[0]
Vd2.16B → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

poly16x8x2_t vtrnq_p16 (poly16x8_t a, poly16x8_t b)Transpose vectors

Description

A64 Instruction

TRN1 Vd1.8H,Vn.8H,Vm.8H
TRN2 Vd2.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd1.8H → result.val[0]
Vd2.8H → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, 2*p+part, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, 2*p+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

int8x8x2_t vzip_s8 (int8x8_t a, int8x8_t b)Zip vectors

Description

A64 Instruction

ZIP1 Vd1.8B,Vn.8B,Vm.8B
ZIP2 Vd2.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd1.8B → result.val[0]
Vd2.8B → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

v7/A32/A64
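
In the Operation pseudocode above, base = part * pairs selects which half of each source vector is interleaved: part = 0 (ZIP1) reads the lower halves and part = 1 (ZIP2) reads the upper halves. The sketch below follows the pseudocode with arrays of eight int8_t in place of int8x8_t. The function name zip_s8_model is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* part = 0 models ZIP1 (lower halves), part = 1 models ZIP2 (upper halves). */
void zip_s8_model(const int8_t a[8], const int8_t b[8], int part, int8_t out[8])
{
    const int pairs = 8 / 2;
    const int base = part * pairs;
    assert(part == 0 || part == 1);
    for (int p = 0; p < pairs; p++) {
        out[2 * p + 0] = a[base + p];
        out[2 * p + 1] = b[base + p];
    }
}
```

With a = {0, 1, ..., 7} and b = {10, 11, ..., 17}, part = 0 gives {0, 10, 1, 11, 2, 12, 3, 13} (result.val[0]) and part = 1 gives {4, 14, 5, 15, 6, 16, 7, 17} (result.val[1]).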

int16x4x2_t vzip_s16 (int16x4_t a, int16x4_t b)Zip vectors

Description

A64 Instruction

ZIP1 Vd1.4H,Vn.4H,Vm.4H
ZIP2 Vd2.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd1.4H → result.val[0]
Vd2.4H → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8x2_t vzip_u8 (uint8x8_t a, uint8x8_t b)Zip vectors

Description

A64 Instruction

ZIP1 Vd1.8B,Vn.8B,Vm.8B
ZIP2 Vd2.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd1.8B → result.val[0]
Vd2.8B → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4x2_t vzip_u16 (uint16x4_t a, uint16x4_t b)Zip vectors

Description

A64 Instruction

ZIP1 Vd1.4H,Vn.4H,Vm.4H
ZIP2 Vd2.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd1.4H → result.val[0]
Vd2.4H → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

v7/A32/A64

poly8x8x2_t vzip_p8 (poly8x8_t a, poly8x8_t b)Zip vectors

Description

A64 Instruction

ZIP1 Vd1.8B,Vn.8B,Vm.8B
ZIP2 Vd2.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd1.8B → result.val[0]
Vd2.8B → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

v7/A32/A64

poly16x4x2_t vzip_p16 (poly16x4_t a, poly16x4_t b)Zip vectors

Description

A64 Instruction

ZIP1 Vd1.4H,Vn.4H,Vm.4H
ZIP2 Vd2.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd1.4H → result.val[0]
Vd2.4H → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

v7/A32/A64

int32x2x2_t vzip_s32 (int32x2_t a, int32x2_t b)Zip vectors

Description

A64 Instruction

ZIP1 Vd1.2S,Vn.2S,Vm.2S
ZIP2 Vd2.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd1.2S → result.val[0]
Vd2.2S → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

v7/A32/A64
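
When a vector holds only two elements, pairs is 1, and the zip pseudocode above selects the same source elements as the transpose pseudocode: element part of a followed by element part of b. The sketch below places the two loops side by side so the equivalence can be checked. Both function names are hypothetical, and arrays of two int32_t stand in for int32x2_t.

```c
#include <assert.h>
#include <stdint.h>

/* Transpose loop for two 32-bit lanes (pairs = 1). */
void trn_s32x2_model(const int32_t a[2], const int32_t b[2], int part, int32_t out[2])
{
    const int pairs = 1;
    assert(part == 0 || part == 1);
    for (int p = 0; p < pairs; p++) {
        out[2 * p + 0] = a[2 * p + part];
        out[2 * p + 1] = b[2 * p + part];
    }
}

/* Zip loop for two 32-bit lanes (pairs = 1, so base = part). */
void zip_s32x2_model(const int32_t a[2], const int32_t b[2], int part, int32_t out[2])
{
    const int pairs = 1;
    const int base = part * pairs;
    assert(part == 0 || part == 1);
    for (int p = 0; p < pairs; p++) {
        out[2 * p + 0] = a[base + p];
        out[2 * p + 1] = b[base + p];
    }
}
```

For a = {1, 2} and b = {3, 4}, both models return {1, 3} for part = 0 and {2, 4} for part = 1.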

float32x2x2_t vzip_f32 (float32x2_t a, float32x2_t b)Zip vectors

Description

A64 Instruction

ZIP1 Vd1.2S,Vn.2S,Vm.2S
ZIP2 Vd2.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd1.2S → result.val[0]
Vd2.2S → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2x2_t vzip_u32 (uint32x2_t a, uint32x2_t b)Zip vectors

Description

A64 Instruction

ZIP1 Vd1.2S,Vn.2S,Vm.2S
ZIP2 Vd2.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd1.2S → result.val[0]
Vd2.2S → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

v7/A32/A64

int8x16x2_t vzipq_s8 (int8x16_t a, int8x16_t b)Zip vectors

Description

A64 Instruction

ZIP1 Vd1.16B,Vn.16B,Vm.16B
ZIP2 Vd2.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd1.16B → result.val[0]
Vd2.16B → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

v7/A32/A64

int16x8x2_t vzipq_s16 (int16x8_t a, int16x8_t b)Zip vectors

Description

A64 Instruction

ZIP1 Vd1.8H,Vn.8H,Vm.8H
ZIP2 Vd2.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd1.8H → result.val[0]
Vd2.8H → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

v7/A32/A64

int32x4x2_t vzipq_s32 (int32x4_t a, int32x4_t b)Zip vectors

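Description

Zip vectors. This intrinsic interleaves the elements of the two source vectors. ZIP1 interleaves the lower halves of a and b into result.val[0], and ZIP2 interleaves the upper halves into result.val[1], so that val[0] followed by val[1] holds a[0], b[0], a[1], b[1], and so on.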

A64 Instruction

ZIP1 Vd1.4S,Vn.4S,Vm.4S
ZIP2 Vd2.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd1.4S → result.val[0]
Vd2.4S → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

v7/A32/A64

float32x4x2_t vzipq_f32 (float32x4_t a, float32x4_t b)Zip vectors

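Description

Zip vectors. This intrinsic interleaves the elements of the two source vectors. ZIP1 interleaves the lower halves of a and b into result.val[0], and ZIP2 interleaves the upper halves into result.val[1], so that val[0] followed by val[1] holds a[0], b[0], a[1], b[1], and so on.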

A64 Instruction

ZIP1 Vd1.4S,Vn.4S,Vm.4S
ZIP2 Vd2.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd1.4S → result.val[0]
Vd2.4S → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

v7/A32/A64
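A typical use of this intrinsic is interleaving two planar arrays into one array. The sketch below is one possible way to do that, with the assumptions stated here: interleave4_f32 is a hypothetical helper name, the NEON path is compiled only when the compiler defines __ARM_NEON, and a scalar fallback with the same result is used otherwise so the example builds on any target.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Interleave four floats from x with four floats from y into out[0..7].
 * interleave4_f32 is a hypothetical helper name used for illustration. */
void interleave4_f32(const float *x, const float *y, float *out)
{
    assert(x && y && out);
#if defined(__ARM_NEON)
    float32x4x2_t r = vzipq_f32(vld1q_f32(x), vld1q_f32(y));
    vst1q_f32(out, r.val[0]);       /* x0 y0 x1 y1 */
    vst1q_f32(out + 4, r.val[1]);   /* x2 y2 x3 y3 */
#else
    /* Portable fallback for builds without NEON; same output order. */
    for (size_t i = 0; i < 4; i++) {
        out[2 * i] = x[i];
        out[2 * i + 1] = y[i];
    }
#endif
}
```

Storing val[0] and then val[1] back to back gives the fully interleaved sequence, which is why the two stores target out and out + 4.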

uint8x16x2_t vzipq_u8 (uint8x16_t a, uint8x16_t b)Zip vectors

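Description

Zip vectors. This intrinsic interleaves the elements of the two source vectors. ZIP1 interleaves the lower halves of a and b into result.val[0], and ZIP2 interleaves the upper halves into result.val[1], so that val[0] followed by val[1] holds a[0], b[0], a[1], b[1], and so on.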

A64 Instruction

ZIP1 Vd1.16B,Vn.16B,Vm.16B
ZIP2 Vd2.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd1.16B → result.val[0]
Vd2.16B → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8x2_t vzipq_u16 (uint16x8_t a, uint16x8_t b)Zip vectors

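Description

Zip vectors. This intrinsic interleaves the elements of the two source vectors. ZIP1 interleaves the lower halves of a and b into result.val[0], and ZIP2 interleaves the upper halves into result.val[1], so that val[0] followed by val[1] holds a[0], b[0], a[1], b[1], and so on.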

A64 Instruction

ZIP1 Vd1.8H,Vn.8H,Vm.8H
ZIP2 Vd2.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd1.8H → result.val[0]
Vd2.8H → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4x2_t vzipq_u32 (uint32x4_t a, uint32x4_t b)Zip vectors

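Description

Zip vectors. This intrinsic interleaves the elements of the two source vectors. ZIP1 interleaves the lower halves of a and b into result.val[0], and ZIP2 interleaves the upper halves into result.val[1], so that val[0] followed by val[1] holds a[0], b[0], a[1], b[1], and so on.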

A64 Instruction

ZIP1 Vd1.4S,Vn.4S,Vm.4S
ZIP2 Vd2.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd1.4S → result.val[0]
Vd2.4S → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

v7/A32/A64

poly8x16x2_t vzipq_p8 (poly8x16_t a, poly8x16_t b)Zip vectors

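Description

Zip vectors. This intrinsic interleaves the elements of the two source vectors. ZIP1 interleaves the lower halves of a and b into result.val[0], and ZIP2 interleaves the upper halves into result.val[1], so that val[0] followed by val[1] holds a[0], b[0], a[1], b[1], and so on.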

A64 Instruction

ZIP1 Vd1.16B,Vn.16B,Vm.16B
ZIP2 Vd2.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd1.16B → result.val[0]
Vd2.16B → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

v7/A32/A64

poly16x8x2_t vzipq_p16 (poly16x8_t a, poly16x8_t b)Zip vectors

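Description

Zip vectors. This intrinsic interleaves the elements of the two source vectors. ZIP1 interleaves the lower halves of a and b into result.val[0], and ZIP2 interleaves the upper halves into result.val[1], so that val[0] followed by val[1] holds a[0], b[0], a[1], b[1], and so on.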

A64 Instruction

ZIP1 Vd1.8H,Vn.8H,Vm.8H
ZIP2 Vd2.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd1.8H → result.val[0]
Vd2.8H → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n];
bits(datasize) operand2 = V[m];
bits(datasize) result;

integer base = part * pairs;

for p = 0 to pairs-1
    Elem[result, 2*p+0, esize] = Elem[operand1, base+p, esize];
    Elem[result, 2*p+1, esize] = Elem[operand2, base+p, esize];

V[d] = result;

Supported architectures

v7/A32/A64

int8x8x2_t vuzp_s8 (int8x8_t a, int8x8_t b)Unzip vectors

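Description

Unzip vectors. This intrinsic de-interleaves the two source vectors, treated as one concatenated vector with a in the lower half and b in the upper half. UZP1 places the even-numbered elements into result.val[0], and UZP2 places the odd-numbered elements into result.val[1].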

A64 Instruction

UZP1 Vd1.8B,Vn.8B,Vm.8B
UZP2 Vd2.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd1.8B → result.val[0]
Vd2.8B → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64
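The Operation pseudocode above can be read as the following scalar sketch in C. It is an illustration, not the library implementation: uzp_model_s8 is a hypothetical helper name, part is assumed to be 0 for UZP1 and 1 for UZP2 (the pseudocode does not state this), and the concatenation operandh:operandl is modelled by index arithmetic rather than a wider register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of the UZP1/UZP2 pseudocode for int8 lanes.
 * Assumption: part = 0 models UZP1 (even-numbered elements),
 * part = 1 models UZP2 (odd-numbered elements).
 * uzp_model_s8 is a hypothetical helper, not an arm_neon.h function.
 * result must not overlap operandl or operandh. */
void uzp_model_s8(const int8_t *operandl, const int8_t *operandh,
                  int8_t *result, size_t elements, int part)
{
    assert(part == 0 || part == 1);

    for (size_t e = 0; e < elements; e++) {
        /* Index into the concatenation operandh:operandl, low half first. */
        size_t idx = 2 * e + (size_t)part;
        result[e] = (idx < elements) ? operandl[idx]
                                     : operandh[idx - elements];
    }
}
```

With a = {0, 1, ..., 7} and b = {8, 9, ..., 15}, the part = 0 call yields {0, 2, 4, 6, 8, 10, 12, 14} and the part = 1 call yields {1, 3, 5, 7, 9, 11, 13, 15}.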

int16x4x2_t vuzp_s16 (int16x4_t a, int16x4_t b)Unzip vectors

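Description

Unzip vectors. This intrinsic de-interleaves the two source vectors, treated as one concatenated vector with a in the lower half and b in the upper half. UZP1 places the even-numbered elements into result.val[0], and UZP2 places the odd-numbered elements into result.val[1].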

A64 Instruction

UZP1 Vd1.4H,Vn.4H,Vm.4H
UZP2 Vd2.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd1.4H → result.val[0]
Vd2.4H → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

int32x2x2_t vuzp_s32 (int32x2_t a, int32x2_t b)Unzip vectors

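Description

Unzip vectors. This intrinsic de-interleaves the two source vectors, treated as one concatenated vector with a in the lower half and b in the upper half. UZP1 places the even-numbered elements into result.val[0], and UZP2 places the odd-numbered elements into result.val[1].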

A64 Instruction

UZP1 Vd1.2S,Vn.2S,Vm.2S
UZP2 Vd2.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd1.2S → result.val[0]
Vd2.2S → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

float32x2x2_t vuzp_f32 (float32x2_t a, float32x2_t b)Unzip vectors

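Description

Unzip vectors. This intrinsic de-interleaves the two source vectors, treated as one concatenated vector with a in the lower half and b in the upper half. UZP1 places the even-numbered elements into result.val[0], and UZP2 places the odd-numbered elements into result.val[1].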

A64 Instruction

UZP1 Vd1.2S,Vn.2S,Vm.2S
UZP2 Vd2.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd1.2S → result.val[0]
Vd2.2S → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

uint8x8x2_t vuzp_u8 (uint8x8_t a, uint8x8_t b)Unzip vectors

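Description

Unzip vectors. This intrinsic de-interleaves the two source vectors, treated as one concatenated vector with a in the lower half and b in the upper half. UZP1 places the even-numbered elements into result.val[0], and UZP2 places the odd-numbered elements into result.val[1].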

A64 Instruction

UZP1 Vd1.8B,Vn.8B,Vm.8B
UZP2 Vd2.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd1.8B → result.val[0]
Vd2.8B → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

uint16x4x2_t vuzp_u16 (uint16x4_t a, uint16x4_t b)Unzip vectors

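Description

Unzip vectors. This intrinsic de-interleaves the two source vectors, treated as one concatenated vector with a in the lower half and b in the upper half. UZP1 places the even-numbered elements into result.val[0], and UZP2 places the odd-numbered elements into result.val[1].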

A64 Instruction

UZP1 Vd1.4H,Vn.4H,Vm.4H
UZP2 Vd2.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd1.4H → result.val[0]
Vd2.4H → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

uint32x2x2_t vuzp_u32 (uint32x2_t a, uint32x2_t b)Unzip vectors

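Description

Unzip vectors. This intrinsic de-interleaves the two source vectors, treated as one concatenated vector with a in the lower half and b in the upper half. UZP1 places the even-numbered elements into result.val[0], and UZP2 places the odd-numbered elements into result.val[1].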

A64 Instruction

UZP1 Vd1.2S,Vn.2S,Vm.2S
UZP2 Vd2.2S,Vn.2S,Vm.2S

Argument Preparation

a → Vn.2S 

b → Vm.2S

Results

Vd1.2S → result.val[0]
Vd2.2S → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

poly8x8x2_t vuzp_p8 (poly8x8_t a, poly8x8_t b)Unzip vectors

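Description

Unzip vectors. This intrinsic de-interleaves the two source vectors, treated as one concatenated vector with a in the lower half and b in the upper half. UZP1 places the even-numbered elements into result.val[0], and UZP2 places the odd-numbered elements into result.val[1].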

A64 Instruction

UZP1 Vd1.8B,Vn.8B,Vm.8B
UZP2 Vd2.8B,Vn.8B,Vm.8B

Argument Preparation

a → Vn.8B 

b → Vm.8B

Results

Vd1.8B → result.val[0]
Vd2.8B → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

poly16x4x2_t vuzp_p16 (poly16x4_t a, poly16x4_t b)Unzip vectors

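Description

Unzip vectors. This intrinsic de-interleaves the two source vectors, treated as one concatenated vector with a in the lower half and b in the upper half. UZP1 places the even-numbered elements into result.val[0], and UZP2 places the odd-numbered elements into result.val[1].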

A64 Instruction

UZP1 Vd1.4H,Vn.4H,Vm.4H
UZP2 Vd2.4H,Vn.4H,Vm.4H

Argument Preparation

a → Vn.4H 

b → Vm.4H

Results

Vd1.4H → result.val[0]
Vd2.4H → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

int8x16x2_t vuzpq_s8 (int8x16_t a, int8x16_t b)Unzip vectors

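Description

Unzip vectors. This intrinsic de-interleaves the two source vectors, treated as one concatenated vector with a in the lower half and b in the upper half. UZP1 places the even-numbered elements into result.val[0], and UZP2 places the odd-numbered elements into result.val[1].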

A64 Instruction

UZP1 Vd1.16B,Vn.16B,Vm.16B
UZP2 Vd2.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd1.16B → result.val[0]
Vd2.16B → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

int16x8x2_t vuzpq_s16 (int16x8_t a, int16x8_t b)Unzip vectors

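Description

Unzip vectors. This intrinsic de-interleaves the two source vectors, treated as one concatenated vector with a in the lower half and b in the upper half. UZP1 places the even-numbered elements into result.val[0], and UZP2 places the odd-numbered elements into result.val[1].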

A64 Instruction

UZP1 Vd1.8H,Vn.8H,Vm.8H
UZP2 Vd2.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd1.8H → result.val[0]
Vd2.8H → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

int32x4x2_t vuzpq_s32 (int32x4_t a, int32x4_t b)Unzip vectors

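Description

Unzip vectors. This intrinsic de-interleaves the two source vectors, treated as one concatenated vector with a in the lower half and b in the upper half. UZP1 places the even-numbered elements into result.val[0], and UZP2 places the odd-numbered elements into result.val[1].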

A64 Instruction

UZP1 Vd1.4S,Vn.4S,Vm.4S
UZP2 Vd2.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd1.4S → result.val[0]
Vd2.4S → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

float32x4x2_t vuzpq_f32 (float32x4_t a, float32x4_t b)Unzip vectors

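Description

Unzip vectors. This intrinsic de-interleaves the two source vectors, treated as one concatenated vector with a in the lower half and b in the upper half. UZP1 places the even-numbered elements into result.val[0], and UZP2 places the odd-numbered elements into result.val[1].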

A64 Instruction

UZP1 Vd1.4S,Vn.4S,Vm.4S
UZP2 Vd2.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd1.4S → result.val[0]
Vd2.4S → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

uint8x16x2_t vuzpq_u8 (uint8x16_t a, uint8x16_t b)Unzip vectors

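Description

Unzip vectors. This intrinsic de-interleaves the two source vectors, treated as one concatenated vector with a in the lower half and b in the upper half. UZP1 places the even-numbered elements into result.val[0], and UZP2 places the odd-numbered elements into result.val[1].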

A64 Instruction

UZP1 Vd1.16B,Vn.16B,Vm.16B
UZP2 Vd2.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd1.16B → result.val[0]
Vd2.16B → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

uint16x8x2_t vuzpq_u16 (uint16x8_t a, uint16x8_t b)Unzip vectors

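Description

Unzip vectors. This intrinsic de-interleaves the two source vectors, treated as one concatenated vector with a in the lower half and b in the upper half. UZP1 places the even-numbered elements into result.val[0], and UZP2 places the odd-numbered elements into result.val[1].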

A64 Instruction

UZP1 Vd1.8H,Vn.8H,Vm.8H
UZP2 Vd2.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd1.8H → result.val[0]
Vd2.8H → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

uint32x4x2_t vuzpq_u32 (uint32x4_t a, uint32x4_t b)Unzip vectors

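Description

Unzip vectors. This intrinsic de-interleaves the two source vectors, treated as one concatenated vector with a in the lower half and b in the upper half. UZP1 places the even-numbered elements into result.val[0], and UZP2 places the odd-numbered elements into result.val[1].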

A64 Instruction

UZP1 Vd1.4S,Vn.4S,Vm.4S
UZP2 Vd2.4S,Vn.4S,Vm.4S

Argument Preparation

a → Vn.4S 

b → Vm.4S

Results

Vd1.4S → result.val[0]
Vd2.4S → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

poly8x16x2_t vuzpq_p8 (poly8x16_t a, poly8x16_t b)Unzip vectors

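Description

Unzip vectors. This intrinsic de-interleaves the two source vectors, treated as one concatenated vector with a in the lower half and b in the upper half. UZP1 places the even-numbered elements into result.val[0], and UZP2 places the odd-numbered elements into result.val[1].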

A64 Instruction

UZP1 Vd1.16B,Vn.16B,Vm.16B
UZP2 Vd2.16B,Vn.16B,Vm.16B

Argument Preparation

a → Vn.16B 

b → Vm.16B

Results

Vd1.16B → result.val[0]
Vd2.16B → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64

poly16x8x2_t vuzpq_p16 (poly16x8_t a, poly16x8_t b)Unzip vectors

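Description

Unzip vectors. This intrinsic de-interleaves the two source vectors, treated as one concatenated vector with a in the lower half and b in the upper half. UZP1 places the even-numbered elements into result.val[0], and UZP2 places the odd-numbered elements into result.val[1].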

A64 Instruction

UZP1 Vd1.8H,Vn.8H,Vm.8H
UZP2 Vd2.8H,Vn.8H,Vm.8H

Argument Preparation

a → Vn.8H 

b → Vm.8H

Results

Vd1.8H → result.val[0]
Vd2.8H → result.val[1]

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operandl = V[n];
bits(datasize) operandh = V[m];
bits(datasize) result;

bits(datasize*2) zipped = operandh:operandl;
for e = 0 to elements-1
    Elem[result, e, esize] = Elem[zipped, 2*e+part, esize];

V[d] = result;

Supported architectures

v7/A32/A64
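The zip and unzip pseudocode listings above describe inverse permutations: unzipping the two halves produced by a zip gives back the original operands. The sketch below checks this on a scalar model of the 8H arrangement. All function names in it are hypothetical, and the arrays stand in for registers, so it shows the lane ordering only and says nothing about performance.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { LANES = 8 };   /* models the 8H arrangement: eight 16-bit lanes */

/* Model of a 16-bit zip: out[0..7] plays val[0], out[8..15] plays val[1]. */
void zip_pair_s16(const int16_t *a, const int16_t *b, int16_t *out)
{
    for (int i = 0; i < LANES; i++) {
        out[2 * i] = a[i];
        out[2 * i + 1] = b[i];
    }
}

/* Model of a 16-bit unzip of (lo, hi): even[] plays val[0], odd[] val[1]. */
void uzp_pair_s16(const int16_t *lo, const int16_t *hi,
                  int16_t *even, int16_t *odd)
{
    int16_t cat[2 * LANES];   /* the concatenation hi:lo, low half first */
    memcpy(cat, lo, sizeof(int16_t) * LANES);
    memcpy(cat + LANES, hi, sizeof(int16_t) * LANES);
    for (int e = 0; e < LANES; e++) {
        even[e] = cat[2 * e];
        odd[e] = cat[2 * e + 1];
    }
}

/* Returns 1 if unzipping the zipped pair reproduces a and b, else 0. */
int zip_uzp_round_trip(const int16_t *a, const int16_t *b)
{
    int16_t z[2 * LANES], a2[LANES], b2[LANES];
    assert(a && b);
    zip_pair_s16(a, b, z);
    uzp_pair_s16(z, z + LANES, a2, b2);
    return memcmp(a, a2, sizeof a2) == 0 && memcmp(b, b2, sizeof b2) == 0;
}
```

This pairing is why zip is commonly used to interleave planar data (for example separate left and right sample arrays) and unzip to split it apart again.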

int16x4_t vreinterpret_s16_s8 (int8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.4H → result

Supported architectures

v7/A32/A64
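Because the listed instruction is NOP, a reinterpret cast changes only the type through which the register bits are viewed. The following scalar sketch models that with memcpy over an 8-byte buffer. The type vec64_model and the helper names are hypothetical. How bytes group into wider lanes depends on byte order, so the example uses an input whose result is the same on either byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 64-bit vector register as 8 raw bytes. */
typedef struct { unsigned char bytes[8]; } vec64_model;

/* Fill the model register from eight int8 lanes. */
vec64_model load_s8x8(const int8_t lanes[8])
{
    vec64_model v;
    assert(lanes != NULL);
    memcpy(v.bytes, lanes, sizeof v.bytes);
    return v;
}

/* View the same 64 bits as four int16 lanes; no value conversion happens. */
void view_as_s16x4(vec64_model v, int16_t lanes[4])
{
    memcpy(lanes, v.bytes, sizeof v.bytes);
}

/* View the same 64 bits as eight uint8 lanes. */
void view_as_u8x8(vec64_model v, uint8_t lanes[8])
{
    memcpy(lanes, v.bytes, sizeof v.bytes);
}
```

Eight int8 lanes holding -1 (all bits set) read back as four int16 lanes of -1 and as eight uint8 lanes of 255: the bits are identical and only their interpretation differs.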

int32x2_t vreinterpret_s32_s8 (int8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.2S → result

Supported architectures

v7/A32/A64

float32x2_t vreinterpret_f32_s8 (int8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.2S → result

Supported architectures

v7/A32/A64

uint8x8_t vreinterpret_u8_s8 (int8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.8B → result

Supported architectures

v7/A32/A64

uint16x4_t vreinterpret_u16_s8 (int8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.4H → result

Supported architectures

v7/A32/A64

uint32x2_t vreinterpret_u32_s8 (int8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.2S → result

Supported architectures

v7/A32/A64

poly8x8_t vreinterpret_p8_s8 (int8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.8B → result

Supported architectures

v7/A32/A64

poly16x4_t vreinterpret_p16_s8 (int8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.4H → result

Supported architectures

v7/A32/A64

uint64x1_t vreinterpret_u64_s8 (int8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.1D → result

Supported architectures

v7/A32/A64

int64x1_t vreinterpret_s64_s8 (int8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.1D → result

Supported architectures

v7/A32/A64

float64x1_t vreinterpret_f64_s8 (int8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.1D → result

Supported architectures

A64

poly64x1_t vreinterpret_p64_s8 (int8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.1D → result

Supported architectures

A32/A64

float16x4_t vreinterpret_f16_s8 (int8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.4H → result

Supported architectures

v7/A32/A64

int8x8_t vreinterpret_s8_s16 (int16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.8B → result

Supported architectures

v7/A32/A64

int32x2_t vreinterpret_s32_s16 (int16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.2S → result

Supported architectures

v7/A32/A64

float32x2_t vreinterpret_f32_s16 (int16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.2S → result

Supported architectures

v7/A32/A64

uint8x8_t vreinterpret_u8_s16 (int16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.8B → result

Supported architectures

v7/A32/A64

uint16x4_t vreinterpret_u16_s16 (int16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.4H → result

Supported architectures

v7/A32/A64

uint32x2_t vreinterpret_u32_s16 (int16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.2S → result

Supported architectures

v7/A32/A64

poly8x8_t vreinterpret_p8_s16 (int16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.8B → result

Supported architectures

v7/A32/A64

poly16x4_t vreinterpret_p16_s16 (int16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.4H → result

Supported architectures

v7/A32/A64

uint64x1_t vreinterpret_u64_s16 (int16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.1D → result

Supported architectures

v7/A32/A64

int64x1_t vreinterpret_s64_s16 (int16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.1D → result

Supported architectures

v7/A32/A64

float64x1_t vreinterpret_f64_s16 (int16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.1D → result

Supported architectures

A64

poly64x1_t vreinterpret_p64_s16 (int16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.1D → result

Supported architectures

A32/A64

float16x4_t vreinterpret_f16_s16 (int16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.4H → result

Supported architectures

v7/A32/A64

int8x8_t vreinterpret_s8_s32 (int32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.8B → result

Supported architectures

v7/A32/A64

int16x4_t vreinterpret_s16_s32 (int32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.4H → result

Supported architectures

v7/A32/A64

float32x2_t vreinterpret_f32_s32 (int32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.2S → result

Supported architectures

v7/A32/A64
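A reinterpret between integer and floating-point lanes of the same width is easy to confuse with a numeric conversion. The sketch below contrasts the two on a scalar model. The function names are hypothetical, float is assumed to be a 32-bit IEEE 754 type, and the value-converting function is shown only for contrast (in the intrinsics set that role belongs to vcvt_f32_s32, not to a reinterpret).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Per-lane model of a bit-preserving reinterpret from int32 to float.
 * Assumption: float is 32-bit IEEE 754 binary32. */
void reinterpret_f32_s32_model(const int32_t in[2], float out[2])
{
    assert(sizeof(float) == sizeof(int32_t));
    memcpy(out, in, 2 * sizeof(int32_t));   /* copy bits, convert nothing */
}

/* Value-converting counterpart, for contrast only. */
void convert_f32_s32_model(const int32_t in[2], float out[2])
{
    for (int i = 0; i < 2; i++) {
        out[i] = (float)in[i];               /* numeric conversion */
    }
}
```

For the lane value 0x3F800000, the reinterpret model yields 1.0f because that is the binary32 encoding of 1.0, while the converting model yields 1065353216.0f, the integer's numeric value.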

uint8x8_t vreinterpret_u8_s32 (int32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.8B → result

Supported architectures

v7/A32/A64

uint16x4_t vreinterpret_u16_s32 (int32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.4H → result

Supported architectures

v7/A32/A64

uint32x2_t vreinterpret_u32_s32 (int32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.2S → result

Supported architectures

v7/A32/A64

poly8x8_t vreinterpret_p8_s32 (int32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.8B → result

Supported architectures

v7/A32/A64

poly16x4_t vreinterpret_p16_s32 (int32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.4H → result

Supported architectures

v7/A32/A64

uint64x1_t vreinterpret_u64_s32 (int32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.1D → result

Supported architectures

v7/A32/A64

int64x1_t vreinterpret_s64_s32 (int32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.1D → result

Supported architectures

v7/A32/A64

float64x1_t vreinterpret_f64_s32 (int32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.1D → result

Supported architectures

A64

poly64x1_t vreinterpret_p64_s32 (int32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.1D → result

Supported architectures

A32/A64

float16x4_t vreinterpret_f16_s32 (int32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.4H → result

Supported architectures

v7/A32/A64

int8x8_t vreinterpret_s8_f32 (float32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.8B → result

Supported architectures

v7/A32/A64

int16x4_t vreinterpret_s16_f32 (float32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.4H → result

Supported architectures

v7/A32/A64

int32x2_t vreinterpret_s32_f32 (float32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.2S → result

Supported architectures

v7/A32/A64

uint8x8_t vreinterpret_u8_f32 (float32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.8B → result

Supported architectures

v7/A32/A64

uint16x4_t vreinterpret_u16_f32 (float32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.4H → result

Supported architectures

v7/A32/A64

uint32x2_t vreinterpret_u32_f32 (float32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.2S → result

Supported architectures

v7/A32/A64

poly8x8_t vreinterpret_p8_f32 (float32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.8B → result

Supported architectures

v7/A32/A64

poly16x4_t vreinterpret_p16_f32 (float32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.4H → result

Supported architectures

v7/A32/A64

uint64x1_t vreinterpret_u64_f32 (float32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.1D → result

Supported architectures

v7/A32/A64

int64x1_t vreinterpret_s64_f32 (float32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.1D → result

Supported architectures

v7/A32/A64

float64x1_t vreinterpret_f64_f32 (float32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.1D → result

Supported architectures

A64

poly64x1_t vreinterpret_p64_f32 (float32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.1D → result

Supported architectures

A32/A64

poly64x1_t vreinterpret_p64_f64 (float64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.1D → result

Supported architectures

A64

float16x4_t vreinterpret_f16_f32 (float32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.4H → result

Supported architectures

v7/A32/A64

int8x8_t vreinterpret_s8_u8 (uint8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.8B → result

Supported architectures

v7/A32/A64

int16x4_t vreinterpret_s16_u8 (uint8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.4H → result

Supported architectures

v7/A32/A64

int32x2_t vreinterpret_s32_u8 (uint8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.2S → result

Supported architectures

v7/A32/A64

float32x2_t vreinterpret_f32_u8 (uint8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.2S → result

Supported architectures

v7/A32/A64

uint16x4_t vreinterpret_u16_u8 (uint8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.4H → result

Supported architectures

v7/A32/A64

uint32x2_t vreinterpret_u32_u8 (uint8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.2S → result

Supported architectures

v7/A32/A64

poly8x8_t vreinterpret_p8_u8 (uint8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.8B → result

Supported architectures

v7/A32/A64

poly16x4_t vreinterpret_p16_u8 (uint8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.4H → result

Supported architectures

v7/A32/A64

uint64x1_t vreinterpret_u64_u8 (uint8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.1D → result

Supported architectures

v7/A32/A64

int64x1_t vreinterpret_s64_u8 (uint8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.1D → result

Supported architectures

v7/A32/A64

float64x1_t vreinterpret_f64_u8 (uint8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.1D → result

Supported architectures

A64

poly64x1_t vreinterpret_p64_u8 (uint8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.1D → result

Supported architectures

A32/A64

float16x4_t vreinterpret_f16_u8 (uint8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.4H → result

Supported architectures

v7/A32/A64

int8x8_t vreinterpret_s8_u16 (uint16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.8B → result

Supported architectures

v7/A32/A64

int16x4_t vreinterpret_s16_u16 (uint16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.4H → result

Supported architectures

v7/A32/A64

int32x2_t vreinterpret_s32_u16 (uint16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.2S → result

Supported architectures

v7/A32/A64

float32x2_t vreinterpret_f32_u16 (uint16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.2S → result

Supported architectures

v7/A32/A64

uint8x8_t vreinterpret_u8_u16 (uint16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.8B → result

Supported architectures

v7/A32/A64

uint32x2_t vreinterpret_u32_u16 (uint16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.2S → result

Supported architectures

v7/A32/A64

poly8x8_t vreinterpret_p8_u16 (uint16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.8B → result

Supported architectures

v7/A32/A64

poly16x4_t vreinterpret_p16_u16 (uint16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.4H → result

Supported architectures

v7/A32/A64

uint64x1_t vreinterpret_u64_u16 (uint16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.1D → result

Supported architectures

v7/A32/A64

int64x1_t vreinterpret_s64_u16 (uint16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.1D → result

Supported architectures

v7/A32/A64

float64x1_t vreinterpret_f64_u16 (uint16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.1D → result

Supported architectures

A64

poly64x1_t vreinterpret_p64_u16 (uint16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.1D → result

Supported architectures

A32/A64

float16x4_t vreinterpret_f16_u16 (uint16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.4H → result

Supported architectures

v7/A32/A64

int8x8_t vreinterpret_s8_u32 (uint32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.8B → result

Supported architectures

v7/A32/A64

int16x4_t vreinterpret_s16_u32 (uint32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.4H → result

Supported architectures

v7/A32/A64

int32x2_t vreinterpret_s32_u32 (uint32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.2S → result

Supported architectures

v7/A32/A64

float32x2_t vreinterpret_f32_u32 (uint32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.2S → result

Supported architectures

v7/A32/A64

uint8x8_t vreinterpret_u8_u32 (uint32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.8B → result

Supported architectures

v7/A32/A64

uint16x4_t vreinterpret_u16_u32 (uint32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.4H → result

Supported architectures

v7/A32/A64

poly8x8_t vreinterpret_p8_u32 (uint32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.8B → result

Supported architectures

v7/A32/A64

poly16x4_t vreinterpret_p16_u32 (uint32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.4H → result

Supported architectures

v7/A32/A64

uint64x1_t vreinterpret_u64_u32 (uint32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.1D → result

Supported architectures

v7/A32/A64

int64x1_t vreinterpret_s64_u32 (uint32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.1D → result

Supported architectures

v7/A32/A64

float64x1_t vreinterpret_f64_u32 (uint32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.1D → result

Supported architectures

A64

poly64x1_t vreinterpret_p64_u32 (uint32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.1D → result

Supported architectures

A32/A64

float16x4_t vreinterpret_f16_u32 (uint32x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2S

Results

Vd.4H → result

Supported architectures

v7/A32/A64

int8x8_t vreinterpret_s8_p8 (poly8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.8B → result

Supported architectures

v7/A32/A64

int16x4_t vreinterpret_s16_p8 (poly8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.4H → result

Supported architectures

v7/A32/A64

int32x2_t vreinterpret_s32_p8 (poly8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.2S → result

Supported architectures

v7/A32/A64

float32x2_t vreinterpret_f32_p8 (poly8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.2S → result

Supported architectures

v7/A32/A64

uint8x8_t vreinterpret_u8_p8 (poly8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.8B → result

Supported architectures

v7/A32/A64

uint16x4_t vreinterpret_u16_p8 (poly8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.4H → result

Supported architectures

v7/A32/A64

uint32x2_t vreinterpret_u32_p8 (poly8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.2S → result

Supported architectures

v7/A32/A64

poly16x4_t vreinterpret_p16_p8 (poly8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.4H → result

Supported architectures

v7/A32/A64

uint64x1_t vreinterpret_u64_p8 (poly8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.1D → result

Supported architectures

v7/A32/A64

int64x1_t vreinterpret_s64_p8 (poly8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.1D → result

Supported architectures

v7/A32/A64

float64x1_t vreinterpret_f64_p8 (poly8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.1D → result

Supported architectures

A64

poly64x1_t vreinterpret_p64_p8 (poly8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.1D → result

Supported architectures

A32/A64

float16x4_t vreinterpret_f16_p8 (poly8x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8B

Results

Vd.4H → result

Supported architectures

v7/A32/A64

int8x8_t vreinterpret_s8_p16 (poly16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.8B → result

Supported architectures

v7/A32/A64

int16x4_t vreinterpret_s16_p16 (poly16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.4H → result

Supported architectures

v7/A32/A64

int32x2_t vreinterpret_s32_p16 (poly16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.2S → result

Supported architectures

v7/A32/A64

float32x2_t vreinterpret_f32_p16 (poly16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.2S → result

Supported architectures

v7/A32/A64

uint8x8_t vreinterpret_u8_p16 (poly16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.8B → result

Supported architectures

v7/A32/A64

uint16x4_t vreinterpret_u16_p16 (poly16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.4H → result

Supported architectures

v7/A32/A64

uint32x2_t vreinterpret_u32_p16 (poly16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.2S → result

Supported architectures

v7/A32/A64

poly8x8_t vreinterpret_p8_p16 (poly16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.8B → result

Supported architectures

v7/A32/A64

uint64x1_t vreinterpret_u64_p16 (poly16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.1D → result

Supported architectures

v7/A32/A64

int64x1_t vreinterpret_s64_p16 (poly16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.1D → result

Supported architectures

v7/A32/A64

float64x1_t vreinterpret_f64_p16 (poly16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.1D → result

Supported architectures

A64

poly64x1_t vreinterpret_p64_p16 (poly16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.1D → result

Supported architectures

A32/A64

float16x4_t vreinterpret_f16_p16 (poly16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.4H → result

Supported architectures

v7/A32/A64

int8x8_t vreinterpret_s8_u64 (uint64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.8B → result

Supported architectures

v7/A32/A64

int16x4_t vreinterpret_s16_u64 (uint64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.4H → result

Supported architectures

v7/A32/A64

int32x2_t vreinterpret_s32_u64 (uint64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.2S → result

Supported architectures

v7/A32/A64

float32x2_t vreinterpret_f32_u64 (uint64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.2S → result

Supported architectures

v7/A32/A64

uint8x8_t vreinterpret_u8_u64 (uint64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.8B → result

Supported architectures

v7/A32/A64
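
Example (illustrative, not part of the reference): a minimal sketch of how the single 64-bit lane maps onto eight byte lanes after vreinterpret_u8_u64. It assumes a little-endian target; the helper name u64_to_bytes is hypothetical, and the memcpy branch is an assumed stand-in for builds without NEON.

```c
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: split one 64-bit value into its eight byte lanes.
 * Assumes a little-endian target, where lane 0 is the least significant byte. */
void u64_to_bytes(uint64_t value, uint8_t out[8])
{
#if defined(__ARM_NEON)
    uint64x1_t v = vdup_n_u64(value);      /* one 64-bit lane */
    uint8x8_t b = vreinterpret_u8_u64(v);  /* same 64 bits, typed as 8 x uint8 */
    vst1_u8(out, b);                       /* lane 0 goes to out[0] */
#else
    /* Assumed fallback for non-NEON builds: copy the bytes as stored. */
    memcpy(out, &value, sizeof value);
#endif
}
```

On a little-endian target, passing 0x0807060504030201 fills out with 0x01 through 0x08 in order. On a big-endian target the lane-to-memory mapping differs, so this sketch should not be relied on there.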

uint16x4_t vreinterpret_u16_u64 (uint64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.4H → result

Supported architectures

v7/A32/A64

uint32x2_t vreinterpret_u32_u64 (uint64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.2S → result

Supported architectures

v7/A32/A64

poly8x8_t vreinterpret_p8_u64 (uint64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.8B → result

Supported architectures

v7/A32/A64

poly16x4_t vreinterpret_p16_u64 (uint64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.4H → result

Supported architectures

v7/A32/A64

int64x1_t vreinterpret_s64_u64 (uint64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.1D → result

Supported architectures

v7/A32/A64

float64x1_t vreinterpret_f64_u64 (uint64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.1D → result

Supported architectures

A64

poly64x1_t vreinterpret_p64_u64 (uint64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.1D → result

Supported architectures

A32/A64

float16x4_t vreinterpret_f16_u64 (uint64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.4H → result

Supported architectures

v7/A32/A64

int8x8_t vreinterpret_s8_s64 (int64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.8B → result

Supported architectures

v7/A32/A64

int16x4_t vreinterpret_s16_s64 (int64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.4H → result

Supported architectures

v7/A32/A64

int32x2_t vreinterpret_s32_s64 (int64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.2S → result

Supported architectures

v7/A32/A64

float32x2_t vreinterpret_f32_s64 (int64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.2S → result

Supported architectures

v7/A32/A64

uint8x8_t vreinterpret_u8_s64 (int64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.8B → result

Supported architectures

v7/A32/A64

uint16x4_t vreinterpret_u16_s64 (int64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.4H → result

Supported architectures

v7/A32/A64

uint32x2_t vreinterpret_u32_s64 (int64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.2S → result

Supported architectures

v7/A32/A64

poly8x8_t vreinterpret_p8_s64 (int64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.8B → result

Supported architectures

v7/A32/A64

poly16x4_t vreinterpret_p16_s64 (int64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.4H → result

Supported architectures

v7/A32/A64

uint64x1_t vreinterpret_u64_s64 (int64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.1D → result

Supported architectures

v7/A32/A64

float64x1_t vreinterpret_f64_s64 (int64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.1D → result

Supported architectures

A64

uint64x1_t vreinterpret_u64_p64 (poly64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.1D → result

Supported architectures

A32/A64

float16x4_t vreinterpret_f16_s64 (int64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.4H → result

Supported architectures

v7/A32/A64

int8x8_t vreinterpret_s8_f16 (float16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.8B → result

Supported architectures

v7/A32/A64

int16x4_t vreinterpret_s16_f16 (float16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.4H → result

Supported architectures

v7/A32/A64

int32x2_t vreinterpret_s32_f16 (float16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.2S → result

Supported architectures

v7/A32/A64

float32x2_t vreinterpret_f32_f16 (float16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.2S → result

Supported architectures

v7/A32/A64

uint8x8_t vreinterpret_u8_f16 (float16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.8B → result

Supported architectures

v7/A32/A64

uint16x4_t vreinterpret_u16_f16 (float16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.4H → result

Supported architectures

v7/A32/A64

uint32x2_t vreinterpret_u32_f16 (float16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.2S → result

Supported architectures

v7/A32/A64

poly8x8_t vreinterpret_p8_f16 (float16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.8B → result

Supported architectures

v7/A32/A64

poly16x4_t vreinterpret_p16_f16 (float16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.4H → result

Supported architectures

v7/A32/A64

uint64x1_t vreinterpret_u64_f16 (float16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.1D → result

Supported architectures

v7/A32/A64

int64x1_t vreinterpret_s64_f16 (float16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.1D → result

Supported architectures

v7/A32/A64

float64x1_t vreinterpret_f64_f16 (float16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.1D → result

Supported architectures

A64

poly64x1_t vreinterpret_p64_f16 (float16x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4H

Results

Vd.1D → result

Supported architectures

A32/A64

int16x8_t vreinterpretq_s16_s8 (int8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.8H → result

Supported architectures

v7/A32/A64

int32x4_t vreinterpretq_s32_s8 (int8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.4S → result

Supported architectures

v7/A32/A64

float32x4_t vreinterpretq_f32_s8 (int8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.4S → result

Supported architectures

v7/A32/A64

uint8x16_t vreinterpretq_u8_s8 (int8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.16B → result

Supported architectures

v7/A32/A64

uint16x8_t vreinterpretq_u16_s8 (int8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.8H → result

Supported architectures

v7/A32/A64

uint32x4_t vreinterpretq_u32_s8 (int8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.4S → result

Supported architectures

v7/A32/A64

poly8x16_t vreinterpretq_p8_s8 (int8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.16B → result

Supported architectures

v7/A32/A64

poly16x8_t vreinterpretq_p16_s8 (int8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.8H → result

Supported architectures

v7/A32/A64

uint64x2_t vreinterpretq_u64_s8 (int8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.2D → result

Supported architectures

v7/A32/A64

int64x2_t vreinterpretq_s64_s8 (int8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.2D → result

Supported architectures

v7/A32/A64

float64x2_t vreinterpretq_f64_s8 (int8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.2D → result

Supported architectures

A64

poly64x2_t vreinterpretq_p64_s8 (int8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.2D → result

Supported architectures

A32/A64

poly128_t vreinterpretq_p128_s8 (int8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.1Q → result

Supported architectures

A32/A64

float16x8_t vreinterpretq_f16_s8 (int8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.8H → result

Supported architectures

v7/A32/A64

int8x16_t vreinterpretq_s8_s16 (int16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.16B → result

Supported architectures

v7/A32/A64

int32x4_t vreinterpretq_s32_s16 (int16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.4S → result

Supported architectures

v7/A32/A64

float32x4_t vreinterpretq_f32_s16 (int16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.4S → result

Supported architectures

v7/A32/A64

uint8x16_t vreinterpretq_u8_s16 (int16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.16B → result

Supported architectures

v7/A32/A64

uint16x8_t vreinterpretq_u16_s16 (int16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.8H → result

Supported architectures

v7/A32/A64

uint32x4_t vreinterpretq_u32_s16 (int16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.4S → result

Supported architectures

v7/A32/A64

poly8x16_t vreinterpretq_p8_s16 (int16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.16B → result

Supported architectures

v7/A32/A64

poly16x8_t vreinterpretq_p16_s16 (int16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.8H → result

Supported architectures

v7/A32/A64

uint64x2_t vreinterpretq_u64_s16 (int16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.2D → result

Supported architectures

v7/A32/A64

int64x2_t vreinterpretq_s64_s16 (int16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.2D → result

Supported architectures

v7/A32/A64

float64x2_t vreinterpretq_f64_s16 (int16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.2D → result

Supported architectures

A64

poly64x2_t vreinterpretq_p64_s16 (int16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.2D → result

Supported architectures

A32/A64

poly128_t vreinterpretq_p128_s16 (int16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.1Q → result

Supported architectures

A32/A64

float16x8_t vreinterpretq_f16_s16 (int16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.8H → result

Supported architectures

v7/A32/A64

int8x16_t vreinterpretq_s8_s32 (int32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.16B → result

Supported architectures

v7/A32/A64

int16x8_t vreinterpretq_s16_s32 (int32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.8H → result

Supported architectures

v7/A32/A64

float32x4_t vreinterpretq_f32_s32 (int32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.4S → result

Supported architectures

v7/A32/A64

uint8x16_t vreinterpretq_u8_s32 (int32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.16B → result

Supported architectures

v7/A32/A64

uint16x8_t vreinterpretq_u16_s32 (int32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.8H → result

Supported architectures

v7/A32/A64

uint32x4_t vreinterpretq_u32_s32 (int32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.4S → result

Supported architectures

v7/A32/A64

poly8x16_t vreinterpretq_p8_s32 (int32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.16B → result

Supported architectures

v7/A32/A64

poly16x8_t vreinterpretq_p16_s32 (int32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.8H → result

Supported architectures

v7/A32/A64

uint64x2_t vreinterpretq_u64_s32 (int32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.2D → result

Supported architectures

v7/A32/A64

int64x2_t vreinterpretq_s64_s32 (int32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.2D → result

Supported architectures

v7/A32/A64

float64x2_t vreinterpretq_f64_s32 (int32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.2D → result

Supported architectures

A64

poly64x2_t vreinterpretq_p64_s32 (int32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.2D → result

Supported architectures

A32/A64

poly128_t vreinterpretq_p128_s32 (int32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.1Q → result

Supported architectures

A32/A64

float16x8_t vreinterpretq_f16_s32 (int32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.8H → result

Supported architectures

v7/A32/A64

int8x16_t vreinterpretq_s8_f32 (float32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.16B → result

Supported architectures

v7/A32/A64

int16x8_t vreinterpretq_s16_f32 (float32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.8H → result

Supported architectures

v7/A32/A64

int32x4_t vreinterpretq_s32_f32 (float32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.4S → result

Supported architectures

v7/A32/A64

uint8x16_t vreinterpretq_u8_f32 (float32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.16B → result

Supported architectures

v7/A32/A64

uint16x8_t vreinterpretq_u16_f32 (float32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.8H → result

Supported architectures

v7/A32/A64

uint32x4_t vreinterpretq_u32_f32 (float32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.4S → result

Supported architectures

v7/A32/A64
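
Example (illustrative, not part of the reference): a common reason to use vreinterpretq_u32_f32 is to apply integer bitwise operations to floating-point lanes. The sketch below clears the sign bit of four floats and casts back with vreinterpretq_f32_u32. The helper name abs4 is hypothetical, the scalar branch is an assumed stand-in for builds without NEON, and vabsq_f32 would be the direct way to get the same result.

```c
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: absolute value of four floats by masking the sign bit. */
void abs4(const float in[4], float out[4])
{
#if defined(__ARM_NEON)
    float32x4_t f = vld1q_f32(in);
    uint32x4_t bits = vreinterpretq_u32_f32(f);         /* view lanes as integers */
    bits = vandq_u32(bits, vdupq_n_u32(0x7FFFFFFFu));   /* clear bit 31 of each lane */
    vst1q_f32(out, vreinterpretq_f32_u32(bits));        /* view as floats again */
#else
    /* Assumed fallback for non-NEON builds: the same masking, one lane at a time. */
    for (int i = 0; i < 4; ++i) {
        uint32_t b;
        memcpy(&b, &in[i], sizeof b);
        b &= 0x7FFFFFFFu;
        memcpy(&out[i], &b, sizeof b);
    }
#endif
}
```

Neither cast is expected to cost an instruction (the reference lists NOP for both); only the AND operates on the data.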

poly8x16_t vreinterpretq_p8_f32 (float32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.16B → result

Supported architectures

v7/A32/A64

poly16x8_t vreinterpretq_p16_f32 (float32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.8H → result

Supported architectures

v7/A32/A64

uint64x2_t vreinterpretq_u64_f32 (float32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.2D → result

Supported architectures

v7/A32/A64

int64x2_t vreinterpretq_s64_f32 (float32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.2D → result

Supported architectures

v7/A32/A64

float64x2_t vreinterpretq_f64_f32 (float32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.2D → result

Supported architectures

A64

poly64x2_t vreinterpretq_p64_f32 (float32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.2D → result

Supported architectures

A32/A64

poly128_t vreinterpretq_p128_f32 (float32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.1Q → result

Supported architectures

A32/A64

poly64x2_t vreinterpretq_p64_f64 (float64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.2D → result

Supported architectures

A64

poly128_t vreinterpretq_p128_f64 (float64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.1Q → result

Supported architectures

A64

float16x8_t vreinterpretq_f16_f32 (float32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.8H → result

Supported architectures

v7/A32/A64

int8x16_t vreinterpretq_s8_u8 (uint8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.16B → result

Supported architectures

v7/A32/A64
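
Example (illustrative, not part of the reference): a minimal sketch of a same-width cast, where vreinterpretq_s8_u8 keeps all 16 byte lanes in place and only changes how each one is read. The helper name bytes_as_signed is hypothetical, and the memcpy branch is an assumed stand-in for builds without NEON.

```c
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: read 16 unsigned bytes as signed two's-complement bytes. */
void bytes_as_signed(const uint8_t in[16], int8_t out[16])
{
#if defined(__ARM_NEON)
    uint8x16_t u = vld1q_u8(in);
    int8x16_t s = vreinterpretq_s8_u8(u);  /* same 128 bits, typed as 16 x int8 */
    vst1q_s8(out, s);
#else
    /* Assumed fallback for non-NEON builds: copy the bytes unchanged. */
    memcpy(out, in, 16);
#endif
}
```

A lane holding 0xFF reads back as -1 and a lane holding 0x80 reads back as -128, while values up to 0x7F are unchanged. No saturation or range check is applied, unlike a narrowing or saturating conversion.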

int16x8_t vreinterpretq_s16_u8 (uint8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.8H → result

Supported architectures

v7/A32/A64

int32x4_t vreinterpretq_s32_u8 (uint8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.4S → result

Supported architectures

v7/A32/A64

float32x4_t vreinterpretq_f32_u8 (uint8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.4S → result

Supported architectures

v7/A32/A64

uint16x8_t vreinterpretq_u16_u8 (uint8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.8H → result

Supported architectures

v7/A32/A64

uint32x4_t vreinterpretq_u32_u8 (uint8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.4S → result

Supported architectures

v7/A32/A64

poly8x16_t vreinterpretq_p8_u8 (uint8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.16B → result

Supported architectures

v7/A32/A64

poly16x8_t vreinterpretq_p16_u8 (uint8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.8H → result

Supported architectures

v7/A32/A64

uint64x2_t vreinterpretq_u64_u8 (uint8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.2D → result

Supported architectures

v7/A32/A64

int64x2_t vreinterpretq_s64_u8 (uint8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.2D → result

Supported architectures

v7/A32/A64

float64x2_t vreinterpretq_f64_u8 (uint8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.2D → result

Supported architectures

A64

poly64x2_t vreinterpretq_p64_u8 (uint8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.2D → result

Supported architectures

A32/A64

poly128_t vreinterpretq_p128_u8 (uint8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.1Q → result

Supported architectures

A32/A64

float16x8_t vreinterpretq_f16_u8 (uint8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.8H → result

Supported architectures

v7/A32/A64

int8x16_t vreinterpretq_s8_u16 (uint16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.16B → result

Supported architectures

v7/A32/A64

int16x8_t vreinterpretq_s16_u16 (uint16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.8H → result

Supported architectures

v7/A32/A64

int32x4_t vreinterpretq_s32_u16 (uint16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.4S → result

Supported architectures

v7/A32/A64

float32x4_t vreinterpretq_f32_u16 (uint16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.4S → result

Supported architectures

v7/A32/A64

uint8x16_t vreinterpretq_u8_u16 (uint16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.16B → result

Supported architectures

v7/A32/A64

uint32x4_t vreinterpretq_u32_u16 (uint16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.4S → result

Supported architectures

v7/A32/A64

poly8x16_t vreinterpretq_p8_u16 (uint16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.16B → result

Supported architectures

v7/A32/A64

poly16x8_t vreinterpretq_p16_u16 (uint16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.8H → result

Supported architectures

v7/A32/A64

uint64x2_t vreinterpretq_u64_u16 (uint16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.2D → result

Supported architectures

v7/A32/A64

int64x2_t vreinterpretq_s64_u16 (uint16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.2D → result

Supported architectures

v7/A32/A64

float64x2_t vreinterpretq_f64_u16 (uint16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.2D → result

Supported architectures

A64

poly64x2_t vreinterpretq_p64_u16 (uint16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.2D → result

Supported architectures

A32/A64

poly128_t vreinterpretq_p128_u16 (uint16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.1Q → result

Supported architectures

A32/A64

float16x8_t vreinterpretq_f16_u16 (uint16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.8H → result

Supported architectures

v7/A32/A64

int8x16_t vreinterpretq_s8_u32 (uint32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.16B → result

Supported architectures

v7/A32/A64

int16x8_t vreinterpretq_s16_u32 (uint32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.8H → result

Supported architectures

v7/A32/A64

int32x4_t vreinterpretq_s32_u32 (uint32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.4S → result

Supported architectures

v7/A32/A64

float32x4_t vreinterpretq_f32_u32 (uint32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.4S → result

Supported architectures

v7/A32/A64
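
Example

As an illustration outside the original reference, the sketch below uses vreinterpretq_f32_u32 to turn a raw 32-bit pattern into a float lane. It shows that the cast preserves bits and does not perform a numeric conversion. The helper name f32_from_bits is hypothetical, and the memcpy branch is an assumed fallback for builds without NEON.

```c
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

/* Hypothetical helper: broadcast a bit pattern to four uint32 lanes,
   relabel the lanes as float32 (no bits change), and return lane 0. */
static inline float f32_from_bits(uint32_t bits)
{
    uint32x4_t u = vdupq_n_u32(bits);
    float32x4_t f = vreinterpretq_f32_u32(u);
    return vgetq_lane_f32(f, 0);
}
#else
/* Stand-in for non-NEON builds (assumption): memcpy performs the same
   bit-preserving reinterpretation on a single value. */
static inline float f32_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
#endif
```

With IEEE 754 single precision, f32_from_bits(0x3F800000u) yields 1.0f, whereas a value conversion such as vcvtq_f32_u32 would yield 1065353216.0f for the same input.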

uint8x16_t vreinterpretq_u8_u32 (uint32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.16B → result

Supported architectures

v7/A32/A64

uint16x8_t vreinterpretq_u16_u32 (uint32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.8H → result

Supported architectures

v7/A32/A64

poly8x16_t vreinterpretq_p8_u32 (uint32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.16B → result

Supported architectures

v7/A32/A64

poly16x8_t vreinterpretq_p16_u32 (uint32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.8H → result

Supported architectures

v7/A32/A64

uint64x2_t vreinterpretq_u64_u32 (uint32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.2D → result

Supported architectures

v7/A32/A64

int64x2_t vreinterpretq_s64_u32 (uint32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.2D → result

Supported architectures

v7/A32/A64

float64x2_t vreinterpretq_f64_u32 (uint32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.2D → result

Supported architectures

A64

poly64x2_t vreinterpretq_p64_u32 (uint32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.2D → result

Supported architectures

A32/A64

poly128_t vreinterpretq_p128_u32 (uint32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.1Q → result

Supported architectures

A32/A64

float16x8_t vreinterpretq_f16_u32 (uint32x4_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.4S

Results

Vd.8H → result

Supported architectures

v7/A32/A64

int8x16_t vreinterpretq_s8_p8 (poly8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.16B → result

Supported architectures

v7/A32/A64

int16x8_t vreinterpretq_s16_p8 (poly8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.8H → result

Supported architectures

v7/A32/A64

int32x4_t vreinterpretq_s32_p8 (poly8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.4S → result

Supported architectures

v7/A32/A64

float32x4_t vreinterpretq_f32_p8 (poly8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.4S → result

Supported architectures

v7/A32/A64

uint8x16_t vreinterpretq_u8_p8 (poly8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.16B → result

Supported architectures

v7/A32/A64

uint16x8_t vreinterpretq_u16_p8 (poly8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.8H → result

Supported architectures

v7/A32/A64

uint32x4_t vreinterpretq_u32_p8 (poly8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.4S → result

Supported architectures

v7/A32/A64

poly16x8_t vreinterpretq_p16_p8 (poly8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.8H → result

Supported architectures

v7/A32/A64

uint64x2_t vreinterpretq_u64_p8 (poly8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.2D → result

Supported architectures

v7/A32/A64

int64x2_t vreinterpretq_s64_p8 (poly8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.2D → result

Supported architectures

v7/A32/A64

float64x2_t vreinterpretq_f64_p8 (poly8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.2D → result

Supported architectures

A64

poly64x2_t vreinterpretq_p64_p8 (poly8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.2D → result

Supported architectures

A32/A64

poly128_t vreinterpretq_p128_p8 (poly8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.1Q → result

Supported architectures

A32/A64

float16x8_t vreinterpretq_f16_p8 (poly8x16_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.16B

Results

Vd.8H → result

Supported architectures

v7/A32/A64

int8x16_t vreinterpretq_s8_p16 (poly16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.16B → result

Supported architectures

v7/A32/A64

int16x8_t vreinterpretq_s16_p16 (poly16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.8H → result

Supported architectures

v7/A32/A64

int32x4_t vreinterpretq_s32_p16 (poly16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.4S → result

Supported architectures

v7/A32/A64

float32x4_t vreinterpretq_f32_p16 (poly16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.4S → result

Supported architectures

v7/A32/A64

uint8x16_t vreinterpretq_u8_p16 (poly16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.16B → result

Supported architectures

v7/A32/A64

uint16x8_t vreinterpretq_u16_p16 (poly16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.8H → result

Supported architectures

v7/A32/A64

uint32x4_t vreinterpretq_u32_p16 (poly16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.4S → result

Supported architectures

v7/A32/A64

poly8x16_t vreinterpretq_p8_p16 (poly16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.16B → result

Supported architectures

v7/A32/A64

uint64x2_t vreinterpretq_u64_p16 (poly16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.2D → result

Supported architectures

v7/A32/A64

int64x2_t vreinterpretq_s64_p16 (poly16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.2D → result

Supported architectures

v7/A32/A64

float64x2_t vreinterpretq_f64_p16 (poly16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.2D → result

Supported architectures

A64

poly64x2_t vreinterpretq_p64_p16 (poly16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.2D → result

Supported architectures

A32/A64

poly128_t vreinterpretq_p128_p16 (poly16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.1Q → result

Supported architectures

A32/A64

float16x8_t vreinterpretq_f16_p16 (poly16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.8H → result

Supported architectures

v7/A32/A64

int8x16_t vreinterpretq_s8_u64 (uint64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.16B → result

Supported architectures

v7/A32/A64

int16x8_t vreinterpretq_s16_u64 (uint64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.8H → result

Supported architectures

v7/A32/A64

int32x4_t vreinterpretq_s32_u64 (uint64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.4S → result

Supported architectures

v7/A32/A64

float32x4_t vreinterpretq_f32_u64 (uint64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.4S → result

Supported architectures

v7/A32/A64

uint8x16_t vreinterpretq_u8_u64 (uint64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.16B → result

Supported architectures

v7/A32/A64

uint16x8_t vreinterpretq_u16_u64 (uint64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.8H → result

Supported architectures

v7/A32/A64

uint32x4_t vreinterpretq_u32_u64 (uint64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.4S → result

Supported architectures

v7/A32/A64

poly8x16_t vreinterpretq_p8_u64 (uint64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.16B → result

Supported architectures

v7/A32/A64

poly16x8_t vreinterpretq_p16_u64 (uint64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.8H → result

Supported architectures

v7/A32/A64

int64x2_t vreinterpretq_s64_u64 (uint64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.2D → result

Supported architectures

v7/A32/A64
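
Example

A brief sketch, added for illustration only, of a same-width reinterpret: vreinterpretq_s64_u64 keeps every bit and changes only whether the lanes are read as signed or unsigned. The helper name s64_from_u64_bits is hypothetical, and the memcpy branch is an assumed fallback for builds without NEON.

```c
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

/* Hypothetical helper: read an unsigned 64-bit lane as a signed one.
   The bits are unchanged; only the lane type differs. */
static inline int64_t s64_from_u64_bits(uint64_t bits)
{
    uint64x2_t u = vdupq_n_u64(bits);
    int64x2_t s = vreinterpretq_s64_u64(u);
    return vgetq_lane_s64(s, 0);
}
#else
/* Stand-in for non-NEON builds (assumption): bit-preserving copy. */
static inline int64_t s64_from_u64_bits(uint64_t bits)
{
    int64_t s;
    memcpy(&s, &bits, sizeof s);
    return s;
}
#endif
```

An all-ones input lane therefore reads back as -1 in the signed view.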

float64x2_t vreinterpretq_f64_u64 (uint64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.2D → result

Supported architectures

A64

float64x2_t vreinterpretq_f64_s64 (int64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.2D → result

Supported architectures

A64

poly64x2_t vreinterpretq_p64_s64 (int64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.2D → result

Supported architectures

A32/A64

poly128_t vreinterpretq_p128_s64 (int64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.1Q → result

Supported architectures

A32/A64

poly64x2_t vreinterpretq_p64_u64 (uint64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.2D → result

Supported architectures

A32/A64

poly128_t vreinterpretq_p128_u64 (uint64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.1Q → result

Supported architectures

A32/A64

float16x8_t vreinterpretq_f16_u64 (uint64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.8H → result

Supported architectures

v7/A32/A64

int8x16_t vreinterpretq_s8_s64 (int64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.16B → result

Supported architectures

v7/A32/A64

int16x8_t vreinterpretq_s16_s64 (int64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.8H → result

Supported architectures

v7/A32/A64

int32x4_t vreinterpretq_s32_s64 (int64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.4S → result

Supported architectures

v7/A32/A64

float32x4_t vreinterpretq_f32_s64 (int64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.4S → result

Supported architectures

v7/A32/A64

uint8x16_t vreinterpretq_u8_s64 (int64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.16B → result

Supported architectures

v7/A32/A64

uint16x8_t vreinterpretq_u16_s64 (int64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.8H → result

Supported architectures

v7/A32/A64

uint32x4_t vreinterpretq_u32_s64 (int64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.4S → result

Supported architectures

v7/A32/A64

poly8x16_t vreinterpretq_p8_s64 (int64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.16B → result

Supported architectures

v7/A32/A64

poly16x8_t vreinterpretq_p16_s64 (int64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.8H → result

Supported architectures

v7/A32/A64

uint64x2_t vreinterpretq_u64_s64 (int64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.2D → result

Supported architectures

v7/A32/A64

float16x8_t vreinterpretq_f16_s64 (int64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.8H → result

Supported architectures

v7/A32/A64

int8x16_t vreinterpretq_s8_f16 (float16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.16B → result

Supported architectures

v7/A32/A64

int16x8_t vreinterpretq_s16_f16 (float16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.8H → result

Supported architectures

v7/A32/A64

int32x4_t vreinterpretq_s32_f16 (float16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.4S → result

Supported architectures

v7/A32/A64

float32x4_t vreinterpretq_f32_f16 (float16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.4S → result

Supported architectures

v7/A32/A64

uint8x16_t vreinterpretq_u8_f16 (float16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.16B → result

Supported architectures

v7/A32/A64

uint16x8_t vreinterpretq_u16_f16 (float16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.8H → result

Supported architectures

v7/A32/A64

uint32x4_t vreinterpretq_u32_f16 (float16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.4S → result

Supported architectures

v7/A32/A64

poly8x16_t vreinterpretq_p8_f16 (float16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.16B → result

Supported architectures

v7/A32/A64

poly16x8_t vreinterpretq_p16_f16 (float16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.8H → result

Supported architectures

v7/A32/A64

uint64x2_t vreinterpretq_u64_f16 (float16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.2D → result

Supported architectures

v7/A32/A64

int64x2_t vreinterpretq_s64_f16 (float16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.2D → result

Supported architectures

v7/A32/A64

float64x2_t vreinterpretq_f64_f16 (float16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.2D → result

Supported architectures

A64

poly64x2_t vreinterpretq_p64_f16 (float16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.2D → result

Supported architectures

A32/A64

poly128_t vreinterpretq_p128_f16 (float16x8_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.8H

Results

Vd.1Q → result

Supported architectures

A32/A64

int8x8_t vreinterpret_s8_f64 (float64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.8B → result

Supported architectures

A64

int16x4_t vreinterpret_s16_f64 (float64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.4H → result

Supported architectures

A64

int32x2_t vreinterpret_s32_f64 (float64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.2S → result

Supported architectures

A64

uint8x8_t vreinterpret_u8_f64 (float64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.8B → result

Supported architectures

A64

uint16x4_t vreinterpret_u16_f64 (float64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.4H → result

Supported architectures

A64

uint32x2_t vreinterpret_u32_f64 (float64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.2S → result

Supported architectures

A64

poly8x8_t vreinterpret_p8_f64 (float64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.8B → result

Supported architectures

A64

poly16x4_t vreinterpret_p16_f64 (float64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.4H → result

Supported architectures

A64

uint64x1_t vreinterpret_u64_f64 (float64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.1D → result

Supported architectures

A64
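
Example

The sketch below is an illustration only and does not come from the original reference. It uses vreinterpret_u64_f64 to read the raw IEEE 754 encoding of a double. Because float64x1_t is listed as A64-only, the NEON branch is guarded by __aarch64__. The helper name bits_of_double is hypothetical and the memcpy branch is an assumed fallback for other targets.

```c
#include <stdint.h>
#include <string.h>

#if defined(__aarch64__)
#include <arm_neon.h>

/* Hypothetical helper: place a double in a one-lane vector, relabel the
   lane as uint64 (no bits change), and return it. */
static inline uint64_t bits_of_double(double x)
{
    float64x1_t d = vdup_n_f64(x);
    uint64x1_t u = vreinterpret_u64_f64(d);
    return vget_lane_u64(u, 0);
}
#else
/* Stand-in for non-A64 builds (assumption): bit-preserving copy. */
static inline uint64_t bits_of_double(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}
#endif
```

For example, 1.0 encodes as 0x3FF0000000000000, and -0.0 differs from 0.0 only in the top (sign) bit.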

int64x1_t vreinterpret_s64_f64 (float64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.1D → result

Supported architectures

A64

float16x4_t vreinterpret_f16_f64 (float64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.4H → result

Supported architectures

A64

float32x2_t vreinterpret_f32_f64 (float64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.2S → result

Supported architectures

A64

int8x16_t vreinterpretq_s8_f64 (float64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.16B → result

Supported architectures

A64

int16x8_t vreinterpretq_s16_f64 (float64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.8H → result

Supported architectures

A64

int32x4_t vreinterpretq_s32_f64 (float64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.4S → result

Supported architectures

A64

uint8x16_t vreinterpretq_u8_f64 (float64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.16B → result

Supported architectures

A64

uint16x8_t vreinterpretq_u16_f64 (float64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.8H → result

Supported architectures

A64

uint32x4_t vreinterpretq_u32_f64 (float64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.4S → result

Supported architectures

A64

poly8x16_t vreinterpretq_p8_f64 (float64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.16B → result

Supported architectures

A64

poly16x8_t vreinterpretq_p16_f64 (float64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.8H → result

Supported architectures

A64

uint64x2_t vreinterpretq_u64_f64 (float64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.2D → result

Supported architectures

A64

int64x2_t vreinterpretq_s64_f64 (float64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.2D → result

Supported architectures

A64

float16x8_t vreinterpretq_f16_f64 (float64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.8H → result

Supported architectures

A64

float32x4_t vreinterpretq_f32_f64 (float64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.4S → result

Supported architectures

A64

int8x8_t vreinterpret_s8_p64 (poly64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.8B → result

Supported architectures

A32/A64

int16x4_t vreinterpret_s16_p64 (poly64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.4H → result

Supported architectures

A32/A64

int32x2_t vreinterpret_s32_p64 (poly64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.2S → result

Supported architectures

A32/A64

uint8x8_t vreinterpret_u8_p64 (poly64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.8B → result

Supported architectures

A32/A64

uint16x4_t vreinterpret_u16_p64 (poly64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.4H → result

Supported architectures

A32/A64

uint32x2_t vreinterpret_u32_p64 (poly64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.2S → result

Supported architectures

A32/A64

poly8x8_t vreinterpret_p8_p64 (poly64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.8B → result

Supported architectures

A32/A64

poly16x4_t vreinterpret_p16_p64 (poly64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.4H → result

Supported architectures

A32/A64

uint64x1_t vreinterpret_u64_p64 (poly64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.1D → result

Supported architectures

A32/A64

int64x1_t vreinterpret_s64_p64 (poly64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.1D → result

Supported architectures

A32/A64

float64x1_t vreinterpret_f64_p64 (poly64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.1D → result

Supported architectures

A64

float16x4_t vreinterpret_f16_p64 (poly64x1_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1D

Results

Vd.4H → result

Supported architectures

A32/A64

int8x16_t vreinterpretq_s8_p64 (poly64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.16B → result

Supported architectures

A32/A64

int16x8_t vreinterpretq_s16_p64 (poly64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.8H → result

Supported architectures

A32/A64

int32x4_t vreinterpretq_s32_p64 (poly64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.4S → result

Supported architectures

A32/A64

uint8x16_t vreinterpretq_u8_p64 (poly64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.16B → result

Supported architectures

A32/A64

uint16x8_t vreinterpretq_u16_p64 (poly64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.8H → result

Supported architectures

A32/A64

uint32x4_t vreinterpretq_u32_p64 (poly64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.4S → result

Supported architectures

A32/A64

poly8x16_t vreinterpretq_p8_p64 (poly64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.16B → result

Supported architectures

A32/A64

poly16x8_t vreinterpretq_p16_p64 (poly64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.8H → result

Supported architectures

A32/A64

uint64x2_t vreinterpretq_u64_p64 (poly64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.2D → result

Supported architectures

A32/A64

int64x2_t vreinterpretq_s64_p64 (poly64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.2D → result

Supported architectures

A32/A64

float64x2_t vreinterpretq_f64_p64 (poly64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.2D → result

Supported architectures

A64

float16x8_t vreinterpretq_f16_p64 (poly64x2_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.2D

Results

Vd.8H → result

Supported architectures

A32/A64

int8x16_t vreinterpretq_s8_p128 (poly128_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1Q

Results

Vd.16B → result

Supported architectures

A32/A64

int16x8_t vreinterpretq_s16_p128 (poly128_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1Q

Results

Vd.8H → result

Supported architectures

A32/A64

int32x4_t vreinterpretq_s32_p128 (poly128_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1Q

Results

Vd.4S → result

Supported architectures

A32/A64

uint8x16_t vreinterpretq_u8_p128 (poly128_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1Q

Results

Vd.16B → result

Supported architectures

A32/A64

uint16x8_t vreinterpretq_u16_p128 (poly128_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1Q

Results

Vd.8H → result

Supported architectures

A32/A64

uint32x4_t vreinterpretq_u32_p128 (poly128_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1Q

Results

Vd.4S → result

Supported architectures

A32/A64

poly8x16_t vreinterpretq_p8_p128 (poly128_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1Q

Results

Vd.16B → result

Supported architectures

A32/A64

poly16x8_t vreinterpretq_p16_p128 (poly128_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1Q

Results

Vd.8H → result

Supported architectures

A32/A64

uint64x2_t vreinterpretq_u64_p128 (poly128_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1Q

Results

Vd.2D → result

Supported architectures

A32/A64

int64x2_t vreinterpretq_s64_p128 (poly128_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1Q

Results

Vd.2D → result

Supported architectures

A32/A64

float64x2_t vreinterpretq_f64_p128 (poly128_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1Q

Results

Vd.2D → result

Supported architectures

A64

float16x8_t vreinterpretq_f16_p128 (poly128_t a)Vector reinterpret cast operation

Description

A64 Instruction

NOP

Argument Preparation

a → Vd.1Q

Results

Vd.8H → result

Supported architectures

A32/A64

poly128_t vldrq_p128 (poly128_t const * ptr)Load SIMD&FP register

Description

Load SIMD&FP Register (register offset). This instruction loads a SIMD&FP register from memory. The address that is used for the load is calculated from a base register value and an offset register value. The offset can be optionally shifted and extended.

A64 Instruction

LDR Qd,[Xn]

Argument Preparation

ptr → Xn

Results

Qd → result

Operation

bits(64) offset = ExtendReg(m, extend_type, shift);
if HaveMTEExt() then
    boolean is_load_store = memop IN {MemOp_STORE, MemOp_LOAD};
    SetNotTagCheckedInstruction(is_load_store && n == 31);

CheckFPAdvSIMDEnabled64();
bits(64) address;
bits(datasize) data;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

case memop of
    when MemOp_STORE
        data = V[t];
        Mem[address, datasize DIV 8, AccType_VEC] = data;

    when MemOp_LOAD
        data = Mem[address, datasize DIV 8, AccType_VEC];
        V[t] = data;

Supported architectures

A32/A64

void vstrq_p128 (poly128_t * ptr, poly128_t val)Store SIMD&FP register

Description

Store SIMD&FP register (register offset). This instruction stores a single SIMD&FP register to memory. The address that is used for the store is calculated from a base register value and an offset register value. The offset can be optionally shifted and extended.

A64 Instruction

STR Qt,[Xn]

Argument Preparation

ptr → Xn 

val → Qt

Results

void → result

Operation

bits(64) offset = ExtendReg(m, extend_type, shift);
if HaveMTEExt() then
    boolean is_load_store = memop IN {MemOp_STORE, MemOp_LOAD};
    SetNotTagCheckedInstruction(is_load_store && n == 31);

CheckFPAdvSIMDEnabled64();
bits(64) address;
bits(datasize) data;

if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];

address = address + offset;

case memop of
    when MemOp_STORE
        data = V[t];
        Mem[address, datasize DIV 8, AccType_VEC] = data;

    when MemOp_LOAD
        data = Mem[address, datasize DIV 8, AccType_VEC];
        V[t] = data;

Supported architectures

A32/A64

uint8x16_t vaeseq_u8 (uint8x16_t data, uint8x16_t key)AES single round encryption

Description

AES single round encryption.

A64 Instruction

AESE Vd.16B,Vn.16B

Argument Preparation

data → Vd.16B 

key → Vn.16B

Results

Vd.16B → result

Operation

AArch64.CheckFPAdvSIMDEnabled();

bits(128) operand1 = V[d];
bits(128) operand2 = V[n];
bits(128) result;
result = operand1 EOR operand2;
result = AESSubBytes(AESShiftRows(result));

V[d] = result;

Supported architectures

A32/A64
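
The AESE operation above can be sketched in portable C as a software model. This is an illustration under stated assumptions, not the hardware implementation: byte `i` of the vector is taken as AES state row `i % 4`, column `i / 4`, the S-box is generated at run time instead of being tabulated, and all function names are hypothetical.

```c
#include <stdint.h>

static uint8_t rotl8(uint8_t x, unsigned s)
{
    return (uint8_t)((x << s) | (x >> (8 - s)));
}

/* Build the AES S-box: multiplicative inverse in GF(2^8), then affine map. */
static void aes_sbox_init(uint8_t sbox[256])
{
    uint8_t p = 1, q = 1;  /* invariant: p * q == 1 in GF(2^8) */
    do {
        p = (uint8_t)(p ^ (p << 1) ^ ((p & 0x80) ? 0x1B : 0));  /* p *= 3 */
        q ^= (uint8_t)(q << 1);                                  /* q /= 3 */
        q ^= (uint8_t)(q << 2);
        q ^= (uint8_t)(q << 4);
        q ^= (uint8_t)((q & 0x80) ? 0x09 : 0);
        sbox[p] = (uint8_t)(q ^ rotl8(q, 1) ^ rotl8(q, 2) ^
                            rotl8(q, 3) ^ rotl8(q, 4) ^ 0x63);
    } while (p != 1);
    sbox[0] = 0x63;  /* zero has no inverse and is handled separately */
}

/* Model of AESE: result = SubBytes(ShiftRows(data EOR key)). */
static void aese_model(const uint8_t data[16], const uint8_t key[16],
                       uint8_t out[16])
{
    uint8_t sbox[256], state[16];
    aes_sbox_init(sbox);
    for (int i = 0; i < 16; i++)
        state[i] = (uint8_t)(data[i] ^ key[i]);
    for (int col = 0; col < 4; col++)
        for (int row = 0; row < 4; row++)  /* row r rotates left by r columns */
            out[row + 4 * col] = sbox[state[row + 4 * ((col + row) % 4)]];
}
```

With all-zero `data` and `key`, every output byte is 0x63, the S-box image of zero. Note that the round key is added before the substitution, so the key passed to AESE is the one for the current round.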

uint8x16_t vaesdq_u8 (uint8x16_t data, uint8x16_t key)AES single round decryption

Description

AES single round decryption.

A64 Instruction

AESD Vd.16B,Vn.16B

Argument Preparation

data → Vd.16B 

key → Vn.16B

Results

Vd.16B → result

Operation

AArch64.CheckFPAdvSIMDEnabled();

bits(128) operand1 = V[d];
bits(128) operand2 = V[n];
bits(128) result;
result = operand1 EOR operand2;
result = AESInvSubBytes(AESInvShiftRows(result));
V[d] = result;

Supported architectures

A32/A64

uint8x16_t vaesmcq_u8 (uint8x16_t data)AES mix columns

Description

AES mix columns.

A64 Instruction

AESMC Vd.16B,Vn.16B

Argument Preparation

data → Vn.16B

Results

Vd.16B → result

Operation

AArch64.CheckFPAdvSIMDEnabled();

bits(128) operand = V[n];
bits(128) result;
result = AESMixColumns(operand);
V[d] = result;

Supported architectures

A32/A64
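
`AESMixColumns` is not expanded in the pseudocode above. A minimal software sketch, assuming the standard AES MixColumns matrix over GF(2^8) and a column-major byte layout (bytes `4c` to `4c+3` form column `c`), could look like this; the function names are hypothetical.

```c
#include <stdint.h>

/* Multiply by x (that is, by 2) in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1. */
static uint8_t xtime(uint8_t x)
{
    return (uint8_t)((x << 1) ^ ((x & 0x80) ? 0x1B : 0));
}

/* Model of AESMC: apply the matrix {2,3,1,1} to each 4-byte column. */
static void aesmc_model(const uint8_t in[16], uint8_t out[16])
{
    for (int c = 0; c < 4; c++) {
        uint8_t a0 = in[4 * c], a1 = in[4 * c + 1];
        uint8_t a2 = in[4 * c + 2], a3 = in[4 * c + 3];
        /* 3*a is computed as xtime(a) ^ a */
        out[4 * c + 0] = (uint8_t)(xtime(a0) ^ (xtime(a1) ^ a1) ^ a2 ^ a3);
        out[4 * c + 1] = (uint8_t)(a0 ^ xtime(a1) ^ (xtime(a2) ^ a2) ^ a3);
        out[4 * c + 2] = (uint8_t)(a0 ^ a1 ^ xtime(a2) ^ (xtime(a3) ^ a3));
        out[4 * c + 3] = (uint8_t)((xtime(a0) ^ a0) ^ a1 ^ a2 ^ xtime(a3));
    }
}
```

For example, the column d4 bf 5d 30 maps to 04 66 81 e5. In an encryption round this step normally follows AESE; the final round omits it.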

uint8x16_t vaesimcq_u8 (uint8x16_t data)AES inverse mix columns

Description

AES inverse mix columns.

A64 Instruction

AESIMC Vd.16B,Vn.16B

Argument Preparation

data → Vn.16B

Results

Vd.16B → result

Operation

AArch64.CheckFPAdvSIMDEnabled();

bits(128) operand = V[n];
bits(128) result;
result = AESInvMixColumns(operand);
V[d] = result;

Supported architectures

A32/A64
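
As a companion illustration, `AESInvMixColumns` can be modelled with a generic GF(2^8) multiply and the standard inverse matrix {14, 11, 13, 9}. This is a sketch under the same column-major layout assumption; `gf_mul` and `aesimc_model` are hypothetical names.

```c
#include <stdint.h>

/* Multiply two elements of GF(2^8) modulo x^8 + x^4 + x^3 + x + 1. */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
    uint8_t r = 0;
    while (b) {
        if (b & 1)
            r ^= a;
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1B : 0));
        b >>= 1;
    }
    return r;
}

/* Model of AESIMC: apply the inverse MixColumns matrix to each column. */
static void aesimc_model(const uint8_t in[16], uint8_t out[16])
{
    for (int c = 0; c < 4; c++) {
        uint8_t a0 = in[4 * c], a1 = in[4 * c + 1];
        uint8_t a2 = in[4 * c + 2], a3 = in[4 * c + 3];
        out[4 * c + 0] = (uint8_t)(gf_mul(a0, 14) ^ gf_mul(a1, 11) ^
                                   gf_mul(a2, 13) ^ gf_mul(a3, 9));
        out[4 * c + 1] = (uint8_t)(gf_mul(a0, 9) ^ gf_mul(a1, 14) ^
                                   gf_mul(a2, 11) ^ gf_mul(a3, 13));
        out[4 * c + 2] = (uint8_t)(gf_mul(a0, 13) ^ gf_mul(a1, 9) ^
                                   gf_mul(a2, 14) ^ gf_mul(a3, 11));
        out[4 * c + 3] = (uint8_t)(gf_mul(a0, 11) ^ gf_mul(a1, 13) ^
                                   gf_mul(a2, 9) ^ gf_mul(a3, 14));
    }
}
```

Applied to the column 04 66 81 e5, the model returns d4 bf 5d 30, undoing the forward MixColumns step.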

uint32x4_t vsha1cq_u32 (uint32x4_t hash_abcd, uint32_t hash_e, uint32x4_t wk)SHA1 hash update (choose)

Description

SHA1 hash update (choose).

A64 Instruction

SHA1C Qd,Sn,Vm.4S

Argument Preparation

hash_abcd → Qd 

hash_e → Sn 

wk → Vm.4S

Results

Qd → result

Operation

AArch64.CheckFPAdvSIMDEnabled();

bits(128) X = V[d];
bits(32) Y = V[n];    // Note: 32 not 128 bits wide
bits(128) W = V[m];
bits(32) t;

for e = 0 to 3
    t = SHAchoose(X<63:32>, X<95:64>, X<127:96>);
    Y = Y + ROL(X<31:0>, 5) + t + Elem[W, e, 32];
    X<63:32> = ROL(X<63:32>, 30);
    <Y, X> = ROL(Y:X, 32);
V[d] = X;

Supported architectures

A32/A64

uint32x4_t vsha1pq_u32 (uint32x4_t hash_abcd, uint32_t hash_e, uint32x4_t wk)SHA1 hash update (parity)

Description

SHA1 hash update (parity).

A64 Instruction

SHA1P Qd,Sn,Vm.4S

Argument Preparation

hash_abcd → Qd 

hash_e → Sn 

wk → Vm.4S

Results

Qd → result

Operation

AArch64.CheckFPAdvSIMDEnabled();

bits(128) X = V[d];
bits(32) Y = V[n];    // Note: 32 not 128 bits wide
bits(128) W = V[m];
bits(32) t;

for e = 0 to 3
    t = SHAparity(X<63:32>, X<95:64>, X<127:96>);
    Y = Y + ROL(X<31:0>, 5) + t + Elem[W, e, 32];
    X<63:32> = ROL(X<63:32>, 30);
    <Y, X> = ROL(Y:X, 32);
V[d] = X;

Supported architectures

A32/A64

uint32x4_t vsha1mq_u32 (uint32x4_t hash_abcd, uint32_t hash_e, uint32x4_t wk)SHA1 hash update (majority)

Description

SHA1 hash update (majority).

A64 Instruction

SHA1M Qd,Sn,Vm.4S

Argument Preparation

hash_abcd → Qd 

hash_e → Sn 

wk → Vm.4S

Results

Qd → result

Operation

AArch64.CheckFPAdvSIMDEnabled();

bits(128) X = V[d];
bits(32) Y = V[n];    // Note: 32 not 128 bits wide
bits(128) W = V[m];
bits(32) t;

for e = 0 to 3
    t = SHAmajority(X<63:32>, X<95:64>, X<127:96>);
    Y = Y + ROL(X<31:0>, 5) + t + Elem[W, e, 32];
    X<63:32> = ROL(X<63:32>, 30);
    <Y, X> = ROL(Y:X, 32);
V[d] = X;

Supported architectures

A32/A64
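
SHA1C, SHA1P and SHA1M share the same four-round loop and differ only in the function that produces `t`. The sketch below transcribes that loop into portable C with the round function passed as a parameter. The lane struct and all function names are hypothetical, and the choose, parity and majority helpers are written out from their usual SHA-1 definitions since the pseudocode above does not expand them.

```c
#include <stdint.h>

typedef struct { uint32_t lane[4]; } u32x4_model;  /* lane[0] = bits <31:0> */

static uint32_t rol32(uint32_t x, unsigned n)  /* n must be 1..31 */
{
    return (x << n) | (x >> (32 - n));
}

static uint32_t sha_choose(uint32_t x, uint32_t y, uint32_t z)
{
    return ((y ^ z) & x) ^ z;
}

static uint32_t sha_parity(uint32_t x, uint32_t y, uint32_t z)
{
    return x ^ y ^ z;
}

static uint32_t sha_majority(uint32_t x, uint32_t y, uint32_t z)
{
    return (x & y) | ((x | y) & z);
}

typedef uint32_t (*sha1_fn)(uint32_t, uint32_t, uint32_t);

/* Four SHA-1 rounds: x is hash_abcd, y is hash_e, w is wk. */
static u32x4_model sha1_rounds_model(u32x4_model x, uint32_t y,
                                     u32x4_model w, sha1_fn f)
{
    for (int e = 0; e < 4; e++) {
        uint32_t t = f(x.lane[1], x.lane[2], x.lane[3]);
        y = y + rol32(x.lane[0], 5) + t + w.lane[e];
        x.lane[1] = rol32(x.lane[1], 30);
        /* <Y, X> = ROL(Y:X, 32): every word moves up one position */
        uint32_t top = x.lane[3];
        x.lane[3] = x.lane[2];
        x.lane[2] = x.lane[1];
        x.lane[1] = x.lane[0];
        x.lane[0] = y;
        y = top;
    }
    return x;  /* the final y is discarded, as in V[d] = X */
}
```

Passing `sha_choose`, `sha_parity` or `sha_majority` models `vsha1cq_u32`, `vsha1pq_u32` or `vsha1mq_u32` respectively. The round constant is not added by the loop itself, so `wk` is expected to hold the schedule words with the constant already added.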

uint32_t vsha1h_u32 (uint32_t hash_e)SHA1 fixed rotate

Description

SHA1 fixed rotate.

A64 Instruction

SHA1H Sd,Sn

Argument Preparation

hash_e → Sn

Results

Sd → result

Operation

AArch64.CheckFPAdvSIMDEnabled();

bits(32) operand = V[n];    // read element [0] only, [1-3] zeroed
V[d] = ROL(operand, 30);

Supported architectures

A32/A64
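
The fixed rotate is a left rotation by 30 bits, equivalent to a right rotation by 2. A one-line portable model, with the hypothetical name `sha1h_model`:

```c
#include <stdint.h>

/* Model of SHA1H: ROL(operand, 30). */
static uint32_t sha1h_model(uint32_t hash_e)
{
    return (hash_e << 30) | (hash_e >> 2);
}
```

In SHA-1 code this is typically applied to lane 0 of the current `hash_abcd` to obtain the `hash_e` value for the next group of four rounds.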

uint32x4_t vsha1su0q_u32 (uint32x4_t w0_3, uint32x4_t w4_7, uint32x4_t w8_11)SHA1 schedule update 0

Description

SHA1 schedule update 0.

A64 Instruction

SHA1SU0 Vd.4S,Vn.4S,Vm.4S

Argument Preparation

w0_3 → Vd.4S 

w4_7 → Vn.4S 

w8_11 → Vm.4S

Results

Vd.4S → result

Operation

AArch64.CheckFPAdvSIMDEnabled();

bits(128) operand1 = V[d];
bits(128) operand2 = V[n];
bits(128) operand3 = V[m];
bits(128) result;

result = operand2<63:0>:operand1<127:64>;
result = result EOR operand1 EOR operand3;
V[d] = result;

Supported architectures

A32/A64

uint32x4_t vsha1su1q_u32 (uint32x4_t tw0_3, uint32x4_t w12_15)SHA1 schedule update 1

Description

SHA1 schedule update 1.

A64 Instruction

SHA1SU1 Vd.4S,Vn.4S

Argument Preparation

tw0_3 → Vd.4S 

w12_15 → Vn.4S

Results

Vd.4S → result

Operation

AArch64.CheckFPAdvSIMDEnabled();

bits(128) operand1 = V[d];
bits(128) operand2 = V[n];
bits(128) result;
bits(128) T = operand1 EOR LSR(operand2, 32);
result<31:0> = ROL(T<31:0>, 1);
result<63:32> = ROL(T<63:32>, 1);
result<95:64> = ROL(T<95:64>, 1);
result<127:96> = ROL(T<127:96>, 1) EOR ROL(T<31:0>, 2);
V[d] = result;

Supported architectures

A32/A64
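
Taken together, the two schedule-update operations compute four new SHA-1 message schedule words, W[t] = ROL(W[t-3] EOR W[t-8] EOR W[t-14] EOR W[t-16], 1) for t = 16..19. The sketch below transcribes both pseudocode blocks lane by lane; the struct and function names are hypothetical.

```c
#include <stdint.h>

typedef struct { uint32_t lane[4]; } u32x4_model;  /* lane[0] = bits <31:0> */

static uint32_t rol32(uint32_t x, unsigned n)  /* n must be 1..31 */
{
    return (x << n) | (x >> (32 - n));
}

/* Model of SHA1SU0: {w2,w3,w4,w5} EOR w0_3 EOR w8_11. */
static u32x4_model sha1su0_model(u32x4_model w0_3, u32x4_model w4_7,
                                 u32x4_model w8_11)
{
    u32x4_model r;
    r.lane[0] = w0_3.lane[2] ^ w0_3.lane[0] ^ w8_11.lane[0];
    r.lane[1] = w0_3.lane[3] ^ w0_3.lane[1] ^ w8_11.lane[1];
    r.lane[2] = w4_7.lane[0] ^ w0_3.lane[2] ^ w8_11.lane[2];
    r.lane[3] = w4_7.lane[1] ^ w0_3.lane[3] ^ w8_11.lane[3];
    return r;
}

/* Model of SHA1SU1: T = tw0_3 EOR LSR(w12_15, 32), then rotate. */
static u32x4_model sha1su1_model(u32x4_model tw0_3, u32x4_model w12_15)
{
    u32x4_model t, r;
    t.lane[0] = tw0_3.lane[0] ^ w12_15.lane[1];
    t.lane[1] = tw0_3.lane[1] ^ w12_15.lane[2];
    t.lane[2] = tw0_3.lane[2] ^ w12_15.lane[3];
    t.lane[3] = tw0_3.lane[3];  /* the 128-bit shift brings in zeros here */
    r.lane[0] = rol32(t.lane[0], 1);
    r.lane[1] = rol32(t.lane[1], 1);
    r.lane[2] = rol32(t.lane[2], 1);
    /* W[19] depends on W[16], which is ROL(T<31:0>, 1) */
    r.lane[3] = rol32(t.lane[3], 1) ^ rol32(t.lane[0], 2);
    return r;
}
```

With the intrinsics, the same composition would be written `vsha1su1q_u32(vsha1su0q_u32(w0_3, w4_7, w8_11), w12_15)`.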

uint32x4_t vsha256hq_u32 (uint32x4_t hash_abcd, uint32x4_t hash_efgh, uint32x4_t wk)SHA256 hash update (part 1)

Description

SHA256 hash update (part 1).

A64 Instruction

SHA256H Qd,Qn,Vm.4S

Argument Preparation

hash_abcd → Qd 

hash_efgh → Qn 

wk → Vm.4S

Results

Qd → result

Operation

AArch64.CheckFPAdvSIMDEnabled();

bits(128) result;
result = SHA256hash(V[d], V[n], V[m], TRUE);
V[d] = result;

Supported architectures

A32/A64

uint32x4_t vsha256h2q_u32 (uint32x4_t hash_efgh, uint32x4_t hash_abcd, uint32x4_t wk)SHA256 hash update (part 2)

Description

SHA256 hash update (part 2).

A64 Instruction

SHA256H2 Qd,Qn,Vm.4S

Argument Preparation

hash_efgh → Qd 

hash_abcd → Qn 

wk → Vm.4S

Results

Qd → result

Operation

AArch64.CheckFPAdvSIMDEnabled();

bits(128) result;
result = SHA256hash(V[n], V[d], V[m], FALSE);
V[d] = result;

Supported architectures

A32/A64

uint32x4_t vsha256su0q_u32 (uint32x4_t w0_3, uint32x4_t w4_7)SHA256 schedule update 0

Description

SHA256 schedule update 0.

A64 Instruction

SHA256SU0 Vd.4S,Vn.4S

Argument Preparation

w0_3 → Vd.4S 

w4_7 → Vn.4S

Results

Vd.4S → result

Operation

AArch64.CheckFPAdvSIMDEnabled();

bits(128) operand1 = V[d];
bits(128) operand2 = V[n];
bits(128) result;
bits(128) T = operand2<31:0>:operand1<127:32>;
bits(32) elt;

for e = 0 to 3
    elt = Elem[T, e, 32];
    elt = ROR(elt, 7) EOR ROR(elt, 18) EOR LSR(elt, 3);
    Elem[result, e, 32] = elt + Elem[operand1, e, 32];
V[d] = result;

Supported architectures

A32/A64

uint32x4_t vsha256su1q_u32 (uint32x4_t tw0_3, uint32x4_t w8_11, uint32x4_t w12_15)SHA256 schedule update 1

Description

SHA256 schedule update 1.

A64 Instruction

SHA256SU1 Vd.4S,Vn.4S,Vm.4S

Argument Preparation

tw0_3 → Vd.4S 

w8_11 → Vn.4S 

w12_15 → Vm.4S

Results

Vd.4S → result

Operation

AArch64.CheckFPAdvSIMDEnabled();

bits(128) operand1 = V[d];
bits(128) operand2 = V[n];
bits(128) operand3 = V[m];
bits(128) result;
bits(128) T0 = operand3<31:0>:operand2<127:32>;
bits(64) T1;
bits(32) elt;

T1 = operand3<127:64>;
for e = 0 to 1
    elt = Elem[T1, e, 32];
    elt = ROR(elt, 17) EOR ROR(elt, 19) EOR LSR(elt, 10);
    elt = elt + Elem[operand1, e, 32] + Elem[T0, e, 32];
    Elem[result, e, 32] = elt;

T1 = result<63:0>;
for e = 2 to 3
    elt = Elem[T1, e-2, 32];
    elt = ROR(elt, 17) EOR ROR(elt, 19) EOR LSR(elt, 10);
    elt = elt + Elem[operand1, e, 32] + Elem[T0, e, 32];
    Elem[result, e, 32] = elt;

V[d] = result;

Supported architectures

A32/A64
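
The two SHA256 schedule updates combine to produce W[16..19] from W[0..15], following W[t] = sigma1(W[t-2]) + W[t-7] + sigma0(W[t-15]) + W[t-16]. A minimal portable transcription of the two pseudocode blocks is sketched here; the struct and function names are hypothetical.

```c
#include <stdint.h>

typedef struct { uint32_t lane[4]; } u32x4_model;  /* lane[0] = bits <31:0> */

static uint32_t ror32(uint32_t x, unsigned n)  /* n must be 1..31 */
{
    return (x >> n) | (x << (32 - n));
}

static uint32_t sigma0(uint32_t x)
{
    return ror32(x, 7) ^ ror32(x, 18) ^ (x >> 3);
}

static uint32_t sigma1(uint32_t x)
{
    return ror32(x, 17) ^ ror32(x, 19) ^ (x >> 10);
}

/* Model of SHA256SU0: T = {w1,w2,w3,w4}; result[e] = sigma0(T[e]) + w[e]. */
static u32x4_model sha256su0_model(u32x4_model w0_3, u32x4_model w4_7)
{
    u32x4_model r;
    r.lane[0] = sigma0(w0_3.lane[1]) + w0_3.lane[0];
    r.lane[1] = sigma0(w0_3.lane[2]) + w0_3.lane[1];
    r.lane[2] = sigma0(w0_3.lane[3]) + w0_3.lane[2];
    r.lane[3] = sigma0(w4_7.lane[0]) + w0_3.lane[3];
    return r;
}

/* Model of SHA256SU1: T0 = {w9,w10,w11,w12}; T1 = {w14,w15}, then {r0,r1}. */
static u32x4_model sha256su1_model(u32x4_model tw0_3, u32x4_model w8_11,
                                   u32x4_model w12_15)
{
    u32x4_model r;
    r.lane[0] = sigma1(w12_15.lane[2]) + tw0_3.lane[0] + w8_11.lane[1];
    r.lane[1] = sigma1(w12_15.lane[3]) + tw0_3.lane[1] + w8_11.lane[2];
    /* the upper two lanes feed on the two results just produced */
    r.lane[2] = sigma1(r.lane[0]) + tw0_3.lane[2] + w8_11.lane[3];
    r.lane[3] = sigma1(r.lane[1]) + tw0_3.lane[3] + w12_15.lane[0];
    return r;
}
```

With the intrinsics, the same composition would be written `vsha256su1q_u32(vsha256su0q_u32(w0_3, w4_7), w8_11, w12_15)`.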

poly128_t vmull_p64 (poly64_t a, poly64_t b)Polynomial multiply long

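Description

Polynomial Multiply Long. This instruction multiplies corresponding elements in the lower or upper half of the vectors of the two source SIMD&FP registers, places the results in a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the elements that are multiplied. The PMULL instruction extracts each source vector from the lower half of each source register, while the PMULL2 instruction extracts each source vector from the upper half of each source register.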

A64 Instruction

PMULL Vd.1Q,Vn.1D,Vm.1D

Argument Preparation

a → Vn.1D 

b → Vm.1D

Results

Vd.1Q → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, 2*esize] = PolynomialMult(element1, element2);

V[d] = result;

Supported architectures

A32/A64

poly128_t vmull_high_p64 (poly64x2_t a, poly64x2_t b)Polynomial multiply long

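Description

Polynomial Multiply Long. This instruction multiplies corresponding elements in the lower or upper half of the vectors of the two source SIMD&FP registers, places the results in a vector, and writes the vector to the destination SIMD&FP register. The destination vector elements are twice as long as the elements that are multiplied. The PMULL instruction extracts each source vector from the lower half of each source register, while the PMULL2 instruction extracts each source vector from the upper half of each source register.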

A64 Instruction

PMULL2 Vd.1Q,Vn.2D,Vm.2D

Argument Preparation

a → Vn.2D 

b → Vm.2D

Results

Vd.1Q → result

Operation

CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(datasize) operand2 = Vpart[m, part];
bits(2*datasize) result;
bits(esize) element1;
bits(esize) element2;

for e = 0 to elements-1
    element1 = Elem[operand1, e, esize];
    element2 = Elem[operand2, e, esize];
    Elem[result, e, 2*esize] = PolynomialMult(element1, element2);

V[d] = result;

Supported architectures

A32/A64
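
`PolynomialMult` in the pseudocode above is a carry-less multiplication: partial products are combined with exclusive OR, not addition. A minimal portable sketch for the 64-bit case follows, assuming a hypothetical two-word struct `p128_model` in place of `poly128_t`.

```c
#include <stdint.h>

typedef struct { uint64_t lo, hi; } p128_model;  /* stand-in for poly128_t */

/* Model of vmull_p64: carry-less 64 x 64 -> 128-bit multiply. */
static p128_model pmull64_model(uint64_t a, uint64_t b)
{
    p128_model r = {0, 0};
    for (unsigned i = 0; i < 64; i++) {
        if ((b >> i) & 1u) {
            r.lo ^= a << i;              /* low part of a * x^i */
            if (i != 0)
                r.hi ^= a >> (64 - i);   /* bits shifted past bit 63 */
        }
    }
    return r;
}

/* Model of vmull_high_p64: the same product on the upper lanes. */
static p128_model pmull2_64_model(const uint64_t a[2], const uint64_t b[2])
{
    return pmull64_model(a[1], b[1]);
}
```

For example, `pmull64_model(3, 3)` yields 5, because (x + 1)(x + 1) = x^2 + 1 over {0, 1}, whereas integer multiplication would give 9.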

uint32_t __crc32b (uint32_t a, uint8_t b)CRC32 checksum

Description

CRC32 checksum performs a cyclic redundancy check (CRC) calculation on a value held in a general-purpose register. It takes an input CRC value in the first source operand, performs a CRC on the input value in the second source operand, and returns the output CRC value. The second source operand can be 8, 16, 32, or 64 bits. To align with common usage, the bit order of the values is reversed as part of the operation, and the polynomial 0x04C11DB7 is used for the CRC calculation.

A64 Instruction

CRC32B Wd,Wn,Wm

Argument Preparation

a → Wn 

b → Wm

Results

Wd → result

Operation

bits(32) acc = X[n];    // accumulator
bits(size) val = X[m];    // input value
bits(32) poly = 0x04C11DB7<31:0>;

bits(32+size) tempacc = BitReverse(acc):Zeros(size);
bits(size+32) tempval = BitReverse(val):Zeros(32);

// Poly32Mod2 on a bitstring does a polynomial Modulus over {0,1} operation
X[d] = BitReverse(Poly32Mod2(tempacc EOR tempval, poly));

Supported architectures

A32/A64
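
The operation above can be transcribed almost literally into portable C. The sketch below covers operand sizes of 8, 16 and 32 bits, so that the temporaries fit in a `uint64_t`; the function names are hypothetical, and `poly32_mod2` is written from the usual meaning of polynomial reduction modulo x^32 + poly, since the pseudocode does not expand `Poly32Mod2`.

```c
#include <stdint.h>

/* Reverse the low nbits bits of v (BitReverse in the pseudocode). */
static uint64_t bit_reverse(uint64_t v, unsigned nbits)
{
    uint64_t r = 0;
    for (unsigned i = 0; i < nbits; i++)
        r = (r << 1) | ((v >> i) & 1u);
    return r;
}

/* Reduce an nbits-wide polynomial over {0,1} modulo x^32 + poly. */
static uint32_t poly32_mod2(uint64_t data, unsigned nbits, uint32_t poly)
{
    for (unsigned i = nbits; i-- > 32; ) {
        if ((data >> i) & 1u)
            data ^= (1ull << i) | ((uint64_t)poly << (i - 32));
    }
    return (uint32_t)data;
}

/* Model of CRC32B/H/W (size = 8, 16 or 32) for a given polynomial. */
static uint32_t crc32_step_model(uint32_t acc, uint32_t val, unsigned size,
                                 uint32_t poly)
{
    uint64_t mask = (1ull << size) - 1;
    uint64_t tempacc = bit_reverse(acc, 32) << size;
    uint64_t tempval = bit_reverse(val & mask, size) << 32;
    uint32_t rem = poly32_mod2(tempacc ^ tempval, 32 + size, poly);
    return (uint32_t)bit_reverse(rem, 32);
}
```

The instruction applies no initial value and no final inversion. To reproduce the common CRC-32 check value 0xCBF43926 for the ASCII string "123456789", start from 0xFFFFFFFF, feed the bytes in order with `size = 8` and `poly = 0x04C11DB7`, and invert the result.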

uint32_t __crc32h (uint32_t a, uint16_t b)CRC32 checksum

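Description

CRC32 checksum performs a cyclic redundancy check (CRC) calculation on a value held in a general-purpose register. It takes an input CRC value in the first source operand, performs a CRC on the input value in the second source operand, and returns the output CRC value. The second source operand can be 8, 16, 32, or 64 bits. To align with common usage, the bit order of the values is reversed as part of the operation, and the polynomial 0x04C11DB7 is used for the CRC calculation.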

A64 Instruction

CRC32H Wd,Wn,Wm

Argument Preparation

a → Wn 

b → Wm

Results

Wd → result

Operation

bits(32) acc = X[n];    // accumulator
bits(size) val = X[m];    // input value
bits(32) poly = 0x04C11DB7<31:0>;

bits(32+size) tempacc = BitReverse(acc):Zeros(size);
bits(size+32) tempval = BitReverse(val):Zeros(32);

// Poly32Mod2 on a bitstring does a polynomial Modulus over {0,1} operation
X[d] = BitReverse(Poly32Mod2(tempacc EOR tempval, poly));

Supported architectures

A32/A64

uint32_t __crc32w (uint32_t a, uint32_t b)CRC32 checksum

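Description

CRC32 checksum performs a cyclic redundancy check (CRC) calculation on a value held in a general-purpose register. It takes an input CRC value in the first source operand, performs a CRC on the input value in the second source operand, and returns the output CRC value. The second source operand can be 8, 16, 32, or 64 bits. To align with common usage, the bit order of the values is reversed as part of the operation, and the polynomial 0x04C11DB7 is used for the CRC calculation.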

A64 Instruction

CRC32W Wd,Wn,Wm

Argument Preparation

a → Wn 

b → Wm

Results

Wd → result

Operation

bits(32) acc = X[n];    // accumulator
bits(size) val = X[m];    // input value
bits(32) poly = 0x04C11DB7<31:0>;

bits(32+size) tempacc = BitReverse(acc):Zeros(size);
bits(size+32) tempval = BitReverse(val):Zeros(32);

// Poly32Mod2 on a bitstring does a polynomial Modulus over {0,1} operation
X[d] = BitReverse(Poly32Mod2(tempacc EOR tempval, poly));

Supported architectures

A32/A64

uint32_t __crc32d (uint32_t a, uint64_t b)CRC32 checksum

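Description

CRC32 checksum performs a cyclic redundancy check (CRC) calculation on a value held in a general-purpose register. It takes an input CRC value in the first source operand, performs a CRC on the input value in the second source operand, and returns the output CRC value. The second source operand can be 8, 16, 32, or 64 bits. To align with common usage, the bit order of the values is reversed as part of the operation, and the polynomial 0x04C11DB7 is used for the CRC calculation.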

A64 Instruction

CRC32X Wd,Wn,Xm

Argument Preparation

a → Wn 

b → Xm

Results

Wd → result

Operation

bits(32) acc = X[n];    // accumulator
bits(size) val = X[m];    // input value
bits(32) poly = 0x04C11DB7<31:0>;

bits(32+size) tempacc = BitReverse(acc):Zeros(size);
bits(size+32) tempval = BitReverse(val):Zeros(32);

// Poly32Mod2 on a bitstring does a polynomial Modulus over {0,1} operation
X[d] = BitReverse(Poly32Mod2(tempacc EOR tempval, poly));

Supported architectures

A32/A64
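
Because the bit order is reversed on input and output, the four CRC32 widths behave like the familiar byte-at-a-time reflected CRC, with the wider forms consuming their operand least significant byte first. The sketch below shows that equivalence in portable C; 0xEDB88320 is 0x04C11DB7 with its bits reversed, and the `_model` function names are hypothetical.

```c
#include <stdint.h>

/* Feed the low nbytes bytes of val, least significant byte first. */
static uint32_t crc32_bytes_le(uint32_t acc, uint64_t val, unsigned nbytes)
{
    for (unsigned i = 0; i < nbytes; i++) {
        acc ^= (uint8_t)(val >> (8 * i));
        for (int bit = 0; bit < 8; bit++)
            acc = (acc >> 1) ^ (0xEDB88320u & (0u - (acc & 1u)));
    }
    return acc;
}

/* Software models of __crc32b, __crc32h, __crc32w and __crc32d. */
static uint32_t crc32b_model(uint32_t a, uint8_t b)
{
    return crc32_bytes_le(a, b, 1);
}

static uint32_t crc32h_model(uint32_t a, uint16_t b)
{
    return crc32_bytes_le(a, b, 2);
}

static uint32_t crc32w_model(uint32_t a, uint32_t b)
{
    return crc32_bytes_le(a, b, 4);
}

static uint32_t crc32d_model(uint32_t a, uint64_t b)
{
    return crc32_bytes_le(a, b, 8);
}
```

Checksumming "123456789" as one 64-bit step on the value 0x3837363534333231 followed by one byte step on 0x39, starting from 0xFFFFFFFF and inverting at the end, gives the usual check value 0xCBF43926.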

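uint32_t __crc32cb (uint32_t a, uint8_t b)CRC32C checksum

Description

CRC32C checksum performs a cyclic redundancy check (CRC) calculation on a value held in a general-purpose register. It takes an input CRC value in the first source operand, performs a CRC on the input value in the second source operand, and returns the output CRC value. The second source operand can be 8, 16, 32, or 64 bits. To align with common usage, the bit order of the values is reversed as part of the operation, and the polynomial 0x1EDC6F41 is used for the CRC calculation.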

A64 Instruction

CRC32CB Wd,Wn,Wm

Argument Preparation

a → Wn 

b → Wm

Results

Wd → result

Operation

bits(32) acc = X[n];    // accumulator
bits(size) val = X[m];    // input value
bits(32) poly = 0x1EDC6F41<31:0>;

bits(32+size) tempacc = BitReverse(acc):Zeros(size);
bits(size+32) tempval = BitReverse(val):Zeros(32);

// Poly32Mod2 on a bitstring does a polynomial Modulus over {0,1} operation
X[d] = BitReverse(Poly32Mod2(tempacc EOR tempval, poly));

Supported architectures

A32/A64

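uint32_t __crc32ch (uint32_t a, uint16_t b)CRC32C checksum

Description

CRC32C checksum performs a cyclic redundancy check (CRC) calculation on a value held in a general-purpose register. It takes an input CRC value in the first source operand, performs a CRC on the input value in the second source operand, and returns the output CRC value. The second source operand can be 8, 16, 32, or 64 bits. To align with common usage, the bit order of the values is reversed as part of the operation, and the polynomial 0x1EDC6F41 is used for the CRC calculation.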

A64 Instruction

CRC32CH Wd,Wn,Wm

Argument Preparation

a → Wn 

b → Wm

Results

Wd → result

Operation

bits(32) acc = X[n];    // accumulator
bits(size) val = X[m];    // input value
bits(32) poly = 0x1EDC6F41<31:0>;

bits(32+size) tempacc = BitReverse(acc):Zeros(size);
bits(size+32) tempval = BitReverse(val):Zeros(32);

// Poly32Mod2 on a bitstring does a polynomial Modulus over {0,1} operation
X[d] = BitReverse(Poly32Mod2(tempacc EOR tempval, poly));

Supported architectures

A32/A64

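uint32_t __crc32cw (uint32_t a, uint32_t b)CRC32C checksum

Description

CRC32C checksum performs a cyclic redundancy check (CRC) calculation on a value held in a general-purpose register. It takes an input CRC value in the first source operand, performs a CRC on the input value in the second source operand, and returns the output CRC value. The second source operand can be 8, 16, 32, or 64 bits. To align with common usage, the bit order of the values is reversed as part of the operation, and the polynomial 0x1EDC6F41 is used for the CRC calculation.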

A64 Instruction

CRC32CW Wd,Wn,Wm

Argument Preparation

a → Wn 

b → Wm

Results

Wd → result

Operation

bits(32) acc = X[n];    // accumulator
bits(size) val = X[m];    // input value
bits(32) poly = 0x1EDC6F41<31:0>;

bits(32+size) tempacc = BitReverse(acc):Zeros(size);
bits(size+32) tempval = BitReverse(val):Zeros(32);

// Poly32Mod2 on a bitstring does a polynomial Modulus over {0,1} operation
X[d] = BitReverse(Poly32Mod2(tempacc EOR tempval, poly));

Supported architectures

A32/A64

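uint32_t __crc32cd (uint32_t a, uint64_t b)CRC32C checksum

Description

CRC32C checksum performs a cyclic redundancy check (CRC) calculation on a value held in a general-purpose register. It takes an input CRC value in the first source operand, performs a CRC on the input value in the second source operand, and returns the output CRC value. The second source operand can be 8, 16, 32, or 64 bits. To align with common usage, the bit order of the values is reversed as part of the operation, and the polynomial 0x1EDC6F41 is used for the CRC calculation.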

A64 Instruction

CRC32CX Wd,Wn,Xm

Argument Preparation

a → Wn 

b → Xm

Results

Wd → result

Operation

bits(32) acc = X[n];    // accumulator
bits(size) val = X[m];    // input value
bits(32) poly = 0x1EDC6F41<31:0>;

bits(32+size) tempacc = BitReverse(acc):Zeros(size);
bits(size+32) tempval = BitReverse(val):Zeros(32);

// Poly32Mod2 on a bitstring does a polynomial Modulus over {0,1} operation
X[d] = BitReverse(Poly32Mod2(tempacc EOR tempval, poly));

Supported architectures

A32/A64
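
The variants that use polynomial 0x1EDC6F41 are usually combined into a buffer routine that consumes eight bytes at a time and finishes with single bytes. The sketch below models that pattern in portable C, assuming a hypothetical helper `crc32c_bytes_le` standing in for the instructions; 0x82F63B78 is 0x1EDC6F41 with its bits reversed.

```c
#include <stddef.h>
#include <stdint.h>

/* Software stand-in for the CRC32C instructions: nbytes bytes of val,
 * least significant byte first. */
static uint32_t crc32c_bytes_le(uint32_t acc, uint64_t val, unsigned nbytes)
{
    for (unsigned i = 0; i < nbytes; i++) {
        acc ^= (uint8_t)(val >> (8 * i));
        for (int bit = 0; bit < 8; bit++)
            acc = (acc >> 1) ^ (0x82F63B78u & (0u - (acc & 1u)));
    }
    return acc;
}

/* Hypothetical buffer routine: 8-byte steps, then a byte-wise tail. */
static uint32_t crc32c_update_model(uint32_t crc, const uint8_t *buf,
                                    size_t len)
{
    while (len >= 8) {
        uint64_t v = 0;
        for (unsigned i = 0; i < 8; i++)      /* little-endian load */
            v |= (uint64_t)buf[i] << (8 * i);
        crc = crc32c_bytes_le(crc, v, 8);     /* where __crc32cd would go */
        buf += 8;
        len -= 8;
    }
    for (; len > 0; len--)
        crc = crc32c_bytes_le(crc, *buf++, 1);  /* where __crc32cb would go */
    return crc;
}
```

Starting from 0xFFFFFFFF and inverting the result, the nine bytes of "123456789" give the widely published CRC-32C check value 0xE3069283. On an Arm target the two marked calls would be replaced by `__crc32cd` and `__crc32cb`.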