Estimating sin(θ)
There is a simple series for estimating sin(x).
The equation to use is:
For ease of implementation, this equation will be implemented as:
The value for x, theta, also needs to be in radians rather than degrees.
To accomplish this a value of π / 180 = 0.01745329252 will be used as a conversion from degrees to radians.
The final code, shown below, is 100 lines long so let's break it into smaller chunks for analysis.
Code Breakdown
The data sections
The .data section contains all the variables:
.data
output_fmt:
.asciz "sin(%lf) = %lf.\n" // Allow float for input
input_msg:
.asciz "Input angle in degrees: "
input_fmt:
.asciz "%lf" // Must be read as 64 bit value and stored in d register
value:
.double 0
to_radians:
.double 0.01745329252 // pi / 180
three_fact:
.double 6.0
five_fact:
.double 120.0
The points to note are:
- The output_fmt strings have %lf or long-floats as placeholders. These would have worked even if %f had been used.
- The input_fmt string has a %lf placeholder. This MUST be a long-float to ensure that it is read as a 64 bit, double precision value.
- The value variable is declared as type .double. This is a 64 bit float.
The values can be seen in the mathematical description at the top of the page.
The user input section
The next section prompts the user for an input angle, in degrees, takes that angle and stores it in the variable, value.
This has been declared as type .double so it will be stored as a 64 bit IEEE754 float.
// Input
ldr x0, =input_msg // Load input msg
bl printf // Display input msg
ldr x0, =input_fmt // Load input format (64 bit long float)
ldr x1, =value // Load address for double precision number
bl scanf // Read long float into 64 bit location
Setting up the registers
The next section set up all the registers for the calculation. The code will use d0 to d7. Each register has the
following purpose.
- d0: The initial angle given by the user. It is kept in d0 for printing at the end.
- d1: The first term in the equation. This is the angle converted to radians.
- d2: The second term in the equation.
- d3: The third term in the equation.
- d4: The conversion factor for converting degrees to radians.
- d5: The value of 6 or 3!
- d6: The value of 120 or 5!
- d7: This is used do store "x" during the x3 and x5 processes.
setup:
//Move all the values into position
ldr x0, =value
ldr d0, [x0] // Value, in degrees, in d0
ldr x0, =to_radians // 0.017453 for converting degrees to radians
ldr d4, [x0]
ldr x0, =three_fact // 3! = 6
ldr d5, [x0]
ldr x0, =five_fact // 5! = 120
ldr d6, [x0]
Now that all the values are in place the remainder of the code performs the calculations. There are three terms to be calculated and each is done in turn and stored in registers d1 - d3.
Calculating the first term
The first term is very easy. It's just the angle converted from degrees to radians.
term1:
fmul d1, d0, d4 // Convert deg to rad
Calculating the second term
The second term requires more steps:
term2:
fmul d2, d0, d4 // Convert deg to rad
setup_pow_3:
mov x10, #2 // Loop counter. one mult = sqrd, two = cubed
fmov d7, d2 // Copy d2 to d7. D7 will be the multiplier
do_pow_3:
fmul d2, d2, d7 // d2 = d2 * d7. Equiv to x = x^2
sub x10, x10, #1 // Decrement loop counter
cmp x10, #0 // Counter = 0?
beq div_6 // Yes, do division ...
b do_pow_3 // No, repeat multiplication
div_6:
fdiv d2, d2, d5 // Divide by 6
The first step is to convert the angle in degrees, currently in d0, to radians and store in d2.
The second step is to set up the third power calculation. Initially, the value in d2 is copied to d7. Then x10
is loaded with #2 as a counter for the number of multiplications that are required. The reason two is used is because
after the first muliplication, resulting in x2, the counter will be reduced by one to one. A comparison
with zero will fail and the multiplication will occur again resulting in x3. The counter will be reduced
by one again to zero and a comparison with zero will pass and the process will move on.
The final step is the divide by six calculation and the final result for the second term is now stored in d2.
Calculating the third term
This, the final term, is the same process as above except that the power is five, requiring x10 to be loaded with #4, and the final division is by 120.
term3:
fmul d3, d0, d4 // Convert deg to rad
setup_pow_5:
mov x10, #4
fmov d7, d3
do_pow_5:
fmul d3, d3, d7
sub x10, x10, #1
cmp x10, #0
beq div_120
b do_pow_5
div_120:
fdiv d3, d3, d6 // /120
Summing the terms
The final step is to sum the terms (actually it's term 1 - term 2 + term 3).
sum:
fsub d1, d1, d2
fadd d1, d1, d3
print:
ldr x0, =output_fmt
bl printf
Summary
And that is the project. The results are not accurate beyond the fourth, or possibly third, decimal place but the purpose was not to write an accurate calculation algorithm, but rather, to demonstrate the algorithm in ARM64 assembly.
Source code
// calc_sin.s
.data
output_fmt:
.asciz "sin(%f) = %f.\n" // Allow float for input
input_msg:
.asciz "Input angle in degrees: "
input_fmt:
.asciz "%lf" // Must be read as 64 bit value and stored in d register
value:
.double 0
to_radians:
.double 0.01745329252 // pi / 180
three_fact:
.double 6.0
five_fact:
.double 120.0
.global main
.extern scanf
.extern printf
.text
main:
// Prolog
stp x29, x30, [sp, -16]!
mov x29, sp
// Input
ldr x0, =input_msg // Load input msg
bl printf // Display input msg
ldr x0, =input_fmt // Load input format (64 bit long float)
ldr x1, =value // Load address for double precision number
bl scanf // Read long float into 64 bit location
setup:
//Move all the values into position
ldr x0, =value
ldr d0, [x0] // Value, in degrees, in d0
ldr x0, =to_radians // 0.017453 for converting degrees to radians
ldr d4, [x0]
ldr x0, =three_fact // 3! = 6
ldr d5, [x0]
ldr x0, =five_fact // 5! = 120
ldr d6, [x0]
term1:
fmul d1, d0, d4 // Convert deg to rad
term2:
fmul d2, d0, d4 // Convert deg to rad
setup_pow_3:
mov x10, #2 // Loop counter. one mult = sqrd, two = cubed
fmov d7, d2 // Copy d2 to d7. D7 will be the multiplier
do_pow_3:
fmul d2, d2, d7 // d2 = d2 * d7. Equiv to x = x^2
sub x10, x10, #1 // Decrement loop counter
cmp x10, #0 // Counter = 0?
beq div_6 // Yes, do division ...
b do_pow_3 // No, repeat multiplication
div_6:
fdiv d2, d2, d5 // Divide by 6
term3:
fmul d3, d0, d4 // Convert deg to rad
setup_pow_5:
mov x10, #4
fmov d7, d3
do_pow_5:
fmul d3, d3, d7
sub x10, x10, #1
cmp x10, #0
beq div_120
b do_pow_5
div_120:
fdiv d3, d3, d6 // /120
sum:
fsub d1, d1, d2
fadd d1, d1, d3
print:
ldr x0, =output_fmt
bl printf
// cleanup
mov x0, #0
ldp x29, x30, [sp], 16
RET