Newton-Raphson
Cube roots can be calculated using the Newton-Raphson method. This is an iterative process where an estimate is made of the solution and this is used to calculate a closer value. Visually it looks like this:
Mathematically, it is defined as follows:
Please enjoy the derivation below. It takes over 300 lines of MathML to render all that.
To find the cube root of N we are trying to solve:
Now, if:
Then:
The formula now becomes:
So, the final formula that will be implemented is:
Code Breakdown
The source code is shown below so this explanation will start at the do_calc function.
This part of the code loads N into d0 and an initial estimate of x into d1. It also initialises
a counter in x19 to report the number of iterations required to find the root.
do_calc:
ldr x0, =value // Get addr of test value
ldr d0, [x0] // d0 = N
fmov d2, #3.0 // Prepare for /3
fdiv d1, d0, d2 // d1 = d0 / d2. d1 = x(n)
mov x19, #0 // Initialise the counter
The next section of the code is the iteration loop.
It calculates 2xn and xn2.
do_loop:
fmov d3, #2.0 // Used for 2x(n)
fmul d2, d1, d3 // d2 has 2x(n)
fmul d3, d1, d1 // d3 has x(n) squared
At this point the registers d1 to d3 have all the values required for the calculation.
The calculation of xn+1 happens in the next lines.
fdiv d4, d0, d3 // d4 = N / x^2
fadd d4, d4, d2 // d4 = 2x + N / x^2
fmov d5, #3 // Prepare for /3
fdiv d4, d4, d5 // d4 = x(n+1)
d4 now contains the new estimate of the root. Finally, this is compared with the previous estimate and if they have converged the programme prints the value in d1 as the root. If d1 and d4 have not converged then the new estimate in d4 is moved to d1 and the process repeats until convergence.
fcmp d4, d1 // Have x(n) and x(n+1) converged?
beq print_result
fmov d1, d4 // Copy x(n+1) to x(n)
add x19, x19, #1 // Increment counter
b do_loop
Source Code
// cube_roots_2.s
.global main
.extern printf
.extern scanf
.data
input_msg:
.asciz "Input value: "
input_fmt:
.asciz "%lf"
result_output:
.asciz "Cube root of %lf is %lf.\n"
count_output:
.asciz "The calc took %d iterations.\n"
value:
.double 0 // 64 bit floating point
.text
main:
// prolog
stp x29, x30, [sp, -16]!
mov x29, sp
// main code
// Display input message
ldr x0, =input_msg
bl printf
ldr x0, =input_fmt
ldr x1, =value
bl scanf
do_calc:
ldr x0, =value // Get addr of test value
ldr d0, [x0] // d0 = N
fmov d2, #3.0 // Prepare for /3
fdiv d1, d0, d2 // d1 = d0 / d2. d1 = x(n)
mov x19, #0 // Initialise the counter
do_loop:
fmov d3, #2.0 // Used for 2x(n)
fmul d2, d1, d3 // d2 has 2x(n)
fmul d3, d1, d1 // d3 has x(n) squared
// d0 = N, d1 = x(n), d2 = 2x(n), d3 = x(n)^2
fdiv d4, d0, d3 // d4 = N / x^2
fadd d4, d4, d2 // d4 = 2x + N / x^2
fmov d5, #3 // Prepare for /3
fdiv d4, d4, d5 // d4 = x(n+1)
fcmp d4, d1 // Have x(n) and x(n+1) converged?
beq print_result
fmov d1, d4 // Copy x(n+1) to x(n)
add x19, x19, #1 // Increment counter
b do_loop
print_result:
ldr x0, =result_output
bl printf
ldr x0, =count_output
mov x1, x19
bl printf
// cleanup
mov x0, #0
ldp x29, x30, [sp], 16
RET