# 6502 assembly math: division – a simple algorithm using powers of two

There are several routines available on the web that perform integer division in 6502 assembly language. Still, I wanted to come up with my own solution. I needed a division routine to compute the slope m for my line drawing program.

I started to think of something very simple that I could implement in BASIC for testing purposes, then translate into machine language.

The simplest way to perform division is to use subtraction. For instance, if we have to evaluate:

10 / 2

We can just subtract the number 2 to the number 10 more times, until the dividend (10) becomes zero or less than the divisor (2). So:

10 - 2 = 8     (1 time)
8 - 2  = 6     (2 times)
6 - 2  = 4     (3 times)
4 - 2  = 2     (4 times)
2 - 2  = 0     (5 times)

So, 2 can be contained in 10 five times. That’s what we were looking for: 10 / 2 = 5, that’s easy.

Such an algorithm is quite inefficient, as many subtractions are required. We can improve that algorithm by using powers of two.

The principle goes like this. Suppose we want to perform A / B. If we double B and if it can still be contained in A, then we know B can be contained in A at least two times. If we double B again and it can still be contained in A, we know that B can be contained in A four times… and so on. At a certain point, we cannot multiply by two any longer: A cannot hold such a multiple of B. So we subtract the greatest multiple of B that A can hold to A itself, and repeat the process with the difference, which is now the new dividend. This approach dramatically reduces the subtractions needed.

Let’s try to apply this method:

We must perform: 10 / 2

Set a counter (CT) to 0: CT = 0
Set partial dividend to 0: K = 0

Double the divisor: 4, set K = 4

4 can be contained in 10? yes. So, increment CT by 1 (CT = 1)

Double again: 4 * 2 = 8, K = 8

8  can be contained in 10? yes. So, increment CT by 1 (CT = 2)

Double again: 8 * 2 = 16, K = 16

16 can be contained in 10? no. K = K / 2
(we discard 16 and go back to 8)

So we have:
K = 8 (partial dividend), CT = 2

The quotient between K and divisor (2) is: Q1 = 2^CT = 2 ^ 2 = 4

Now, we have divided by two a part of the original dividend.
The part left to divide is:

10 - 8 = 2. And we set K = 2

Double the divisor: 2 * 2 = 4, K = 4

4 can be contained in 2? No, so we have K = K / 2 = 2, CT = 0

K = 2 (partial dividend), CT = 0

The quotient between K and divisor 2 is Q2 = 2 ^ CT = 1

Now we add the two quotients together to obtain:

Total quotient = Q1 + Q2 = 4 + 1 = 5

That's the final result, as 10 / 2 = 5.



The following Commodore 64 BASIC program shows the above method.

You can find the Commodore 64 machine language implementation of this algorithm in the drawing lines programs source code. I have also created a test version mixed to a BASIC program. The BASIC code is used for input/output and to show the quotient calculated by BASIC math routines.

DOWNLOAD: 31 bit by 15 bit integer division assembly program (test version)

DOWNLOAD: 31 bit by 15 bit integer division assembly program (source code)

Please note that the maximum value for B is 32767, as the divisor is 15 bit. This was just enough to compute the slope for lines, but the routine can be modified to support greater values for B.

The assembly version uses tables to compute the powers of two. As you can see, it is a direct translation of the BASIC program showed earlier. You can even find labels that match BASIC line numbers. Usually you are not adviced to do so, but I found the program quite complex to implement in assembly language, so I did stay close to the BASIC version, even if the resulting code is very far from being optimal. The assembly version uses a subroutine like the BASIC program as well.

Subroutines inside subroutines can be tricky when using assembly language. If the main program calls subroutine1, and then subroutine1 calls subroutine2, what if I want to return from subroutine 2 to the main program? As return addresses are stored on the 6502/6510 stack, if we just issue an RTS instruction from subroutine 2, we will just return to subroutine 1. To return to the main program directly, we must clear the return address to subroutine 1 from the stack. So we code:

PLA
PLA
RTS