Numbers can be fun, and talking about 6502 CPU math is just another way to look at topics we already know from simple math. No advanced mathematics is actually involved in 6502 math routines. Still, we have to adjust our thinking a bit at times.

The 6502/6510 microprocessors only have addition and subtraction instructions. As a bonus, they can also perform multiplication by two in hardware. Integer division by two is possible as well.

Starting from what we have, we can implement the missing instructions (generic multiply and divide) in software.

As we all know, the 6502 is an 8 bit microprocessor. That means only numbers from 0 to 255 can be stored in a given memory location. However, there is a way to link several 8 bit numbers together so that we can make calculations with 16 bit or 32 bit numbers, or even larger ones. The key to joining numbers together is the **carry flag**. We will see this in a moment.

### Addition

Addition is performed by the instruction **ADC**. It means: “ADd with Carry”. The carry is always added during the calculation. So, before adding two numbers together, we must clear the carry by using the instruction **CLC** (CLear Carry).

8 bit addition may be coded as follows:

    clc
    lda num1
    adc num2
    sta result
    rts

So, if num1 holds 10 and num2 25, result will contain the value 35. Pretty straightforward.

But what if we try to add together the values 20 and 240? We all know the result (260), but it exceeds the 8 bit limit (255). If you run that piece of code with *num1* = 20 and *num2* = 240, you will see that *result* contains the value 4. In fact, the result exceeds the value 255 by 5 units. When you add 1 to 255, the result wraps to 0, so the first of those 5 units brings us to 0 and the remaining four bring us to 4. Here is what happens:

    255 + 1 = 0
      0 + 1 = 1
      1 + 1 = 2
      2 + 1 = 3
      3 + 1 = 4

But, when a unit is added to 255, something happens. The carry flag is set to 1. We can use this information in order to provide a 16 bit result. Have a look at the following code:

    clc
    lda num1
    adc num2
    sta result_low
    lda result_high
    adc #$00          ; adds only the carry from the low byte
    sta result_high
    rts

Since ADC always adds the carry, if the low byte of the result (*result_low*) wraps around past 255, then the carry will be set to 1, and the apparently useless instruction *adc #$00* will actually add 1 to the high byte of the result (*result_high*). Of course, we are supposing location *result_high* is initialized to zero before running the above code.

So, the result is split in two locations: *result_high* and *result_low*. If we add 240 and 20 together, the high byte will be 1, and the low byte will be 4.

The high byte turns to 1 as soon as the low byte wraps around to 0 from 255. So, the digit *1* in the high byte just means: *256*.

So, if we have the result of the addition “240 + 20” in the high byte / low byte form, we can obtain the result as a whole number by doing the following:

256 * high_byte + low_byte = 256 * 1 + 4 = 260 (240 + 20 = 260)

Since 240 + 20 = 260, that is correct.
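To check this arithmetic away from the machine, the routine above can be modelled in a few lines of Python. This is only a sketch: the helper names `adc` and `add8_to_16` are made up for this illustration, and the model covers binary mode only (decimal mode is ignored).

```python
def adc(a, operand, carry):
    """Model of 6502 ADC in binary mode: returns (8 bit result, carry out)."""
    total = a + operand + carry
    return total & 0xFF, 1 if total > 0xFF else 0


def add8_to_16(num1, num2):
    """Mirror of the routine above: two 8 bit addends, 16 bit result."""
    carry = 0                                   # clc
    result_low, carry = adc(num1, num2, carry)  # lda num1 / adc num2
    result_high, carry = adc(0, 0, carry)       # lda result_high (0) / adc #$00
    return result_high, result_low


high, low = add8_to_16(240, 20)
print(high, low, 256 * high + low)  # 1 4 260
```

The second call to `adc` adds nothing but the carry, which is exactly what *adc #$00* does in the assembly version.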

Now, to add two 16 bit numbers together, we can proceed as follows:

    clc
    lda num1_low
    adc num2_low
    sta result_low
    lda num1_high
    adc num2_high     ; no clc here: the carry from the low bytes is added
    sta result_high
    rts

Addends are both given in the high byte, low byte form. We add the low bytes of addends, then the high bytes.

Please note that CLC is not performed before adding the two high bytes of the addends! This is important, as we must take into account the carry that may result from adding the low bytes. This is the way to join together 8 bit numbers in order to form larger numbers.
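A small Python model may help to see the carry travelling from the low bytes to the high bytes. It is a sketch under the same assumptions as the assembly code (binary mode, numbers already split into bytes). The function names `adc` and `add16` are hypothetical.

```python
def adc(a, operand, carry):
    """Model of 6502 ADC in binary mode: returns (8 bit result, carry out)."""
    total = a + operand + carry
    return total & 0xFF, 1 if total > 0xFF else 0


def add16(num1_low, num1_high, num2_low, num2_high):
    """16 bit addition built from two chained 8 bit additions."""
    carry = 0                                             # clc
    result_low, carry = adc(num1_low, num2_low, carry)    # low bytes first
    result_high, carry = adc(num1_high, num2_high, carry) # carry NOT cleared
    return result_low, result_high, carry


# 255 + 1: the low byte wraps to 0 and the carry bumps the high byte to 1.
print(add16(0xFF, 0x00, 0x01, 0x00))  # (0, 1, 0)
```

The carry returned at the end plays the same role again: it would feed a third byte if we wanted a 24 bit result.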

Some more explanation of number “splitting” may be useful at this stage. Suppose we have to add 100 and 300 together. 100 needs only 1 byte, so we will set num1_high to 0 and num1_low to 100.

300 needs two bytes instead. As each “unit” in the high byte means “256”, the high value is obtained from the following integer division:

num2_high = INT (300/256) = 1

The low byte is then calculated like this:

num2_low = 300 - num2_high * 256 = 300 - 256 = 44

If we have num2 = 1000 instead, it can be expressed in the high byte / low byte form by using the following calculations:

    num2_high = INT (1000 / 256) = 3
    num2_low = 1000 - num2_high * 256 = 232
    Proof: num2_high * 256 + num2_low = 3*256 + 232 = 1000

Another way to split a number into its high byte / low byte form is to use hexadecimal notation.

1000 decimal equals $3E8 hex. We take pairs of hex digits starting from the right, padding with a leading zero where needed. So, we have $E8 and $03. The first pair is the low byte, the second pair is the high byte. So:

    1000 (low byte)  = $e8 = 232
    1000 (high byte) = $03 = 3
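Both ways of splitting a number can be tried quickly in Python. This is just an illustrative sketch, and the function names are invented for it. The first version follows the INT / subtraction method, the second one cuts the number at the byte boundary, which is what reading the hex digits in pairs amounts to.

```python
def split_by_division(value):
    """High byte = INT(value / 256), low byte = what is left over."""
    high = value // 256
    low = value - high * 256
    return high, low


def split_by_bits(value):
    """Same split, done by shifting and masking the binary number."""
    return value >> 8, value & 0xFF


print(split_by_division(300))   # (1, 44)
print(split_by_division(1000))  # (3, 232)
print(split_by_bits(0x03E8))    # (3, 232)
```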

### Subtraction

Subtraction is done by using the instruction **SBC** (SuBtract with Carry). There is no “borrow flag” in the 6502 CPU. The carry flag takes that role, but it acts as a *reverse borrow*: a set carry means “no borrow”, a clear carry means “borrow”. So, before performing a subtraction, we should clear the borrow, which in practice means we must SET the carry with the instruction **SEC** (SEt Carry).

Things may be set up very similarly to the addition code. So, to perform 16 bit subtraction we may code as follows:

    sec
    lda num1_low
    sbc num2_low
    sta result_low
    lda num1_high
    sbc num2_high     ; no sec here: the borrow from the low bytes is used
    sta result_high
    rts

Please note that after the low bytes subtraction, the carry is NOT set again. Again, this is to preserve the carry (borrow) information for the high bytes.
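The “reverse borrow” idea is easier to follow with a worked case. The Python sketch below models SBC in binary mode as `A - operand - (1 - carry)`. The names `sbc` and `sub16` are hypothetical helpers written for this example.

```python
def sbc(a, operand, carry):
    """Model of 6502 SBC in binary mode: returns (8 bit result, carry out).

    Carry in = 1 means "no borrow"; carry out = 0 means a borrow occurred.
    """
    diff = a - operand - (1 - carry)
    return diff & 0xFF, 1 if diff >= 0 else 0


def sub16(num1_low, num1_high, num2_low, num2_high):
    """16 bit subtraction built from two chained 8 bit subtractions."""
    carry = 1                                             # sec
    result_low, carry = sbc(num1_low, num2_low, carry)    # low bytes first
    result_high, carry = sbc(num1_high, num2_high, carry) # carry NOT set again
    return result_low, result_high, carry


# 300 - 100: low bytes 44 - 100 need a borrow, which is taken from the high byte.
print(sub16(44, 1, 100, 0))  # (200, 0, 1)
```

A final carry of 1 means the whole subtraction needed no borrow, so the result is not negative.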

### Multiplication and division by two

Although multiplying or dividing two generic numbers together is not provided by the hardware, it does offer us the possibility of performing multiplication or division by two.

8-bit multiplication by two is performed by the instruction **ASL**.

If you want to multiply a number by ten, you just add a 0 to the right of it, shifting the number towards the left.

    10 * 10 = 100

    10
    10[ ] <- add a 0
    100

As the CPU works with base 2 numbers, adding a 0 to the right of a number thus shifting it to the left just means: multiply by two.

If we have the number 15, shifting it to the left one time will just give us 30.

    15 decimal = 00001111

    00001111                  do the shift to the left...

    [0]0001111[ ] <- add a 0 here
     ^
     |
     "discard" this digit (we will see where it goes...)

    00011110

    00011110 = 30 decimal

So, if the number to multiply by two is stored on location *num1*, we can use the code:

ASL num1

Here, ASL shifts the content of location *num1*. So, *num1* will contain the result. If we don’t want to change *num1* and would rather store the result in the location *result*, we may then code:

    LDA num1
    ASL
    STA result

Now, ASL shifts the content of the accumulator. So, the ASL instruction comes in two flavours.

Suppose we want to multiply by two the decimal number 255:

    LDA #$ff
    ASL
    STA result

Let’s use binary numbers to do the shift:

    $ff = 11111111 base 2

    C <--- [1]1111111[ ] <--- 0

    C = 1    11111110 base 2 = 254 decimal

So, we end up with a value of 254 decimal in location *result*. In the process, the leftmost 1 digit of the number to be multiplied has been thrown away. But it is not actually lost. The CPU stores it in the carry flag. Again, the carry flag keeps the missing information, and we can use it to build up a 16 bit result. With that in mind, here is the code:

        LDA num1
        ASL
        STA result_low
        BCC end
        INC result_high
    end RTS

The code just says this: when multiplication by two of num1 is done, if the carry flag is clear, don’t do anything. If it is set, put a 1 on result_high (or, which is the same, increment it by one). Of course, we suppose result_high is initialized to zero before running the above code.
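The same steps can be sketched in Python to check the numbers. The helper names `asl` and `double8_to_16` are assumptions made for this example, and `result_high` starts at zero, as the text requires.

```python
def asl(value):
    """Model of 6502 ASL on one byte: returns (shifted byte, carry out)."""
    return (value << 1) & 0xFF, (value >> 7) & 1


def double8_to_16(num1):
    """Mirror of the routine above: 8 bit input, 16 bit result."""
    result_high = 0                 # assumed initialized to zero
    result_low, carry = asl(num1)   # lda num1 / asl / sta result_low
    if carry:                       # bcc end skips the inc when carry is clear
        result_high += 1            # inc result_high
    return result_high, result_low


high, low = double8_to_16(255)
print(high, low, 256 * high + low)  # 1 254 510
```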

But we can use another approach. We can use two bytes to hold num1, then shift the whole 16 bit number. When shifting the low byte, we just ASL it. Now, we cannot proceed by doing an ASL on the high byte: we would just enter the high byte with a 0 from the right. We instead need to enter it with the carry, so that we keep the information from the first shift. We perform this by using the instruction **ROL** (ROtate Left). Look:

    0000000011111111      (16 bit number to be multiplied by two, 255 decimal)

    [0]0000000111111110   (shifting the whole number)

    On each byte:

    LOW BYTE (ASL, we enter with a 0):

    C <- 11111111 <- 0
    We obtain: C = 1, 11111110

    HIGH BYTE (ROL, we enter with the carry):

    C <- 00000000 <- C = 1
    We obtain: C = 0, 00000001

    Joining high byte and low byte together:

    0000000111111110      (510 decimal, correct)
    ********########

    * = high byte digit, # = low byte digit

So, if *num1* is the 16 bit number to be multiplied by two, we can make use of the following code:

    ASL num1_low
    ROL num1_high

Again, if we don’t want to change *num1* and would rather store the result in a desired location:

    LDA num1_low
    ASL
    STA result_low
    LDA num1_high
    ROL
    STA result_high
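As a quick check of the ASL / ROL pairing, here is a Python sketch of both instructions working on single bytes. The function names are hypothetical. The point to notice is that `rol` takes the carry produced by `asl` as its incoming bit.

```python
def asl(value):
    """ASL: shift left, a 0 enters from the right, bit 7 goes to the carry."""
    return (value << 1) & 0xFF, (value >> 7) & 1


def rol(value, carry):
    """ROL: shift left, the carry enters from the right, bit 7 goes to the carry."""
    return ((value << 1) | carry) & 0xFF, (value >> 7) & 1


def double16(num1_low, num1_high):
    """Multiply a 16 bit number by two, low byte first."""
    result_low, carry = asl(num1_low)           # asl num1_low
    result_high, carry = rol(num1_high, carry)  # rol num1_high
    return result_low, result_high, carry


# 255 * 2 = 510 = $01FE
print(double16(0xFF, 0x00))  # (254, 1, 0)
```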

Division works in a very similar way, but we have to use the instructions **LSR** (Logical Shift Right) and **ROR** (ROtate Right). As you may have guessed, **LSR** is similar to ASL, and **ROR** is similar to ROL: they enter with a 0 or with the carry respectively, but the shift is now performed towards the right.

Still, this is integer division, so we may have a remainder. But we won’t worry about that now.

Dividing the 8 bit number *num1* by two is performed by the code:

LSR num1

or:

    LDA num1
    LSR
    STA result

Supposing that num1 equals 128, we can see what happens by using binary numbers:

    128 decimal = 10000000 base 2

    LSR 128

    0 --> 10000000 --> C
          01000000 --> C = 0

It is fairly easy to see that the result is 64. As 128 is an even number, no remainder is produced. The carry flag holds zero at the end.

Let’s try to divide 129 decimal by two:

    129 decimal = 10000001 base 2

    LSR 129

    0 --> 10000001 --> C
          01000000 --> C = 1

Now the carry holds 1 at the end. That’s the remainder.
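The two cases above can be reproduced with a one-line Python model of LSR. It is only a sketch with an invented function name: the shifted byte is the quotient and the bit that falls out on the right is the carry, that is, the remainder.

```python
def lsr(value):
    """Model of 6502 LSR on one byte: returns (quotient, carry = remainder)."""
    return value >> 1, value & 1


print(lsr(128))  # (64, 0)
print(lsr(129))  # (64, 1)
```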

Once again, we can use the carry to “link” eight bit numbers together, thus performing operations on numbers greater than 8 bits. So, the following code will divide a 16 bit number by two:

    LSR num1_high
    ROR num1_low

Note that we start from the **high byte** this time. Since we are performing a right shift/rotate, we must enter the whole number with a 0 from the left. That’s why we LSR the high byte first.

If *num1* is a 16 bit number and equals 256 decimal, let’s see again what happens by using binary numbers:

    num1 = 256 decimal = 0000000100000000 base 2

    LSR num1_high:

    0 --> 00000001 --> C
          00000000 --> C = 1

    ROR num1_low:

    C --> 00000000 --> C
    1 --> 00000000 --> C
          10000000 --> C = 0

    Final result:

    0000000010000000 = 128 decimal (256/2 = 128, correct).
    ********########

    * = high byte digit, # = low byte digit

As you can see, the first carry we obtain with the LSR is used as a link to keep the information for the second operation (ROR). The last carry we obtain is 0. This is the remainder for the whole division. And, as 256 is an even number, dividing it by two brings us a 0 remainder.
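To close, here is the 16 bit division by two as a Python sketch, with hypothetical helper names. It follows the same order as the assembly code: LSR on the high byte first, then ROR on the low byte, with the carry carrying the dropped bit across.

```python
def lsr(value):
    """LSR: shift right, a 0 enters from the left, bit 0 goes to the carry."""
    return value >> 1, value & 1


def ror(value, carry):
    """ROR: shift right, the carry enters from the left, bit 0 goes to the carry."""
    return (carry << 7) | (value >> 1), value & 1


def halve16(num1_low, num1_high):
    """Divide a 16 bit number by two, high byte first."""
    result_high, carry = lsr(num1_high)       # lsr num1_high
    result_low, carry = ror(num1_low, carry)  # ror num1_low
    return result_low, result_high, carry     # final carry = remainder


# 256 / 2 = 128, remainder 0
print(halve16(0x00, 0x01))  # (128, 0, 0)
```

With an odd number such as 257 (low byte 1, high byte 1) the same call returns a final carry of 1, which is the remainder of the division.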