Converting numbers to strings

As a segway to more complex mathematical functions, let’s tackle converting an integer from its usual numeric representation (e.g. 243) to its string representation (e.g. “243”).

You might be thinking: “What is the point? I just call a PRINT statement and it prints the number”, but what is it that the PRINT command really does? And why is that linked to mathematical operations such as multiplication and division…and why are these “more complex” ???.

The short answer is that computers do not really know how to PRINT stuff in the same way that they know how to add or subtract numbers.

For example, out of the box the Digirule2 does not handle numbers “wider” than 8 bits. Even if we implement 16 bit integers along with their 4 fundamental operations (mul, div, add, sub), we would still not be able to “see” those 2 individual bytes as 1 number. That is, the number 939 (0b 0000 0011 | 1010 1011) would only be “perceived” as 0x03, 0xAB. The point of the conversion here is to go from 0x03, 0xAB to "939".

Coincidentally, the key mathematical operation behind converting an integer from its binary representation to a human readable (decimal) representation, is division, which, the Digirule2 does not have an operation for.

Let’s give Digirule the capability to print the string representation of numbers…

Outline of the conversion

The key tool in converting numbers to glyphs that spell out a value, is the ASCII table.

This is a standard feature of modern computers and it is a part of memory that contains pictures that correspond to numbers. When you try to print a string via a language’s print functions, the computer scans the string, looks up the pictures that correspond to the bytes composing the string and sends them off to video memory.

From the point of view of the computer, the word Digirule looks like [68, 105, 103, 105, 114, 117, 108, 101], with 68 being the code for the picture of D, in the ASCII table <https://ascii.cl/>.

And similarly, from the point of view of the computer, the word 137, looks like [49, 51, 55], with 49 being the code for 1.

Therefore, to print a number in a human readable (decimal) form, all we have to do is break it down to its composing powers of 10 and then “adjust” those coefficients for their ASCII representation.

For example, 137 is:

\[137 = 1 * 10^2 + 3 * 10^2 + 7 * 10^0\]

Similarly, 17329 is:

\[17329 = 1 * 10^4 + 7 * 10^3 + 3 * 10^2 + 2*10^1 + 9 * 10^0\]

The difference between 1 as a number and "1" (a string) is that "1" is actually the number 49.

Therefore, at its heart, the conversion of a single number in the range 0-9, to its human readable (decimal) form comes down to adding 48 (the value for "0") to that number.

With this in mind and the way numbers are composed by their constituent powers of 10, we can start piecing together a generic conversion algorithm:

  • Given the bit depth (\(b\)) of a number, work out the highest power of 10 (\(d\)) possible.
    • For example, for \(b=8\) bit integers, the largest number is 255 which means that the highest power of 10 here would be :math`d=2` (because \(255 = 2 * 10^2 + 5 * 10^1 + 5 * 10^0\).)

  • Now, to get the coefficient of the “hundreds”, perform an integer division by \(10^2 = 100\)
    • \(255 \backslash 100 = 2\)
      • Ratio : \(2\)

      • Remainder : \(55\)

    • Convert the Ratio to a character by adding 48: \(2+48=50\)

    • Output: 50

  • Take the remainder and perform integer division by \(10^1 = 10\)
    • \(55 \backslash 10 = 5\)
      • Ratio : \(5\)

      • Remainder : \(5\)

    • Convert the Ratio to a character by adding 48: \(5 + 48 = 53\)

    • Output: 53

  • Take the remainder and perform integer division by \(10^0 = 1\)
    • There is no reason to divide by 1 here, therefore, let’s simply say:
      • Ratio : \(5\)

      • Remainder : \(0\)

    • Convert the Ratio to a character by adding 48: \(5 + 48 = 53\)

    • Output: 53

And this is how the number 255 is converted to the character sequence (a.k.a string) [50, 53, 53]

Fantastic! That’s so repeatable and well defined, let’s get coding!!

Slight problem here: The Digirule2 ASM does not include a command to perform the integer division step. Out of the box, we have absolutely no way of realising all these divisions by the powers of 10.

Out of the box, it is impossible to divide.

Dividing Integers

The Digirule2 ASM only contains low level commands for addition (ADDLA, ADDRA) and subtraction (SUBLA, SUBRA). In other words, at its lowest level, the chip does not “know” how to divide two numbers.

But that is not too much of a problem. If you think about it, multiplication of some number \(a\) by another number \(b\) is “shorthand” for “add \(a\), \(b\) times”. Similarly, it is not difficult to think about the reverse operation of division of some integer \(a\) by another integer \(b\) as “shorthand” for “keep subtracting \(b\) from \(a\) and tell me how many times it fits and what, if anything, remains”.

Now, it is worth noting here that multiplication and division have long 2 been seen as iterative processes and wherever there is iteration, there is a strong motivation for simplifying it and even making it faster where possible.

The wikipedia pages on division and multiplication algorithms are a very good start for more information about those algorithms.

Different algorithms suit different needs.

For the purposes of this routine of converting integers to strings, here, we will use the very simple iterative version of division where we simply keep subtracting the divisor from the dividend until it cannot be divided any more.

This is as follows:

Given two numbers \(a\) (The dividend, the number to divide) and \(b\) (The divisor, the number to divide by) and aiming to produce two numbers \(R\) (The ratio \(a \backslash b\)) and \(M\) (The remainder) :

  1. Set \(R=0\), \(M=a\)

  2. Perform \(R = R + 1\)

  3. Perform \(M = M - b\)

  4. Check if the subtraction produced a negative number
    • If it did not, continue from step 5

    • If it did continue from 6

  5. Check if the subtraction produced 0 (zero)
    • If it did not, continue from step 2

    • If it did continue from step 8

  6. Perform \(M = M + b\) (Need to adjust \(M\) because it already produced the negative number)

  7. Perform \(R = R - 1\) (Need to adjust \(R\) because it would have overshot the ratio)

  8. STOP

Now, all calculations within steps 1-8 can be performed with Digirule2 ASM and in addition, the Digirule2 can now perform integer division for 8bit numbers, which means that there is nothing stopping us from implementing the int2str function above.

2

And by “long” here we mean thousands of years.

Putting it all together

So, now we are ready to start writing some ASM. And this, again, includes some “housekeeping” details that require some planning on their own right.

This is because when we were outlining the algorithms above, we may have given names to variables or never bothered with the limitation of not having an infinite ammount of memory which we can use to store our values in. Or any other constraints that may be imposed on us by the hardware (You cannot have floating point (\(\mathbb{R}\)) numbers, you cannot add without using the Accumulator and so on).

We would be aiming to write a stand-alone routine that is generic and re-usable in other programs too and one that interferes with the state of the processor as little as possible.

For this purpose, in the following listing, I am using some generic registers r0, r1, r2, ... and so on and some “temporary registers”. The purpose of the generic registers here is to pass parameters and hold the result of computations and the purpose of the temporaries is to hold intermediate stage of computation results.

For more details, please see inline code comments below:

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
.EQU status_reg=252

# Program Entry Point
# Convert decimal 255 to string in variable `asc_string`
COPYLR 255 r0
COPYLR ascii_str r1 # This COPYLR copies the VALUE of the label to r1. The value of the label is its MEMORY ADDRESS.
CALL int2str
HALT

int2str:
# Transforms an integer to an ASCII representation
# Input:
#    r0: int
#    r1: Address of the first byte of the string
# Note:
#    The Digirule2 does not understand the string data type. In this
#    example, a "string" is just three sequential bytes in memory,
#    starting at the memory address indicated by register r1.
#
# Save the arguments because they will be required later
COPYRR r0 t0
COPYRR r1 t1
# Divide by 100
COPYRA r0
COPYLR 100 r0
CALL div
# Adjust the ascii rep and copy to the next available string position
CALL int2str_adjust_and_copy
# Load the remainder, divide by 10
COPYRA r0
COPYLR 10 r0
CALL div
CALL int2str_adjust_and_copy
# Load the remainder
# Notice here, the remainder is in r0 but it does not
# go through the division. For this reason, it is simply
# copied across from r0 to r1 to be able to re_use the
# adjust_and_copy procedure as is.
COPYRR r0 r1
CALL int2str_adjust_and_copy
RETURN
int2str_adjust_and_copy:
# Adds the ascii 0 value (48) to the order of mag multiplier
# and puts the result to the next available string position
COPYRA r1
ADDLA 48
COPYAR t2
COPYLR t2 cpy_from
COPYRR t1 cpy_to
CALL cpy_ind
INCR t1
RETURN


div:
# Performs the division Acc/r0
# Input:
#   Acc: int (Dividend)
#    r0: int (Divisor)
# Output:
#   r1:Ratio
#   r0:Remainder
CBR 0 status_reg # Zero bit
CBR 1 status_reg # Carry bit
COPYLR 0 r1
sub_again:
INCR r1
SUBRA r0
# Check if the last SUB hit zero
BCRSS 0 status_reg
JUMP check_carry
JUMP sub_again
check_carry:
# Maybe the last SUB overshot zero
BCRSS 1 status_reg
JUMP sub_again
# Adjust results
ADDRA r0
DECR r1
COPYAR r0
RETURN

cpy_ind:
# Indirect copy
# Input:
#   cpy_from: The memory address to copy from
#     cpy_to: The memory address to copy to
# Output:
#   None. The procedure has side effects only
.DB 7     # Digirule2 ASM COPYRR opcode
cpy_from:
.DB 0
cpy_to:
.DB 0
RETURN

# General Registers
r0:
.DB 0
r1:
.DB 0
t0:
.DB 0
t1:
.DB 0
t2:
.DB 0

ascii_str:
.DB 0,0,0

(This is part of listing dg_asm_examples/itoa/int2str.dsf )

The usual workflow is followed to compile it here, assuming that dgtools is installed already:

> dgasm.py int2str.dsf
> dgsim.py int2str.dgb -ts ascii_str:3

If you now open int2str_trace.md and scroll all the way to the last step of computation, you will notice the ASCII representation of the three bytes starting from memory location ascii_str.

Conclusion

This concludes this section on creating a procedure so that we can “see” numbers in a human readable (decimal) form.

Although we implemented it for 8bit integers, the function is entirely generic and can be generalised for any size integer. That is, as long as the div function is generalised as well.

The same function can be further generalised to convert integers to a human readable form in other representations besides decimal as well. The process is exactly the same, but instead of dividing by powers of 10, we simply divide by powers of the base in the desired number system. In fact, this is exactly the way the itoa() function works.

Finally, as can be seen here, we are starting to set the foundation for a “Super CPU”, a CPU that sits on top of Digirule2 and is capable of more operations (e.g. div, mul, int2str) than the original ASM permits, which in turn leads to more complex programs.

How about 16bit math next?