Converting numbers to strings

As a segway to more complex mathematical functions, let’s tackle converting an integer from its usual numeric representation (e.g. 243) to its string representation (e.g. “243”).

You might be thinking: “What is the point? I just call a PRINT statement and it prints the number”, but what is it that the PRINT command really does? And why is that linked to mathematical operations such as multiplication and division…and why are these “more complex” ???.

The short answer is that computers do not really know how to PRINT stuff in the same way that they know how to add or subtract numbers.

For example, out of the box the Digirule2 does not handle numbers “wider” than 8 bits. Even if we implement 16 bit integers along with their 4 fundamental operations (mul, div, add, sub), we would still not be able to “see” those 2 individual bytes as 1 number. That is, the number 939 (0b 0000 0011 | 1010 1011) would only be “perceived” as 0x03, 0xAB. The point of the conversion here is to go from 0x03, 0xAB to "939".

Coincidentally, the key mathematical operation behind converting an integer from its binary representation to a human readable (decimal) representation, is division, which, the Digirule2 does not have an operation for.

Let’s give Digirule the capability to print the string representation of numbers…

Outline of the conversion

The key tool in converting numbers to glyphs that spell out a value, is the ASCII table.

This is a standard feature of modern computers and it is a part of memory that contains pictures that correspond to numbers. When you try to print a string via a language’s print functions, the computer scans the string, looks up the pictures that correspond to the bytes composing the string and sends them off to video memory.

From the point of view of the computer, the word Digirule looks like [68, 105, 103, 105, 114, 117, 108, 101], with 68 being the code for the picture of D, in the ASCII table <https://ascii.cl/>.

And similarly, from the point of view of the computer, the word 137, looks like [49, 51, 55], with 49 being the code for 1.

Therefore, to print a number in a human readable (decimal) form, all we have to do is break it down to its composing powers of 10 and then “adjust” those coefficients for their ASCII representation.

For example, 137 is:

\[137 = 1 * 10^2 + 3 * 10^2 + 7 * 10^0\]

Similarly, 17329 is:

\[17329 = 1 * 10^4 + 7 * 10^3 + 3 * 10^2 + 2*10^1 + 9 * 10^0\]

The difference between 1 as a number and "1" (a string) is that "1" is actually the number 49.

Therefore, at its heart, the conversion of a single number in the range 0-9, to its human readable (decimal) form comes down to adding 48 (the value for "0") to that number.

With this in mind and the way numbers are composed by their constituent powers of 10, we can start piecing together a generic conversion algorithm:

  • Given the bit depth (\(b\)) of a number, work out the highest power of 10 (\(d\)) possible.
    • For example, for \(b=8\) bit integers, the largest number is 255 which means that the highest power of 10 here would be :math`d=2` (because \(255 = 2 * 10^2 + 5 * 10^1 + 5 * 10^0\).)

  • Now, to get the coefficient of the “hundreds”, perform an integer division by \(10^2 = 100\)
    • \(255 \backslash 100 = 2\)
      • Ratio : \(2\)

      • Remainder : \(55\)

    • Convert the Ratio to a character by adding 48: \(2+48=50\)

    • Output: 50

  • Take the remainder and perform integer division by \(10^1 = 10\)
    • \(55 \backslash 10 = 5\)
      • Ratio : \(5\)

      • Remainder : \(5\)

    • Convert the Ratio to a character by adding 48: \(5 + 48 = 53\)

    • Output: 53

  • Take the remainder and perform integer division by \(10^0 = 1\)
    • There is no reason to divide by 1 here, therefore, let’s simply say:
      • Ratio : \(5\)

      • Remainder : \(0\)

    • Convert the Ratio to a character by adding 48: \(5 + 48 = 53\)

    • Output: 53

And this is how the number 255 is converted to the character sequence (a.k.a string) [50, 53, 53]

Fantastic! That’s so repeatable and well defined, let’s get coding!!

Slight problem here: The Digirule2 ASM does not include a command to perform the integer division step. Out of the box, we have absolutely no way of realising all these divisions by the powers of 10.

Out of the box, it is impossible to divide.

Dividing Integers

The Digirule2 ASM only contains low level commands for addition (ADDLA, ADDRA) and subtraction (SUBLA, SUBRA). In other words, at its lowest level, the chip does not “know” how to divide two numbers.

But that is not too much of a problem. If you think about it, multiplication of some number \(a\) by another number \(b\) is “shorthand” for “add \(a\), \(b\) times”. Similarly, it is not difficult to think about the reverse operation of division of some integer \(a\) by another integer \(b\) as “shorthand” for “keep subtracting \(b\) from \(a\) and tell me how many times it fits and what, if anything, remains”.

Now, it is worth noting here that multiplication and division have long 2 been seen as iterative processes and wherever there is iteration, there is a strong motivation for simplifying it and even making it faster where possible.

The wikipedia pages on division and multiplication algorithms are a very good start for more information about those algorithms.

Different algorithms suit different needs.

For the purposes of this routine of converting integers to strings, here, we will use the very simple iterative version of division where we simply keep subtracting the divisor from the dividend until it cannot be divided any more.

This is as follows:

Given two numbers \(a\) (The dividend, the number to divide) and \(b\) (The divisor, the number to divide by) and aiming to produce two numbers \(R\) (The ratio \(a \backslash b\)) and \(M\) (The remainder) :

  1. Set \(R=0\), \(M=a\)

  2. Perform \(R = R + 1\)

  3. Perform \(M = M - b\)

  4. Check if the subtraction produced a negative number
    • If it did not, continue from step 5

    • If it did continue from 6

  5. Check if the subtraction produced 0 (zero)
    • If it did not, continue from step 2

    • If it did continue from step 8

  6. Perform \(M = M + b\) (Need to adjust \(M\) because it already produced the negative number)

  7. Perform \(R = R - 1\) (Need to adjust \(R\) because it would have overshot the ratio)

  8. STOP

Now, all calculations within steps 1-8 can be performed with Digirule2 ASM and in addition, the Digirule2 can now perform integer division for 8bit numbers, which means that there is nothing stopping us from implementing the int2str function above.

2

And by “long” here we mean thousands of years.

Putting it all together

So, now we are ready to start writing some ASM. And this, again, includes some “housekeeping” details that require some planning on their own right.

This is because when we were outlining the algorithms above, we may have given names to variables or never bothered with the limitation of not having an infinite ammount of memory which we can use to store our values in. Or any other constraints that may be imposed on us by the hardware (You cannot have floating point (\(\mathbb{R}\)) numbers, you cannot add without using the Accumulator and so on).

We would be aiming to write a stand-alone routine that is generic and re-usable in other programs too and one that interferes with the state of the processor as little as possible.

For this purpose, in the following listing, I am using some generic registers r0, r1, r2, ... and so on and some “temporary registers”. The purpose of the generic registers here is to pass parameters and hold the result of computations and the purpose of the temporaries is to hold intermediate stage of computation results.

For more details, please see inline code comments below:

  1.EQU status_reg=252
  2
  3# Program Entry Point
  4# Convert decimal 255 to string in variable `asc_string`
  5COPYLR 255 r0
  6COPYLR ascii_str r1 # This COPYLR copies the VALUE of the label to r1. The value of the label is its MEMORY ADDRESS.
  7CALL int2str
  8HALT
  9
 10int2str:
 11# Transforms an integer to an ASCII representation
 12# Input:
 13#    r0: int
 14#    r1: Address of the first byte of the string
 15# Note:
 16#    The Digirule2 does not understand the string data type. In this
 17#    example, a "string" is just three sequential bytes in memory,
 18#    starting at the memory address indicated by register r1.
 19#
 20# Save the arguments because they will be required later
 21COPYRR r0 t0
 22COPYRR r1 t1
 23# Divide by 100
 24COPYRA r0
 25COPYLR 100 r0
 26CALL div
 27# Adjust the ascii rep and copy to the next available string position
 28CALL int2str_adjust_and_copy
 29# Load the remainder, divide by 10
 30COPYRA r0
 31COPYLR 10 r0
 32CALL div
 33CALL int2str_adjust_and_copy
 34# Load the remainder
 35# Notice here, the remainder is in r0 but it does not
 36# go through the division. For this reason, it is simply
 37# copied across from r0 to r1 to be able to re_use the
 38# adjust_and_copy procedure as is.
 39COPYRR r0 r1
 40CALL int2str_adjust_and_copy
 41RETURN
 42int2str_adjust_and_copy:
 43# Adds the ascii 0 value (48) to the order of mag multiplier
 44# and puts the result to the next available string position
 45COPYRA r1
 46ADDLA 48
 47COPYAR t2
 48COPYLR t2 cpy_from
 49COPYRR t1 cpy_to
 50CALL cpy_ind
 51INCR t1
 52RETURN
 53
 54
 55div:
 56# Performs the division Acc/r0
 57# Input:
 58#   Acc: int (Dividend)
 59#    r0: int (Divisor)
 60# Output:
 61#   r1:Ratio
 62#   r0:Remainder
 63CBR 0 status_reg # Zero bit
 64CBR 1 status_reg # Carry bit
 65COPYLR 0 r1
 66sub_again:
 67INCR r1
 68SUBRA r0
 69# Check if the last SUB hit zero
 70BCRSS 0 status_reg
 71JUMP check_carry
 72JUMP sub_again
 73check_carry:
 74# Maybe the last SUB overshot zero
 75BCRSS 1 status_reg
 76JUMP sub_again
 77# Adjust results
 78ADDRA r0
 79DECR r1
 80COPYAR r0
 81RETURN
 82
 83cpy_ind:
 84# Indirect copy
 85# Input:
 86#   cpy_from: The memory address to copy from
 87#     cpy_to: The memory address to copy to
 88# Output:
 89#   None. The procedure has side effects only
 90.DB 7     # Digirule2 ASM COPYRR opcode
 91cpy_from:
 92.DB 0
 93cpy_to:
 94.DB 0
 95RETURN
 96
 97# General Registers
 98r0:
 99.DB 0
100r1:
101.DB 0
102t0:
103.DB 0
104t1:
105.DB 0
106t2:
107.DB 0
108
109ascii_str:
110.DB 0,0,0

(This is part of listing dg_asm_examples/itoa/int2str.dsf )

The usual workflow is followed to compile it here, assuming that dgtools is installed already:

> dgasm.py int2str.dsf
> dgsim.py int2str.dgb -ts ascii_str:3

If you now open int2str_trace.md and scroll all the way to the last step of computation, you will notice the ASCII representation of the three bytes starting from memory location ascii_str.

Conclusion

This concludes this section on creating a procedure so that we can “see” numbers in a human readable (decimal) form.

Although we implemented it for 8bit integers, the function is entirely generic and can be generalised for any size integer. That is, as long as the div function is generalised as well.

The same function can be further generalised to convert integers to a human readable form in other representations besides decimal as well. The process is exactly the same, but instead of dividing by powers of 10, we simply divide by powers of the base in the desired number system. In fact, this is exactly the way the itoa() function works.

Finally, as can be seen here, we are starting to set the foundation for a “Super CPU”, a CPU that sits on top of Digirule2 and is capable of more operations (e.g. div, mul, int2str) than the original ASM permits, which in turn leads to more complex programs.

How about 16bit math next?