Retro means old but cool.

I grew up with the Commodore C64 but was never able to master the machine. I was young, I wanted to play the latest games, and I let other people do the pioneering work of exploring this incredible hardware. Today I have the skills to catch up on what it takes to code the C64. I will share what I learn along the way. Enjoy the trip to the past!

VIC-II for Beginners Part 2 - To have or not to have Character

Hide and Seek with the Character Generator ROM

When you turn on your C64 and are greeted with information about the BASIC version and the free BASIC bytes, you are actually looking at one of two distinct Character Sets. Check an original C64 keyboard and you will notice lots of symbols and other special characters that can be used - which, by the way, is what makes working with a standard PC keyboard in a C64 emulator so tedious.

Each key gives access to up to four different characters via various shortcuts.

All these symbols and characters are stored unalterably in the Character Generator ROM, a 4 KB area overlapping $D000-$DFFF. If you have followed the other articles you know that this area is crowded. We have RAM under ROM, we have I/O-mapped registers, and now there is also the Character Generator ROM! How do we access what we need at a particular moment? To understand this, we need to look at the 6510 CPU and the VIC-II separately.

From a CPU perspective, $D000 to $DFFF is I/O-mapped memory including a 1 KB block of Color RAM. The Character Generator ROM is not visible to the CPU at this point; the CPU leaves it to the VIC-II to work with it as required. Since we usually want to access the Character Generator ROM at least once, to copy it to some other location in RAM, we first need to switch out the I/O-mapped area by changing the value at memory location $01, which is used to configure the various memory layouts of the C64. With this done, the CPU can see the Character Generator ROM and we can read it and copy everything to RAM, where we can change characters to our liking. After copying, don't forget to switch the I/O mapping back in!

From a VIC-II perspective, the Character Generator ROM is only visible in Bank 0 and Bank 2. In either bank the VIC-II sees it at location $1000-$1FFF. From the CPU's point of view the Bank 2 image of course sits at $9000, but the VIC-II does not know which bank it is looking at, so it always assumes it is watching a 16 KB block of memory starting at $0000. In the other two banks, 1 and 3, there is no Character Generator ROM shadow image available. That means if we want to point the VIC-II to Bank 1 or 3 but still need some of the original Commodore characters, we have to copy them to RAM in that bank beforehand. However, even when working in Bank 0 or Bank 2 we usually still want to copy the characters from ROM to RAM so we can modify them.
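To make the bank arithmetic concrete, here is a minimal Python sketch of where the Character Generator ROM image shows up, seen from the CPU's side, for each VIC-II bank. It assumes that Bank n starts at n * $4000; the function names are made up for illustration.

```python
BANK_SIZE = 0x4000  # the VIC-II always looks at one 16 KB bank


def bank_base(bank):
    """CPU address at which VIC-II bank 0..3 starts (assumed numbering)."""
    if bank not in (0, 1, 2, 3):
        raise ValueError("bank must be 0, 1, 2 or 3")
    return bank * BANK_SIZE


def char_rom_window(bank):
    """CPU address range of the ROM image in this bank, or None if absent."""
    if bank in (0, 2):
        base = bank_base(bank)
        # Inside the bank the VIC-II always sees the image at $1000-$1FFF.
        return (base + 0x1000, base + 0x1FFF)
    return None  # Banks 1 and 3: copy the characters into RAM yourself


for bank in range(4):
    window = char_rom_window(bank)
    if window is None:
        print(f"Bank {bank}: no Character Generator ROM image")
    else:
        print(f"Bank {bank}: ${window[0]:04X}-${window[1]:04X}")
```

Whatever the bank, the VIC-II itself only ever works with the offset $1000-$1FFF; the bank base is added outside the chip.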

Here is a routine to copy the complete Character Generator ROM in Bank 0 from its ROM location at $D000 to the RAM underneath at the same address:
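As a minimal sketch of that copy sequence, here is a small Python model standing in for 6510 code. The `Memory` class and `copy_char_rom` are hypothetical helpers, and the model only looks at bit 2 of $01 (the bit that chooses between I/O and Character Generator ROM at $D000-$DFFF), ignoring the other bits.

```python
CHAREN = 0b00000100  # bit 2 of $01: 1 = I/O visible, 0 = Character ROM visible


class Memory:
    """Toy model of the CPU's view of $D000-$DFFF (illustration only)."""

    def __init__(self, char_rom):
        assert len(char_rom) == 0x1000       # 4 KB of character data
        self.ram = bytearray(0x10000)
        self.char_rom = bytes(char_rom)
        self.ram[0x01] = 0x37                # default: BASIC, KERNAL and I/O in

    def _io_mapped(self, addr):
        return 0xD000 <= addr <= 0xDFFF and bool(self.ram[0x01] & CHAREN)

    def read(self, addr):
        if self._io_mapped(addr):
            raise RuntimeError("I/O is switched in; Character ROM not visible")
        if 0xD000 <= addr <= 0xDFFF:
            return self.char_rom[addr - 0xD000]
        return self.ram[addr]

    def write(self, addr, value):
        if self._io_mapped(addr):
            raise RuntimeError("I/O is switched in; write would hit a register")
        self.ram[addr] = value & 0xFF        # writes land in the RAM underneath


def copy_char_rom(mem):
    saved = mem.ram[0x01]
    # On the real machine interrupts have to be disabled first (SEI), because
    # the interrupt handlers expect the I/O registers to be visible.
    mem.ram[0x01] = saved & ~CHAREN & 0xFF   # switch I/O out, ROM in
    for addr in range(0xD000, 0xE000):
        mem.write(addr, mem.read(addr))      # read from ROM, write to RAM
    mem.ram[0x01] = saved                    # switch I/O back in (then CLI)


rom = bytes((i * 7) & 0xFF for i in range(0x1000))  # dummy data, not the real ROM
mem = Memory(rom)
copy_char_rom(mem)
print(bytes(mem.ram[0xD000:0xE000]) == rom)  # True
```

Reading and writing the same address works because reads come from the ROM while writes fall through to the RAM below it.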

Looking at the Character Sets

Each character in a set is stored as an 8x8 Bit matrix; in other words, every character takes up 8 Bytes of memory. Our 4 KB ROM area therefore has space for 512 characters, and this space is completely populated. The 512 characters are divided into two sets of 256 characters each; let's call them Set A and Set B. Each set is in turn split into standard and reversed (negative-image) versions of each individual character.
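The arithmetic can be checked with a few lines of Python. This is only a sketch: `char_address` is a made-up helper, and it assumes Set A occupies the first 2 KB of the ROM and Set B the second 2 KB.

```python
ROM_BASE = 0xD000          # where the CPU sees the ROM once it is switched in
ROM_SIZE = 4 * 1024        # 4 KB
BYTES_PER_CHAR = 8         # 8 rows of 8 Bits
CHARS_PER_SET = 256

TOTAL_CHARS = ROM_SIZE // BYTES_PER_CHAR
SET_SIZE = CHARS_PER_SET * BYTES_PER_CHAR


def char_address(char_set, index):
    """Address of the first Byte of character `index` (0-255).

    char_set 0 means Set A, char_set 1 means Set B (assumed layout).
    """
    if char_set not in (0, 1) or not 0 <= index < CHARS_PER_SET:
        raise ValueError("char_set must be 0 or 1, index 0-255")
    return ROM_BASE + char_set * SET_SIZE + index * BYTES_PER_CHAR


print(TOTAL_CHARS)                  # 512
print(SET_SIZE)                     # 2048
print(hex(char_address(0, 1)))      # 0xd008
print(hex(char_address(1, 0)))      # 0xd800
print(hex(char_address(0, 128)))    # 0xd400
```

The last line shows where the reversed half of Set A begins, 128 characters (1 KB) after the start of the set.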

You can only use one of the 2 KB sets, that is 256 characters, at a time. In BASIC interactive mode you can switch between Set A and Set B by pressing SHIFT + Commodore key. Everything currently on the screen changes at once to match the selected set. You cannot write to the screen with, say, symbols exclusive to Set A1 and then continue writing with lower case characters from Set B1 without affecting the symbols entered before.

That's why the ability to copy character sets into RAM and adapt them to one's use case is so important. Practically every game or demo uses a custom font, often just loosely based on one of the original Character Sets.

Four 128-Sets of Characters 

Microscoping into the Character Generator ROM

Let's examine how the characters are actually stored within the ROM. Every character is a consecutive series of eight Byte values representing the 8x8 Bit matrix. Looking at the Binary representation, we can see that a set Bit is drawn in the active foreground color, while a cleared Bit is drawn in the active background color.

Let's confirm that the Byte values for the letter A are $18, $3C, $66, $7E, $66, $66, $66 and $00, as shown in the Binary representation below, by comparing them with a dump of the Character Generator ROM.

The 'A' character and how it is stored in memory. Each of the 8 rows is a Byte value, stored consecutively in the Character Generator ROM.
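The eight Byte values for the letter A can also be turned back into the Bit matrix with a short Python sketch. The `render` helper is made up for illustration; `#` stands for a set Bit (foreground color) and `.` for a cleared Bit (background color).

```python
A_ROWS = [0x18, 0x3C, 0x66, 0x7E, 0x66, 0x66, 0x66, 0x00]


def render(rows):
    """Return one text line per Byte, most significant Bit on the left."""
    return [
        format(value, "08b").replace("1", "#").replace("0", ".")
        for value in rows
    ]


for value, line in zip(A_ROWS, render(A_ROWS)):
    print(f"${value:02X}  {line}")
```

This prints:

```
$18  ...##...
$3C  ..####..
$66  .##..##.
$7E  .######.
$66  .##..##.
$66  .##..##.
$66  .##..##.
$00  ........
```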

Character Generator ROM Dump

PETSCII - the Commodore way of not using ASCII

PETSCII stands for PET Standard Code of Information Interchange and is loosely based on the ASCII standard, but offers lots of symbols not available in the original ASCII set. The reason for a custom standard was the limitations of the older Commodore PET machines. Their Character Set could not be changed and there was no Bitmap mode for drawing graphics, unlike on the Commodore C64. So lots of symbols were added to make drawing to the screen somewhat flexible. A fun fact: the PETSCII set includes card suit symbols, the idea being that card games should be easy to program on Commodore systems. PETSCII is also often referred to as CBM ASCII. Besides letters and symbols, PETSCII also includes cursor and screen control codes, for example to clear the screen.

The following two screenshots show, on the left, Character Set A, which includes only upper case letters but all available symbols, and on the right Character Set B, which includes both upper and lower case letters. To make room for them, Character Set B has to go without some of the symbols. Character Set A is often referred to as the unshifted set and Character Set B as the shifted set. Note that the screenshots don't show cursor or screen control codes.

Unshifted Characters

Shifted characters

Screen Codes vs PETSCII

The final detail, which often leads to confusion, is the difference between the Byte values we put into Screen RAM to display text and the values PETSCII defines, which are used in BASIC, for example.

When we write Bytes to Screen RAM we count up from $00: the @ sign has Screen Code 0, the letter A has Screen Code 1, and so on. The reversed version of each character is reached by simply adding $80 (Decimal 128), since the second half of either Character Set is exactly 128 characters further on. Since 128 is the Decimal value of the highest Bit in a Byte, switching between standard and reversed characters is as simple as EOR'ing with %10000000, which I already explained in the Bit Manipulation article.
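Both ideas fit into a short Python sketch. The function names are hypothetical, and the conversion deliberately covers only the PETSCII range $20-$5F (space, digits, punctuation, @ and the unshifted letters); the remaining ranges follow other offsets and are left out here.

```python
REVERSE_BIT = 0b10000000  # $80, the highest Bit of a Byte


def petscii_to_screen_code(petscii):
    """Convert a PETSCII value in $20-$5F to its Screen Code."""
    if 0x20 <= petscii <= 0x3F:
        return petscii           # space, digits, punctuation keep their value
    if 0x40 <= petscii <= 0x5F:
        return petscii - 0x40    # '@' becomes 0, 'A' becomes 1, and so on
    raise ValueError("PETSCII range not covered by this sketch")


def toggle_reverse(screen_code):
    """Flip between the standard and the reversed version of a character."""
    return screen_code ^ REVERSE_BIT


a = petscii_to_screen_code(0x41)          # PETSCII 'A' is $41
print(a)                                  # 1
print(hex(toggle_reverse(a)))             # 0x81
print(toggle_reverse(toggle_reverse(a)))  # 1
```

Because the toggle is an exclusive OR, applying it twice gets you back to the original Screen Code, which a plain addition of $80 would not do.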

This should do for a start regarding the Character Generator ROM. We will revisit characters when we want to build custom fonts or use modified character sets as background graphics.