Filling Memory

Filling All Memory

For no particular reason a question popped into my head: "Is there a Z-80 program that will fill all 64K of addressable memory with the same value?" At first I thought, "Sure, why not?" After all, putting stuff into memory is something processors are designed to do. There's a problem, though, when you get down to the last few bytes as the program that is filling memory has to write the value over itself. It's a little like trying to paint a floor while staying in the room at all times. Eventually you'll have to figure out a way to paint without standing on the floor.

All you Z-80 programmers out there might want to stop and think about this problem for a bit if you like solving puzzles. The first thing that jumps to mind is using LDIR to fill memory. Here's something that looks like it should fill memory with the contents of A register:

        ld      hl,first
        ld      de,first+1
        ld      bc,0
        ld      (hl),a
        ldir
first:

It will only work for two values, $ED and $B0. Why? Consider what happens when the LDIR is about to overwrite itself. LDIR has a two byte opcode, $ED $B0. If A is $B0 then the $ED will be replaced with $B0 and all of memory will have the same value. For $ED the value written will change nothing and LDIR will execute once again, this time writing $ED over $B0 and all or memory will be filled with $ED.

Suppose the value is 0 which is the opcode for NOP. Now when the $ED is overwritten the LDIR instruction will try to execute itself again only to find a NOP instruction there. Thus it won't overwrite the $B0 byte and the program fails.

Actually, there might be one or two other values that we could fill memory with using LDIR. The trick being that the value must correspond to a store-to-memory instruction that happens to wipe out the $B0 when it is executed. Which is fine, but I think we can agree that we're not going to be able to fill memory with all possible values using this approach.

Doing something with an explicit loop only looks worse -- it just makes for more code to be wiped out. Despite that, there is a solution and it is quite wonderfully short:

        org     $8000
start:  ld      d,a
        ld      e,a
        ld      hl,fill
        ld      sp,hl
fill:   push    de
        jp      (hl)
        end     start

The PUSH instruction is one of the few Z-80 instructions that will write two bytes at a time. Even better, it decrements the stack pointer by two so we don't have to update a memory pointer. Continuously looping is a bit of an issue as the shortest branch instruction on the Z-80 takes 2 bytes. There are ways we could use a JR or JP instruction, but it is just as easy to use the single byte JP (HL).

Memory Filled. Now What?

I don't know about you, but I'm not one to leave well enough alone. Having filled memory I now wonder what happens after that? If memory fills with zeros we end up with the Z-80 executing NOP instructions forever. That's nice because memory will remain filled with zeros forever. What about the other values? Only values that correspond to instructions that write memory can be a problem. Anything else will not change memory. Most Z-80 instructions use a single byte to specify the operation but there are 4 special so-called prefix values where a second byte is used: $CB, $ED, $DD and $FD. Luckily those do not present any special problems. $CB $CB is SET 1,E which merely modifies E register. $ED $ED is an undefined instruction which does no operation at all.

$DD and $FD are a little bit more interesting. Normally they are used to indicate the instruction uses the IX and IY registers respectively. At the hardware level they only do do things. One is to set an internal flag saying that the IX/IY instructions are to be used and the other is to disable any interrupts from happening until to get to an opcode. If you have a number of them in a row the net effect is to keep interrupts at bay and (possibly) change which index register (IX or IY) to use. Thus they'll maintain the memory state and will treat all of memory as one giant instruction that never finishes. This amuses me. The second longest Z-80 instruction is 65536 bytes and the longest is infinite bytes.

The instructions that directly write a value to memory are easy enough to handle. They are:

        ld	(bc),a
        ld	(de),a
        ld	(nn),hl
        ld	(nn),a
        ld	(hl),n
        ld	(hl),b
        ld	(hl),c
        ld	(hl),d
        ld	(hl),e
        ld	(hl),h
        ld	(hl),l
        ld	(hl),a

Easiest of all is LD (HL),N. That writes an immediate value to memory location HL. But the immediate value will be that value memory is filled with so it will have no effect. All the others will be just fine as long as A, B, C, D, E, H and L registers are loaded with the value before we start the PUSH DE; JP (HL) loop. HL is slightly inconvenient as it means we'll have move the two byte loop to different locations depending on the value used.

The trick works for the PUSH instructions except that we need not just A loaded with the value but the F flag register as well otherwise PUSH AF will break the state. Officially the Z-80 makes no promises that the F register can store a full 8 bits. It turns out F register will store a full 8 bits and a PUSH HL; POP AF will end up with AF having identical contents to HL.

The CALL instructions and the special purpose RST call instructions is where our luck runs out. They write to memory by putting the return address on the stack. If we're very careful and make the return address of the call equal to value*257 then the first time they run they'll not alter memory. But then we'll run another CALL instruction and this time the return address pushed will not be value*257 and the memory state will be broken.

Actually, only the unconditional CALL is a problem. For conditional calls we only need set up the F register to make the condition false. The calls won't occur and memory will remain unchanged. Here we get two lucky breaks. First, rather than have to handle each of the eight conditional calls individually we can use bit 3 of the opcode. If it is set then the flag register should be 0 to make the condition false and to $FF if bit 3 is clear. Second, the condition handling still works if we use $F5 as the value. That's super convenient because we can blindly the "bit 3" rule to load F register. It gets the right answer for the conditional CALL instructions and it also loads F register with $F5 when given the value $F5 which is the opcode for PUSH AF. One rule handles all the cases of loading the flags register!

There are two remaining troublesome instructions: INC (HL) and DEC (HL). Clearly when they execute they'll break the state. But if we set up things just right memory will return to the same value state every 256 instructions. Not perfect, but something to aim at.

Putting it all together, here's a program that fills all of RAM with the value in A register and maintains that state as long as possible. Which is indefinitely for most values, 1 instruction for the RST and unconditional CALL instructions and once every 256 instructions for INC/DEC (HL).

        org     $8082
start:  ld      h,a
        ld      l,a
        ld      b,a
        cp      $cd
        jr      z,iscall
        cp      $34
        jr      z,idhl
        cp      $35
        jr      nz,notid
idhl:   ld      sp,hl
        pop     de
        ld      de,$e9d5
        inc     sp
        push    de
        ld      d,a
        jr      cont
notid:  and     $c7
        cp      $c7
        jr      nz,norst
        inc     hl
        inc     hl
iscall: dec     hl
        dec     hl
        dec     hl
        dec     hl
norst:  ld      sp,hl
        pop     de
        ld      de,$e9d5
cont:   push    de
        ld      a,b
        and     8
        add     $f8
        sbc     a,a
        cpl
        and     $f5
        ld      c,a
        push    bc
        pop     af
        ld      c,a
        ld      d,a
        ld      e,a
        jp      (hl)
        end     start

I tried to make the program as compact as possible but no doubt it could be made smaller. The "iscall" and "norst" make it pretty clear where the CALL and RST cases are handled and "idhl" is where INC/DEC (HL) is done. There's a cute little bit in there that Lamar Owen suggested. The LD D,A puts a INC (HL) or DEC (HL) as one of the instructions executed in the fill loop and has it operate on itself. This might seem bad, but INC (HL) has opcode $34 so it changes itself into DEC (HL) which has opcode $35. And it changes itself back to INC (HL) so the instruction harmlessly flips between the two values.

Other Processors

With multi-byte push instructions I'm sure the 6809 could use the same trick to fill memory. Though its more complex instruction set will likely make it harder to maintain the state. The same goes for 68000. I'm doubtful the 6502 could do an arbitrary memory fill. Since I'm only using 8080 instructions it can obviously do it. And I'll bet there's a way for an 8086. ARM has multi-register move instructions so its easy. MIPS might well be a problem in 32 bit mode but 64 bit variants probably have a chance. Though on advanced processors with instruction caches the problem can be solved sneakily. Just fill memory with an ordinary loop. The loop will quickly go into the instruction cache and can write "over" itself with impunity. In fact, I'm pretty sure that's how I ended up erasing all the RAM in a PlayStation 2 game I was working on, but that's a story for another time.

George Phillips, May 8, 2016. george -at- 48k.ca