Z180
Memory Management
Copyright 1990,
Jack G. Ganssle
Abstract
Editor's Note: For Z80/180 projects Avocet
recommends the Avocet ADC Compiler, Assembler, Simulator,
IDE
The Z180 MMU
is confusing, but quite useful when well understood.
Published in
Circuit Cellar Ink, February 1990
Programs are
getting big! Part of today's shift towards 16 and 32 bit processors comes
from the need for correspondingly huge address spaces, since conventional
wisdom holds that a 512kb program just cannot fit in the 64k address space
of most 8 bit CPUs. Where performance is the overriding concern, a 32 bit
CPU may be the only solution. It does seem a shame to abandon all the accumulated
knowledge and code gleaned from two decades of 8 bit microprocessors just
to get more programming elbow room.
For the past
few years some 8 bit CPUs have been equipped with memory management units
(MMU) that free programs from most memory
limitations. It's tedious and complex to control an MMU manually; now, many
languages and other tools include built-in MMU support.
Logical-Physical
The problem of
memory management is easy to define: we need some way of connecting lots of
memory to a processor that just cannot handle
or address it. For example, we might want to put 512kb on a Z80. Since the
Z80 only generates 16 bit addresses, it can only directly
address 64k of RAM. Somehow, through memory management, we must expand this
capability.
For now let's
assume that magic hardware gives us more address lines. Perhaps it is as simple
as an I/O port loaded by the CPU with an
extra upper 8 address lines (A16 to A23), giving us a potential address space
of 16 mb. Or, it can be hideously complex, providing some
ability to access different sections of address space in wild and wonderful
ways.
In any event,
as soon as some external mechanism is added to translate addresses in some
fashion, the programmer suddenly must contend
with two very different sorts of address spaces.
"Physical"
memory is that actually connected to the hardware. For example, the 512kb
we attach to the sadly-overloaded Z80 is
physical memory. Its addresses range from 00000 to 7FFFF hex in a linear manner.
"Logical"
memory is the memory currently located in the processor's address space. Obviously,
if the computer can only issue
addresses in the range of 0000 to FFFF (0 to 64k), then some of the physical
memory is visible and some is not. As the code changes the
memory manager's settings different memory becomes visible. That which is
addressable at any time is the logical memory.
Thus, addresses
generated by the program are always logical addresses - they get translated
by some yet undefined hardware into real
physical addresses.
So, at one time
address 1000 logical might be translated into 28000 physical. Later, in the
same code, 1000 could correspond to 80000
physical. The old one-to-one mapping of addresses we're all familiar with
is gone!
In summary, addresses
used by the code are logical; the memory array sees physical. The memory
management unit (MMU) sits between the two.
Standard Architectures
Several years
ago Hitachi introduced the 64180, a high integration version of the venerable
Z80. While other vendors were trying to push
new proprietary architectures, Hitachi took what might seem a step backwards
towards the Z80. They realized an important fact of the
industry - customers had a fortune invested in Z80 code and were unwilling
to switch to an incompatible instruction set.
The 64180 is
a Z80 at heart. The designers resisted the temptation to add fancy new instructions
and addressing modes that could have made
it incompatible with the Z80. Rather, they integrated timers, serial ports,
and DMA controllers onto the chip. Even better, they added a
memory management unit to translate 64k logical addresses into a 1 mb physical
address space.
Now Hitachi sells
several other versions of the part. The 64180S is designed especially for
telecommunications. The 647180X is a
microcontroller version, containing a 64180 core, ROM, RAM, and parallel I/O.
Zilog stepped into the act, offering the Z180 (a second
source of the 64180) and Z280, a very high performance Z80 upgrade. Zilog
is just now announcing the Z181 and will soon offer a
microcontroller version of the part, probably a 647180X look-alike.
The most important
peripheral on the 64180-family processors is the memory management unit (MMU).
The MMU is a hardware device built onto
the processor's silicon. The MMU translates every memory address from 16 to
20 bits.
The 64180's MMU
uses three internal control registers. In keeping with the chip's design philosophy,
on reset the MMU gives a straight
logical to physical mapping, simulating the Z80 and, of course, limiting the
address space to 64k.
You can divide
the 64180's logical address space into one, two, or three areas. The logical
space itself is unaltered; even when divided
it is still a contiguous 64k.
CBAR (the Common/Bank Area Register) is an 8
bit I/O port that can be accessed by the processor's OUT and IN instructions.
The lower 4 bits specify the starting address
of the bank area, and the upper 4 give the start of common 1. Each 4 bit field
supplies the upper four bits of a logical address. If CBAR were A8,
then the bank area starts at 8000 logical, and common 1 starts at A000.
Common 0, if
it exists, always starts at logical 0000 and runs up to the bank area. The
bank area then runs to the start of common 1.
Therefore, you
can always understand the logical address space by examining the contents
of CBAR by itself. No other information is
needed.
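As a rough illustration of how CBAR alone defines the logical layout, the following C sketch decodes the register into the three area boundaries. The type and function names are invented for this example and do not come from any Hitachi or Zilog library.

```c
#include <stdint.h>

/* Hypothetical helper type: logical start address of each area. */
typedef struct {
    uint16_t common0_start; /* always 0000 */
    uint16_t bank_start;    /* low nibble of CBAR, times 4k */
    uint16_t common1_start; /* high nibble of CBAR, times 4k */
} logical_map;

/* Decode CBAR into the logical boundaries it implies. */
logical_map decode_cbar(uint8_t cbar)
{
    logical_map m;
    m.common0_start = 0x0000;
    m.bank_start    = (uint16_t)((cbar & 0x0F) << 12);
    m.common1_start = (uint16_t)((cbar >> 4) << 12);
    return m;
}
```

With CBAR set to A8 hex, decode_cbar reports a bank area starting at 8000 and common 1 starting at A000. When the low nibble is 0, the bank area begins at logical 0000 and common 0 is empty.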
The logical address
is only part of the problem. How does logical space get mapped to physical?
Two other ports provide the rest of the
answer.
BBR (the Bank
Base Register) specifies the starting physical address of the bank area
(remember, the logical start is in CBAR). CBR
(Common Base Register) provides the same information for common 1. Both of
these specify the upper 8 bits of the 20 bit physical address.
A simple formula
gives the translation from logical to physical address for the bank area:
Physical = Logical + (BBR * 4096)
The same formula
gives Common 1:
Physical = Logical + (CBR * 4096)
BBR and CBR give
the upper 8 address bits only - hence the 4096 multiplier. The lower 12 bits
come from the logical address. Thus, the
translation only affects the upper 8 bits; the lower 12 physical bits are
always identical to the lower 12 logical.
On reset, the
64180 sets CBAR to F0, and CBR=BBR=0. This maps logical to physical exactly,
with no translation; the bank area starts at
logical 0 and common 1 at F000 (since CBAR=F0), the bank area physically starts
at 0000 (BBR=0), as does common 1 (CBR=0). If the logical
address is 1000, then the MMU allocates this to the bank area (CBAR=F0; 1000
is less than the start of common 1 at F000), and adds the
physical base of bank to it (0), giving a translated address of 01000. Similarly,
logical F800 is in common 1, and translates to 0F800.
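The translation rules above can be sketched in a few lines of C. This is a software model of the MMU for illustration, not code that runs on the part; the struct and function names are hypothetical. It assumes, as on the real hardware, that common 0 is never relocated and that the result is truncated to 20 bits.

```c
#include <stdint.h>

/* MMU register contents; the struct itself is a hypothetical convenience. */
typedef struct {
    uint8_t cbar; /* high nibble: start of common 1, low nibble: start of bank */
    uint8_t bbr;  /* physical base of the bank area, in 4k units */
    uint8_t cbr;  /* physical base of common 1, in 4k units */
} mmu_regs;

/* Translate a 16 bit logical address into a 20 bit physical address. */
uint32_t logical_to_physical(const mmu_regs *mmu, uint16_t logical)
{
    uint8_t page         = (uint8_t)(logical >> 12); /* upper 4 logical bits */
    uint8_t bank_page    = (uint8_t)(mmu->cbar & 0x0F);
    uint8_t common1_page = (uint8_t)(mmu->cbar >> 4);
    uint8_t base;

    if (page >= common1_page)
        base = mmu->cbr;   /* common 1 */
    else if (page >= bank_page)
        base = mmu->bbr;   /* bank area */
    else
        base = 0;          /* common 0 (assumed never relocated) */

    /* Physical = Logical + (base * 4096), truncated to 20 bits. */
    return ((uint32_t)logical + ((uint32_t)base << 12)) & 0xFFFFFu;
}
```

With the reset values (CBAR=F0, BBR=CBR=0) the function returns 01000 for logical 1000 and 0F800 for logical F800, matching the worked example.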
The most important
point that can be made about the MMU is that it does not provide the 1mb linear
address space we all crave. After all,
Z80 instructions use 16 bit address operands and 16 bit register pointers
- there is no way to address a number larger than 64k. A jump
instruction will always have an argument that is 16 bits long - the logical
destination address. The MMU translates this logical address
to a possibly large physical number, but the software still operates in a
64k space.
This has a subtle
implication - logical address space is a valuable commodity that must be conserved.
Wasting physical memory isn't so
bad, since the 64180 can deal with up to 1mb. As an example, suppose that
your program will have three banks (COMMON 0, BANK, and COMMON
1). If the program is large you might want to bank it in and out of the BANK
area, leaving COMMON 1 for data. If BANK is too large, you
could be left with little data space - it is important to make BANK as small
as feasible to maximize the (in this case) unbanked data.
Language Support
Despite the fabulous
extra power offered by the 64180's MMU, we've all been making do with Z80
assemblers and compilers. Sure, some claim
to support the new processor's extended features, but in truth, until recently,
that support has been minimal.
Just what features
are important in a 64180 assembler or compiler? Certainly it should be fast,
efficient, and all that, but more than
anything else the language should give you some sort of way of handling the
MMU.
There are two
related but different aspects to MMU management. The first is to provide some
sort of mechanism to control the MMU with as
little programmer help as possible. An ideal solution would be a smart compiler
that simulates a nearly linear huge address space. The
second is to provide output files that contain compiled code and debugging
records in some manner that supports current 8 bit tools (like
the PROM programmer), but that accounts for the large address spaces.
Taking these
two criteria separately, especially with a C compiler we'd really like some
method of compiling an ordinary C program in
multiple banks. Sure, you might have to tell the compiler or linker about
your memory configuration, but ideally the tools should segment
and package functions into memory banks as needed. Even better, we'd want
it to remap the MMU automatically. Just like working with Turbo
C, we would like to be able to invoke a function through a conventional function
call, without worrying about its location in memory.
The second requirement
is not quite so obvious. How will you burn ROMs for the final project? If
the compiled/assembled code exceeds 64k,
there may be a problem with using standard Intel hex records for output. Every
ROM programmer in the world takes Intel hex input, but the
format only supports 16 bit addresses.
One solution
is to divide the source program into many separately compiled small pieces.
This is especially hard in C, since the linker
will not be able to resolve calls between pieces. Another approach is to ensure
that the compiler or assembler can produce "Type
2" Intel records. Whenever the code crosses a 64k bank the linker
could output a type 2 record to specify a new segment address
(physical address shifted right 4 bits). This does imply that the linker can
handle large physical addresses, and the PROM programmer can
accept type 2 records.
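To make the type 2 record concrete, here is a minimal C sketch that formats one for the 64k region containing a given physical address. The function name and buffer convention are assumptions made for this example; a real linker would emit such a record each time its output crosses into a new 64k region, followed by ordinary data records carrying 16 bit offsets.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format an Intel hex type 02 (extended segment address) record for the
 * 64k region containing `physical`. `out` must hold at least 16 bytes.
 */
void format_segment_record(uint32_t physical, char *out)
{
    /* Segment value = start of the 64k region shifted right 4 bits. */
    uint16_t segment = (uint16_t)((physical & 0xF0000u) >> 4);
    uint8_t  hi = (uint8_t)(segment >> 8);
    uint8_t  lo = (uint8_t)(segment & 0xFF);

    /* Record bytes: length 02, address 0000, type 02, then the segment. */
    uint8_t sum = (uint8_t)(0x02 + 0x00 + 0x00 + 0x02 + hi + lo);

    /* Checksum is the two's complement of the byte sum. */
    uint8_t checksum = (uint8_t)(0x100 - sum);

    snprintf(out, 16, ":02000002%02X%02X%02X", hi, lo, checksum);
}
```

For a physical address of 10000 hex the segment value is 1000, so the record reads :020000021000EC.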
Decent debugging
files are just as important as useful PROM files. You can't use an emulator,
simulator, or monitor to debug the code if
the debug records are inadequate. Suppose you wish to display the value of
a variable. The debugger must know the physical address of that
quantity, since only the physical address is constant. Remapping the MMU changes
its logical address, and at times no logical address
might correspond to the variable.
This implies
that the software packages must maintain both logical and physical addresses
for all lines and symbols. Compiling, say, jumps
requires logical addresses. All jumps and calls take logical addresses as
arguments (since they can only support a 16 bit number).
Physical addresses are needed in the debugging records so debuggers can unambiguously
resolve the location of symbols, functions, and line
numbers, all of whose logical addresses change with the current MMU setting.
Assemblers
Most 64180 programs
written in assembly language control the memory manager by tediously issuing
many MMU control instructions. The
programmer must first decide exactly what configuration logical memory will
assume, and then come up with CBAR, BBR, and CBR values for
every possible combination of banks. Then, the code must send these values
out to change maps. Needless to say, this takes a lot of work.
Softools (8770
Manahan Drive, Ellicott City, MD 21043, (301) 750-3733) came up with an interesting
approach that eliminates most of the
work. Their SASM assembler and linker will automatically drop in all the code
needed to bank a program. In effect, this means you can
write code as if the 64180 had a 1 mb linear address space.
Like most good
assemblers, SASM supports lots of named segments - up to 256. Most of the
time we assembly programmers just need a CODE,
DATA, and ASEG segment, but SASM's segmentation lets us break a program into
mapped and unmapped sections. When using SASM on large
programs, you can assign any segment or segments to have a "mapped"
attribute, identifying those that require some MMU
manipulation to bring them into the address space.
Segments are
the key to SASM's mapping scheme. The linker identifies how much data the
program uses and the number of bytes used for
unmapped code (that which must never be mapped out). It computes CBAR to define
the characteristics of the runtime logical address space:
COMMON 0 being just big enough to hold all the unmapped code, COMMON 1 containing
the data, and the rest, the BANK area, is allocated to
mapped routines.
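The CBAR computation described here can be approximated as follows. This is only a sketch of the idea, not SASM's actual algorithm; the function name is hypothetical, and it assumes unmapped code starts at logical 0000 and data ends at the top of the 64k logical space.

```c
#include <stdint.h>

#define MMU_PAGE 0x1000u  /* 4k mapping resolution */

/*
 * Compute CBAR from the size of the unmapped code (COMMON 0) and the size
 * of the data (COMMON 1). Returns -1 when no room is left for a BANK area.
 */
int compute_cbar(uint32_t unmapped_code_size, uint32_t data_size)
{
    uint32_t bank_start;
    uint32_t common1_start;

    if (unmapped_code_size > 0x10000u || data_size > 0x10000u)
        return -1;

    /* BANK starts on the first 4k boundary at or above the unmapped code. */
    bank_start = (unmapped_code_size + MMU_PAGE - 1) & ~(MMU_PAGE - 1);

    /* COMMON 1 starts on the last 4k boundary that still holds all the data. */
    common1_start = (0x10000u - data_size) & ~(MMU_PAGE - 1);
    if (common1_start > 0xF000u)
        common1_start = 0xF000u;   /* the CBAR high nibble cannot exceed F */

    if (bank_start >= common1_start)
        return -1;

    return (int)(((common1_start >> 12) << 4) | (bank_start >> 12));
}
```

Given 3780h bytes of unmapped code and 3200h bytes of data, the figures used in the example later in this article, the function returns C4: COMMON 0 ends at 3FFF, BANK runs from 4000, and COMMON 1 starts at C000.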
The linker groups
all mapped segments together and starts to assign both logical and physical
addresses to each routine. Whenever a
routine will exceed the size of the BANK area the linker moves it to the start
of a new BANK area. It then converts all jumps and calls
between banked areas to transfers to code that manages the MMU in COMMON 0.
When finally
linked, the program has three parts - a COMMON 0 non-mapped (i.e., always
in the address space) area which typically contains
startup code, frequently-used routines, and SASM's banking code. COMMON 1
is usually your data area. The BANK area contains most of the
program code. Calls between these banked routines will cause remapping as
needed to bring in ones that are not currently visible in the
address space.
For example,
suppose the program is as follows:

On
a 64180 the reset jump is at 0, so it makes sense to put the unbanked code
(vectors and main) at 0. The data area cannot be banked
(especially the stack!) and is traditionally in high memory. Suppose the code
that starts at physical location 0 is to go into ROM, and
the data that starts at physical 40000h is in RAM. The linker will first divide
the logical address space based on the unmapped memory
requirements: main and vectors need 3780h bytes starting at location 0, and
data occupies 3200h at the end of the logical space. Bearing
in mind that the mapping resolution of the 64180 is 4k, memory thus looks
like:
COMMON 0   0000 - 3FFF   vectors and main (unbanked code)
BANK       4000 - BFFF   banked routines
COMMON 1   C000 - FFFF   data

All
the logical address space from 4000 to bfff is available to routines that
can be banked. If the sum of the banked sizes is less than
the BANK logical area, then no mapping need take place. In our example, however,
banked routines need some 64k, much more than the
available logical space. If CBAR is C4 (COMMON 1 at c000 and BANK at 4000),
SASM will assign addresses as follows:

For
sub1 SASM assigned a logical and physical address of 4000 - reasonable, since
this is the first free spot after COMMON 0. sub1 is in
BANK, so a BBR value is required. BBR=0 will map 4000 to 04000. sub2 follows
sub1, again with BBR=0. So far, no surprises.
sub2 ends at
b8f0, practically right before the logical start of data (c000). There is
no way sub3 can fit, since sub3 is 1200 hex bytes long.
SASM therefore put sub3 at logical address 4000 (the same as sub1). sub3 follows
in physical memory at 0c000 (the next physical address
rounded up to a 4k boundary). BBR equals 08. sub1 and sub3 occupy the same
place in logical address space (4000), but different physical
addresses. To get to sub3, BBR must be set to 08 and a logical address of
4000 issued.
While sub3 is
very short, leaving plenty of room for code in the same bank map, sub4 is
not. sub3 and sub4 will not both fit into BANK
together, so SASM once again reset the logical address to 4000. sub4 comes
after sub3, rounded up 4k, and a BBR of 0a is assigned.
(Remember the math - BBR * 4096 + logical = physical, so 0a * 4096 + 4000
= e000). sub5 fits into the space between the end of sub4 and
COMMON 1, and is so assigned.
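The assignment procedure walked through above can be sketched in C. This is a guess at the general shape of such an allocator, not Softools' implementation. It assumes the first bank maps one-to-one (BBR=0), that no single routine is larger than the BANK area, and that routines are placed strictly in order; all names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

#define MMU_PAGE 0x1000u  /* 4k mapping resolution */

typedef struct {
    uint32_t size;      /* input: routine size in bytes */
    uint16_t logical;   /* output: logical start address */
    uint32_t physical;  /* output: physical start address */
    uint8_t  bbr;       /* output: BBR value that maps the routine in */
} routine;

/*
 * Assign addresses to routines in order. bank_start and bank_end are the
 * logical limits of the BANK area (4000 and C000 in the example).
 */
void assign_banks(routine *r, size_t count, uint32_t bank_start, uint32_t bank_end)
{
    uint32_t logical  = bank_start;
    uint32_t physical = bank_start;   /* first bank maps one-to-one (BBR=0) */
    size_t i;

    for (i = 0; i < count; i++) {
        if (logical + r[i].size > bank_end) {
            /* Does not fit: start a new bank on the next 4k boundary. */
            physical = (physical + MMU_PAGE - 1) & ~(MMU_PAGE - 1);
            logical  = bank_start;
        }
        r[i].logical  = (uint16_t)logical;
        r[i].physical = physical;
        r[i].bbr      = (uint8_t)((physical - logical) >> 12);
        logical  += r[i].size;
        physical += r[i].size;
    }
}
```

Feeding it invented sizes of 3000h, 48F0h, 1200h, 7000h, and 800h for sub1 through sub5 (only sub3's length comes from the text) reproduces the assignments in the walk-through: sub2 ends at B8F0, sub3 lands at logical 4000 and physical 0C000 with BBR 08, and sub4 lands at logical 4000 and physical 0E000 with BBR 0A.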
SASM's linker
generates address assignments as we've just seen, but how are calls and jumps
between subroutines handled? Obviously, if
sub1, sub3, and sub4 all reside at logical address 4000, a simple CALL 4000
will not always resolve properly. As mentioned earlier, SASM's
linker converts all inter-BANK calls and jumps to a transfer to a jump table
which is usually linked into COMMON 0. In particular, if sub4
were to call sub1, the following will be automatically substituted for the
call instruction:
call bank_table+x   ; invoke MMU handler
db   BBR            ; BBR of sub1
dw   sub1           ; address of routine to call
The code in bank_table stores the current BBR value and return address on
a local stack, remaps the MMU by outputting the indicated BBR,
and then transfers to the logical address supplied as a parameter. Returns
operate in a reverse procedure, being vectored to another
Softools-supplied routine to reverse the mapping.
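The save, remap, call, and restore sequence can be modeled in C to show the control flow. This is a simulation only: a variable stands in for the BBR port, and a function pointer stands in for the logical address. The real handler is assembly code living in COMMON 0, and every name below is hypothetical.

```c
#include <stdint.h>

#define BANK_STACK_DEPTH 16

static uint8_t current_bbr;                  /* stands in for the BBR port */
static uint8_t bbr_stack[BANK_STACK_DEPTH];  /* local stack of saved BBRs */
static int     bbr_sp;

typedef int (*banked_fn)(int);

/* Call `fn`, which lives in the bank selected by `target_bbr`. */
int banked_call(uint8_t target_bbr, banked_fn fn, int arg)
{
    int result;

    if (bbr_sp >= BANK_STACK_DEPTH)
        return -1;                     /* nesting too deep for this sketch */

    bbr_stack[bbr_sp++] = current_bbr; /* save the caller's mapping */
    current_bbr = target_bbr;          /* real code: OUT to the BBR port */
    result = fn(arg);                  /* transfer to the routine */
    current_bbr = bbr_stack[--bbr_sp]; /* return path: restore the mapping */
    return result;
}

/* Example banked routine: reports which BBR value was active when it ran. */
int report_bbr(int unused)
{
    (void)unused;
    return current_bbr;
}
```

If the caller is running with BBR=0A and invokes banked_call(0x00, report_bbr, 0), the routine sees BBR=00 while it runs, and the caller's value of 0A is back in place on return.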
What might not
be entirely obvious is that SASM does it all. Once you tell SASM's linker
where ROM and RAM are (which has to be done for
any linker) it automatically allocates logical and physical addresses. The
linker also replaces the calls and jumps as shown above. SASM
does offer options to control memory allocation and the like, but in most
cases these are not needed.
This means that
you can write large programs without ever considering the MMU. SASM takes
care of it all. There's an interesting subtle
implication - you can link a 256k program to take up only 8k or so of logical
space! Assign 4k for COMMON 0 and 56k for data. SASM will
bravely partition the 256k code into 50 or more sections, each of which will
get remapped through the 4k BANK area. The mapping overhead
might get high, but logical address space will be conserved.
Since SASM partitions
the program during the link phase, it can save the addresses of all symbols,
line numbers and other parameters in a
debug file. Symbols' physical addresses are stored in the debug file, maintaining
true addresses regardless of the MMU mapping. If the
debugger (emulator or monitor) can handle physical addresses, then you can
access any routine, variable, or source line number at any
time, without manually remapping to bring the desired value into logical address
space.
C Compilers
As we write this,
only three compilers currently support 64180 bank switching. Archimedes Software's
C180, Whitesmiths' C, and Software
Development Systems' C all automatically generate code for large memory models.
Manx (MANX Software Systems, P.O. Box 55, Shrewsbury, NJ
07701) will soon have a compiler, and no doubt others are on the way.
Archimedes' (2159
Union Street, San Francisco, CA, 94123 (415) 567-4010) approach to memory
management is much like that used by SASM. As
you write your C code you do not need to be especially concerned with the
MMU. There are no special procedures to use or functions to
invoke.
Before linking
the compiled object files various parameters must be passed to the linker
in its indirect command file. The first are
values for CBAR and CBR. It is the programmer's responsibility to determine
exactly the memory configuration, and to compute these simple
values.
In addition to
the MMU register settings, the programmer must provide the linker with a table
of modules (i.e., files of source code, each
of which may contain several functions) and names. If the module is not to
be banked, that must be indicated as well.
The memory model
supported by the compiler puts all non-banked functions into COMMON 0, the
banked code into BANK, and data areas into
COMMON 1.
The linker generates
a table ("FLIST") of data about every mapped function in
the program. For each function, FLIST gives an
encrypted BBR value, logical start address, and bank number. FLIST is a sort
of global cheat sheet, located in COMMON 0 so it is never
mapped out, that describes every function's logical and physical address.
The linker replaces
all mapped function calls with:
ld hl,<FLIST entry for the function>
call remap_code
The remap_code
extracts pertinent data from the FLIST entry and remaps the MMU as needed
before branching to the function's logical
address.
The beauty of
using an FLIST table is that pointers to functions will work - the pointer
becomes an FLIST pointer. With FLIST always
mapped in, indirect function invocations will work even to mapped functions.
The Archimedes
compiler does produce a good debugging file, which contains useful information
about the physical address of every
function. The information is stored in FLIST. A conversion routine easily
extracts the real physical address of each function, line
number, and global symbol.
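A conversion routine of the kind mentioned might look like the following C sketch. The FLIST entry layout shown is purely hypothetical, since the actual Archimedes format (including how the BBR value is encoded) is not given here; the sketch assumes the entry holds a plain BBR value.

```c
#include <stdint.h>

/* Hypothetical FLIST entry; the real Archimedes layout is not documented here. */
typedef struct {
    uint8_t  bbr;      /* BBR value that maps the function in (assumed plain) */
    uint16_t logical;  /* logical start address within BANK */
    uint8_t  bank;     /* bank number */
} flist_entry;

/* Recover the 20 bit physical address of a banked function. */
uint32_t flist_physical(const flist_entry *e)
{
    /* Same rule as the MMU: Physical = Logical + (BBR * 4096). */
    return ((uint32_t)e->logical + ((uint32_t)e->bbr << 12)) & 0xFFFFFu;
}
```

An entry with BBR=08 and a logical address of 4000 yields a physical address of 0C000, which is what a debugger would want to store for that function.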
Whitesmiths (733
Concord Ave., Cambridge, MA 02138, (617) 661-0072) took a somewhat different
approach to using the MMU. When writing C
code for this compiler, all calls to mapped functions must be specified as
FAR calls. This directs the compiler to generate the proper
code to bring the function into the map and execute it.
The called function
needs no special handling, since it can be called as either a FAR or as a
near. For example, if a function in one
module invokes another in the same module, it can use a conventional call
structure. Only if a different function, possibly located
outside of this bank, calls the same function, does the extra call overhead
have to be inserted.
A typical call
sequence looks like:
@far int sub1();
main()
{
    sub1();
}
The function
is declared FAR, and then all references to it within that module
generate banked calls.
All banked calls
do produce overhead, both in code size and speed. The Whitesmiths approach
eliminates the overhead in cases where it is
not needed. Archimedes, on the other hand, vectors all banked function calls
through FLIST, even if both the caller and callee are co-resident
in BANK.
Whitesmiths uses
the indirect linker command file to indicate the location of every banked
function. The programmer provides both the
logical and physical addresses of each of these functions. Again, the peril
here is having to go through iterative modifications of these
parameters during development.
To date, no compiler
is smart enough to automatically set banked addresses. Perhaps soon this will
change.
The compiler
generates a call to library routine c.libc to do the bank switching. Space
is allocated on the stack for the return address
and return BBR. c.libc gets a "far pointer" to the function
so it can reset BBR and the logical address.
Uniware, from
Software Development Systems (4248 Belle Aire Lane, Downers Grove, IL 60515
(800) 448-7733) implements bank switching by
simulating linker overlays. In other words, the compiler and linker are not
even aware that the 64180 processor has an MMU; each mapped
function appears to be an overlay.
Like the other
compilers, the Uniware compiler breaks memory into three sections. Only the
middle area is mapped dynamically. Your
initialization code must preset CBAR and CBR to their static values.
The indirect
linker file specifies which functions are to be mapped, and each function's
BBR value. The linker changes the normal call
sequence to:
ld c,<BBR>
ld IY,<function logical address>
call _call
_call is a low
level routine in COMMON 0 that remaps the MMU and vectors off to the function.
You must give
the linker much more information than for the Archimedes product, so Uniware
is a bit harder to use. One of the nice side
benefits of this approach, though, is that it is directly applicable to Z80
bank switched applications. You just have to modify the _call
code to handle your proprietary hardware.
The Emulator
In the embedded
world generating code is only a small part of the development battle. Somehow
it must be tested and debugged. The only
suitable tool for debugging embedded code is an In Circuit Emulator, since
only the emulator lets you interactively isolate bugs in a
ROMed environment.
Like compilers,
64180 emulators are all basically extensions of technology developed for the
Z80. After all, the timing is similar and the
software is practically the same. Unfortunately, the extra four address bits
found on the 64180 can cause lots of emulation problems.
On a Z80 the
logical and physical address spaces are the same. Not so for the 64180 - only
by knowing the MMU values can the translation
take place. It's therefore crucial that the emulator can handle physical addresses,
since only physical ones never change.
While this seems
fairly obvious it can be difficult to implement. Emulators use the 64180 for
all target memory accesses, so the machine
cycles are identical to those expected with a processor in the socket. A translation
from desired physical address back to logical, CBAR,
CBR, and BBR must take place, since the 64180's code can only issue logical
addresses.
In other words,
if memory at physical address 20000 is to be displayed, then some routine
has to figure out settings for all three MMU
registers, plus a logical address, that the 64180 can use to access the memory.
Not a trivial task.
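One simple way to pick those settings, shown here as a C sketch, is to start the bank area at logical 0000 and slide it over the 4k page that holds the target. This is just one of many possible strategies and is not taken from any particular emulator; the names are hypothetical. A real emulator would also have to save and restore the user's CBAR, BBR, and CBR around the access, which is omitted.

```c
#include <stdint.h>

typedef struct {
    uint8_t  cbar;
    uint8_t  bbr;
    uint8_t  cbr;
    uint16_t logical;
} mmu_access;

/*
 * Pick MMU settings and a logical address that reach `physical`.
 * The bank area is placed at logical 0000 and BBR selects the 4k page.
 */
mmu_access access_for_physical(uint32_t physical)
{
    mmu_access a;

    physical &= 0xFFFFFu;                   /* 20 bit physical space */
    a.cbar    = 0xF0;                       /* bank at 0000, common 1 at F000 */
    a.bbr     = (uint8_t)(physical >> 12);  /* upper 8 physical bits */
    a.cbr     = 0x00;                       /* common 1 is not used for the access */
    a.logical = (uint16_t)(physical & 0x0FFFu); /* lower 12 bits pass through */
    return a;
}
```

For physical address 20000 this gives CBAR=F0, BBR=20, and a logical address of 0000. Checking with the translation formula: 0000 + (20h * 4096) = 20000.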
As users we don't
care what the emulator does or how it works. All we're concerned with is the
debugging interface - the source level
debugger (SLD) that runs on a PC and communicates with the emulator over RS-232.
If we type DISPLAY SYMBOL FOO, then we want to see the
value of FOO, no matter where it is or how the MMU is set up. The SLD must
therefore know about FOO's physical address.
Fortunately,
all the products mentioned generate physical symbol addresses in the debugging
files. The SLD can send these values down to
the emulator and let it deal with coming up with the proper address.
This does mean
that the SLD/emulator interface is completely linear, like the 68000's. You
can randomly access any location in the
target's memory just by typing in the right address.
What if you wish
to see a logical address? Is this important? Herein lies a source of confusion.
Only physical addresses unambiguously
identify each public symbol and line number. Your program works through logical
addresses - the two are not the same or even similar.
Looking at disassembled code, you might see a LD A,(1000). The 1000 is logical
- its physical equivalent depends on the current MMU
mapping.
Avocet's Softaid
UEM emulators get around this problem by letting you suffix any address with
a tilde to indicate that the logical address is needed, rather than the default
physical. Of course, the emulator will use the current MMU setting to access
the memory, so if the MMU is not set up as it would be when executing that
instruction, the data may not be correct. Normally this is not a problem -
you debug in your execution context, rather than randomly hunting through code.
64180 registers
are 16 bits long. When used as pointers, they form logical addresses, creating
the same sort of problem just mentioned.
Again, when displaying the contents of a register pointer, that will be logical.
If you ask for a dump of memory at the address in HL,
what will result? The correct solution is to use indirect register references
as logical (saving the bother of suffixing a tilde all the
time), since this is what the programmer really wants.
In C, an automatic
pointer will be stored as a 16 bit value on the stack. Suppose, while debugging,
you wish to dump *ptr? In other words,
display the data pointed to by ptr, which is presumably on the stack. Again,
only one correct solution exists: get the stack pointer,
convert it to physical using the current MMU, extract the 16 bit value of
ptr from the stack, make that physical, and then access the
destination address.
Conclusion
The 64180 family
solves a long standing Z80 problem - that of handling more memory. Lots of
current Z80 applications can be easily ported
to the 64180 to take advantage of the larger memory model and high integration
peripherals. Don't try to get away with Z80-style
development tools - select assemblers, compilers, and debuggers that exploit
the 64180's resources to ease your development efforts.