Avocet Systems, Inc. : The Complete Solution for Embedded Systems Development Tools
Banking Basics

Copyright 1996, Jack G. Ganssle

Abstract

What do you do when you run out of address space? Go to a bigger processor? Maybe, but another option is building a memory manager.

Nelson Rockefeller, when asked how much money is enough, reportedly replied "just a little bit more." We poor folks may have trouble
understanding his perspective, but all too often exhibit the same response when picking the size of the address space for a new design.
Given that the code inexorably grows to fill any allocated space, "just a little more" is a plea we hear from the software people all too
often.

Is the solution to use 32 bit machines exclusively, cramming a full 4 GB of RAM into our cost-sensitive application in the hopes that no
one could possibly use that much memory?

Though clearly most systems couldn't tolerate the costs associated with such a poor decision, an awful lot of designers take a middle
tack, selecting high end processors to cover their (ahem) posterior parts.

32 bit CPUs have tons of address space. 16 bitters sport (generally) one to 16 MB. It's hard to imagine needing more than 16 MB for a
typical embedded app; even 1 MB is enough for the vast majority of designs.

A typical 8 bit processor, though, is limited to 64k. Once this was an ocean of memory we could never imagine filling. Now C compilers let
us reasonably produce applications far more complex than dreamed of even a few years ago. Today the mid-range embedded systems I see
usually burn up something between 64k and 256k of program and data space - too much for an 8 bitter to handle without some help.

If horsepower were not an issue I'd simply toss in an 80188 and profit from the cheap 8 bit bus that runs 16 bit instructions over 1 MB of
address space. Sometimes this is simply not an option; an awful lot of us design upgrades to older systems. We're stuck with tens of
thousands of lines of "legacy" code (sounds more like the name of a car than a technical term) that are too expensive to change. The code
forces us to continue using the same CPU. Like taxes, programs always get bigger, demanding more address space than the processor can
handle. Whatcha gonna do?

Perhaps the only solution is to add address bits. Build an external mapper using PLDs or discrete logic. The mapper's outputs go into high
order address lines on your RAM and ROM devices. Add code to remap these lines, swapping sections of program or data in and out as
required.

Logical to Physical

Add a mapper, though, and you'll suddenly be confronted with two distinct address spaces that complicate software design.

The first is the physical space - the entire universe of memory on your system. Expand your processor's 64k limit to 256k by adding two
address lines, and the physical space is 256k.

Logical addresses are the ones generated by your program, and thence asserted onto the processor's bus. Executing a MOV A,(0FFFF)
instruction tells the processor to read from the very last address in its 64k logical address space. External banking hardware can
translate this to some other address, but the code itself remains blissfully unaware of such actions. All it knows is that some data comes
from memory in response to the 0FFFF placed on the bus. The program can never generate a logical address beyond the top of its 64k space
(0FFFF for a typical 8 bit CPU with 16 address lines).
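
The arithmetic behind these two spaces is just powers of two. As a minimal sketch (the function name is mine, and it assumes fewer than 32 address lines):

```c
#include <stdint.h>

/* Bytes reachable with the given number of address lines.
 * Assumes address_lines is less than 32. */
uint32_t address_space_bytes(unsigned address_lines)
{
    return (uint32_t)1 << address_lines;
}
```

With 16 lines this gives the 64k logical space; adding two mapper outputs for a total of 18 lines gives the 256k physical space mentioned above.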

This is very much like the situation faced by 80x86 assembly language programmers. 64k segments are essentially logical spaces. You can't
get to the rest of physical memory without doing something; in this case reloading a segment register.

Conversely, if there's no mapper then the physical and logical spaces are identical.

Hardware Issues

Consider doubling your address space by taking advantage of processor cycle types. If the CPU differentiates memory reads from fetches you
may be able to easily produce separate data and code spaces. The 68000's seldom-used function codes are for just this purpose, potentially
giving it distinct 16 MB code and data spaces.

Writes should clearly go to the data area (you're not writing self-modifying code, are you?). Reads are more problematic. It's easy to
distinguish memory reads from fetches when the processor generates a fetch signal for every instruction byte. Some processors (e.g., the
Z80) produce a fetch only on the read of the first byte of a multiple byte opcode; subsequent ones all look the same as any data read.
Forget trying to split the memory space if cycle types are not truly unique.
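
To see why partial fetch signals break the split, here is a small model of the decode. It is only a sketch: the type and function names are made up for illustration, and the `fetch_every_byte` flag stands in for the difference between a CPU that flags every instruction byte and one that flags only the first.

```c
typedef enum { CYCLE_FETCH, CYCLE_READ, CYCLE_WRITE } cycle_type;
typedef enum { SPACE_CODE, SPACE_DATA } memory_space;

/* Decode which memory array a bus cycle should select:
 * fetches go to code, all reads and writes go to data. */
memory_space space_for_cycle(cycle_type cycle)
{
    return (cycle == CYCLE_FETCH) ? SPACE_CODE : SPACE_DATA;
}

/* Cycle type the CPU reports for byte `index` of an instruction.
 * When fetch_every_byte is 0, only the first byte is flagged as a
 * fetch; later bytes look like ordinary data reads. */
cycle_type instruction_byte_cycle(unsigned index, int fetch_every_byte)
{
    if (index == 0 || fetch_every_byte)
        return CYCLE_FETCH;
    return CYCLE_READ;
}
```

In this model the second byte of an instruction decodes to the code space only when every byte is flagged. Otherwise the decoder sends it to the data space, and the CPU picks up an operand from the wrong memory.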

When such a space splitting scheme is impossible, build an external mapper that translates address lines. However, avoid the temptation
to simply latch upper address lines. Though it's easy to store A16, A17 et al in an output port, every time the latch changes the entire
program gets mapped out. Though there are awkward ways to write code to deal with this, add a bit more hardware to ease the software
team's job.
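
A rough model of the simple latch shows the problem. This assumes a hypothetical two-bit output port driving A16 and A17 directly; the function name is illustrative.

```c
#include <stdint.h>

/* Physical address when two latched port bits drive A16 and A17
 * directly, with no gating by the CPU's own address lines. */
uint32_t latched_physical(uint16_t logical, uint8_t latch)
{
    return ((uint32_t)(latch & 0x03u) << 16) | logical;
}
```

A stack at logical FFF0 sits at physical 0FFF0 with the latch at 0 and at 1FFF0 with the latch at 1. Every logical address moves the same way, including the one holding the instruction that just wrote to the latch.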

Design a circuit that maps just portions of the logical space in and out. Look at software requirements first to see what hardware
configuration makes sense.

Every program needs access to a data area which holds the stack and miscellaneous variables. The stack, for sure, must always be visible
to the processor so calls and returns function. Some amount of "common" program storage should always be mapped in. The remapping code,
at least, should be stored here so that it doesn't disappear during a bank switch. Design the hardware so these regions are always
available.

Is the address space limitation due to an excess of code or of data? Perhaps the code is tiny, but a gigantic array requires tons of RAM.
Clearly, you'll be mapping RAM in and out, leaving one area of ROM - enough to store the entire program - always in view. An obese program
yields just the opposite design. In either of these cases a logical address space split into three sections makes the most sense: common
code (always visible, containing runtime routines called by a compiler and the mapping code), mapped code or data, and common RAM (stack
and other critical variables needed all the time).

For example, perhaps 0000 to 3FFF is common code. 4000 to 7FFF might be banked code; depending on the setting of a port it could map to
almost any physical address. 8000 to FFFF is then common RAM.
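
The example map can be written down as a small classifier. This is a sketch of that one layout only; the enum and function names are assumptions, not part of any real toolchain.

```c
#include <stdint.h>

typedef enum {
    REGION_COMMON_CODE,  /* 0000 to 3FFF, always visible */
    REGION_BANKED,       /* 4000 to 7FFF, selected by the mapper port */
    REGION_COMMON_RAM    /* 8000 to FFFF, always visible */
} region;

/* Classify a logical address according to the example memory map. */
region classify_logical(uint16_t logical)
{
    if (logical < 0x4000u)
        return REGION_COMMON_CODE;
    if (logical < 0x8000u)
        return REGION_BANKED;
    return REGION_COMMON_RAM;
}
```

A check like this is handy in a link map audit: anything that must survive a bank switch should never classify as `REGION_BANKED`.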

Sure, you can use heroic programming to simplify the hardware. I think it's a mistake, as the incremental parts cost is minuscule compared
to the increased bug rate implicit in any complicated bit of code. It is possible - and reasonable - to remove one bank by copying the
common code to RAM and executing it there, using one bank for both common code and data.

It's easy to implement a three-bank design. Suppose addresses are arranged as in the previous example. A0 to A14 go to the RAM, which is
selected when A15 = 1.

Turn ROM on when A15 is low. Run A0 to A14 into the ROM. Assuming we're mapping a 128k x 8 ROM into the 32k logical space, generate a fake
A15 and A16 (simple bits latched into an output port) that go to the ROM's A15 and A16 inputs. However, feed these through AND gates.
Enable the gates only when A15=0 (RAM off) and A14=1 (bank area enabled).

RAM is, of course, selected with logical addresses between 8000 and FFFF. Any address under 4000 disables the gates and enables the first
4000 (hex) locations in ROM. When A14 is a one, whatever values you've stuck into the fake A15 and A16 select a 16k chunk of ROM (4000 hex bytes long).
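
The decode just described can be modeled in a few lines. Treat this as a sketch of my reading of the circuit, not a verified design: the struct, the function name, and the two-bit `bank_latch` parameter (the fake A15 and A16 held in the output port) are all illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     ram_selected;    /* true: RAM is selected, false: ROM */
    uint32_t device_address;  /* address presented to the selected chip */
} mapped_address;

/* Model of the three-region decode: A15 picks RAM or ROM, and the
 * latched bits reach the ROM only when A15 = 0 and A14 = 1. */
mapped_address map_logical(uint16_t logical, uint8_t bank_latch)
{
    mapped_address m;
    bool a15 = (logical & 0x8000u) != 0;
    bool a14 = (logical & 0x4000u) != 0;

    if (a15) {
        /* 8000 to FFFF: common RAM, addressed by A0 to A14. */
        m.ram_selected = true;
        m.device_address = logical & 0x7FFFu;
        return m;
    }

    /* A15 is known to be 0 here, so the AND gates follow A14. */
    bool gates_enabled = a14;
    uint32_t fake_bits = gates_enabled ? (uint32_t)(bank_latch & 0x03u) : 0u;

    /* ROM sees A0 to A14 from the CPU plus the gated fake A15 and A16. */
    m.ram_selected = false;
    m.device_address = (fake_bits << 15) | (logical & 0x7FFFu);
    return m;
}
```

Read literally, with the CPU's A14 still wired to the ROM, the banked window lands on ROM addresses 04000, 0C000, 14000, or 1C000 depending on the latch. Run your own wiring through a model like this before committing to a ROM layout.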

The virtue of this design is its great simplicity and its conservation of ROM - there are no wasted chunks of memory, a common problem
with other mapping schemes.

Occasionally a designer directly generates chip selects (instead of extra address lines) from the mapping output port. I think this is a
mistake. It complicates the ROM select logic. Worse, sometimes it's awfully hard to make your debugging tools understand the translation
from addresses to symbols. By translating addresses, you can provide your debugger with a logical to physical translation cheat sheet.

The Software

In assembly language you control everything, so handling banked memory is not too difficult. The hardest part of designing remappable code
is figuring out how to segment the banks. Casual calling of other routines is out, as you dare not call something not mapped in.

Some folks write a bank manager that tracks which routines are currently located in the logical space. All calls, then, go through the
bank manager which dynamically brings routines in and out as needed.
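
A bare-bones version of such a manager might look like the following. It is a simplified sketch: `bank_port` is a plain variable standing in for the mapper's output port, and the table entry, `bank_call`, and `demo_routine` are hypothetical names. A real one would live in common memory and would probably have to deal with interrupts and a write-only port.

```c
#include <stdint.h>

/* Stand-in for the mapper's output port (an assumption; on real
 * hardware this would be an I/O write plus a shadow copy in RAM). */
volatile uint8_t bank_port;

typedef int (*banked_fn)(int);

typedef struct {
    uint8_t   bank;   /* bank that holds the routine */
    banked_fn entry;  /* entry point inside the banked window */
} banked_routine;

/* Call a banked routine: map it in, run it, then restore the caller's
 * bank. This function itself must sit in always-mapped memory. */
int bank_call(const banked_routine *routine, int arg)
{
    uint8_t caller_bank = bank_port;   /* remember who called */
    bank_port = routine->bank;         /* map the target in */
    int result = routine->entry(arg);
    bank_port = caller_bank;           /* map the caller back in */
    return result;
}

/* Example routine: adds the bank it ran in to its argument. */
int demo_routine(int arg)
{
    return arg + bank_port;
}
```

Saving and restoring the caller's bank is what lets one banked routine call another through the manager without stranding the caller on return.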

If you were foresighted enough to design your system around a real time operating system (RTOS), then managing the mapper is much simpler.
Assign one task per bank. Modify the context switcher to remap whenever a new task is spawned or reawakened.

Many tasks are quite small - much smaller than the size of the logical banked area. Use memory more efficiently by giving tasks two
banking parameters: the bank number associated with the task, and a starting offset into the bank. If the context switcher both remaps and
then starts the task at the given offset, you'll be able to pack multiple tasks per bank.
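
In code, the two banking parameters could be carried in each task's descriptor. The sketch below assumes the banked window starts at 4000 as in the earlier example; `bank_port`, `task_map`, and `map_task_in` are illustrative names, not part of any particular RTOS.

```c
#include <stdint.h>

#define BANK_WINDOW_BASE 0x4000u  /* start of the banked window (assumed) */

/* Stand-in for the mapper's output port (an assumption). */
volatile uint8_t bank_port;

typedef struct {
    uint8_t  bank;    /* bank number associated with the task */
    uint16_t offset;  /* starting offset of the task within the bank */
} task_map;

/* Called by the context switcher: map the task's bank in and return
 * the logical address at which a freshly started task begins. */
uint16_t map_task_in(const task_map *task)
{
    bank_port = task->bank;
    return (uint16_t)(BANK_WINDOW_BASE + task->offset);
}
```

Two small tasks with descriptors {1, 0000} and {1, 1000} would share bank 1 and start at logical 4000 and 5000 respectively.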

Some C compilers come with built-in banking support. Check with your vendor. Some will completely manage a multiple bank system,
automatically remapping as needed to bring code in and out of the logical address space. Figure on making a few patches to the supplied
remapping code to accommodate your unique hardware design.

In C or assembly, using an RTOS or not, be sure to put all of your interrupt service routines and associated vectors in a common area. Put
the banking code there as well, along with all frequently-used functions (when using a compiler, put the entire runtime package in
unmapped memory).

As always, when designing the hardware, carefully document the approach you've selected. Include this information in the banking routine so
some poor soul several years in the future has a fighting chance to figure out what you've done.