Refreshing
Software
Copyright 1992,
Jack G. Ganssle
Abstract
Refresh is yet
one more thing that software can, in some situations, replace.
Published in
Embedded Systems Programming, April 1992
In his wonderful
book Microcosm (Touchstone Books, NY, NY), George Gilder predicts that, with
a few exceptions, the semiconductor industry
will one day concentrate more on the production of modest volume speciality
chips than on huge runs of generic ICs. This trend is already
apparent, especially in the proliferation of I/O controllers. It doesn't matter
if you are using SCSI, Ethernet, SDLC, hard disks, stepper
motors, or any of a hundred other peripherals: at least a half dozen vendors
offer some highly integrated controller for your application.
Yet sometimes
these parts are not really appropriate. A mass-produced, highly cost-sensitive
product like an electronic toy generally can't
tolerate the relatively high price of these chips. The classic ultra-low cost
embedded controller is the electronic greeting card. I'm
sure the designers replaced every last fraction of a yen of hardware with
smart code.
Several months
ago (December 1991) I described how in some applications you can replace a
UART with bit banging firmware. Osterberg
Consulting (San Marcos, CA) sent me a version of an interrupt-driven software
UART for the 8051. Beautifully coded, it uses only about 15%
of the processor's time at 4800 baud with an 11.0592 MHz crystal. It's an
example of replacing expensive hardware with a clever idea and a
lot of high tech elbow grease.
Lots of other
I/O can be handled inside of the processor. Be sure you understand the magnitude
of the software before starting, though. I
aged about a decade doing a software implementation of GP-IB some years back
- it just wasn't worth the grief.
Refresh
A lot of embedded
systems use what is in effect a hardware state machine to continuously write
data to displays or other hardware. For
example, a VGA card constantly copies a stream of bits from video memory to
the CRT. 60 times a second the hardware repaints the screen,
fooling your eye into thinking it sees a stable display.
Where bit rates
are lower and costs are paramount it might make sense to replace the hardware
state machines with firmware. Video,
however, is so fast it is unrealistic to consider using code to refresh the
screen.
Displays
Light Emitting
Diodes (LEDs) are common output devices on inexpensive embedded systems. Both
seven segment displays and ASCII arrays are
used.
Seven segment
displays are, as the name implies, seven "lines" formed
of LEDs, arranged in such a fashion that by judicious line
selection all numbers and some characters can be displayed. They are about
the cheapest way to generate numeric results. ASCII displays,
on the other hand, are composed of lots of little LED bulbs ("a thousand
points of light"), which can show any alphanumeric
value. They are considerably more expensive than the simpler seven segment
displays.
Some of these
come with internal latches and drivers, so they can essentially be just hung
on the computer's data bus. Quite a few have no
internal electronics. The designer must provide both a driver (an amplifier
that converts the computer's logic levels to much higher LED
currents) and an interface to the computer bus.
The interface
is quite a problem. Consider the case where a system includes 8 digits of
seven segment displays. Each one needs a high
power driver chip and a latch to hold the value written to the digit by the
program. This could amount to as many as 16 chips!
A better solution
(one that is used by most cheap systems including digital watches) is to arrange
the displays in a matrix. The seven
segment display is, after all, little more than seven diodes with one end
connected together. 8 wires come from the package: 1 common
connection point (the "digit enable" line), and 7 individual
segment wires.
If we're putting
8 of these displays in a system, tie each of the seven segment leads of each
package together. The result is a new level
of abstraction: a package of 8 displays, with 7 segment enables and 8 digit
enables coming out. If you put power on one of the digit
enables and a seven segment code on the segment bus, then one display, corresponding
to the powered digit line, will light.
Connect the 8
digit enables to 8 high power drivers (one IC), and to an octal latch on the
computer bus (one more chip). Tie the 7 segment
bus lines to another driver and latch (2 more chips). Now we're talking 4
chips instead of 16.
The firmware
turns on any single digit by sending a seven segment code to the segment latch
and a 1-of-8 select to the digit latch. The
computer can obviously turn on any one display at a time, but there is no
provision to turn them all on simultaneously.
The secret lies
in the eye's persistence. The software should turn on one digit for a few
milliseconds, then do the next one, and so on
through the entire array. By repeating this cycle at a high speed the eye
is fooled into thinking all of the digits are on, when really
only one is at any point in time. It's a little like TV, where a complete
picture is formed by a rapidly moving dot.
You can buy controller
chips to handle this display multiplexing, but why bother? Use spare processor
time (if any!) to sequence the
refresh cycle.
Lashed up as
described, the entire array of displays looks like two I/O ports to the code.
The digit select port is always all zeroes with
only a single one set, the position of which selects one of the 8 displays.
The segment port is just the seven segment code required by
the currently-selected display.
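The digit select port's "single one" pattern can be sketched in C. This is only an illustration: the helper names and the assumption that bit n drives display n (active high) are mine, not part of the hardware described above.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_DIGITS 8u

/* Hypothetical helper: the byte to write to the digit select port so that
   only display `digit` (0..7) is powered. Assumes bit n enables display n. */
uint8_t digit_select_byte(unsigned digit)
{
    assert(digit < NUM_DIGITS);
    return (uint8_t)(1u << digit);
}

/* Hypothetical sanity check: true when exactly one bit is set, that is,
   when the byte enables one display and no more. */
int is_single_digit_select(uint8_t select)
{
    return select != 0 && (select & (uint8_t)(select - 1u)) == 0;
}
```

For example, digit_select_byte(7) gives 0x80, while a byte such as 0x18 would power two displays with the same segment code and fails the check.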
Use a timer to
generate a sequence of interrupts. What? You don't have a timer? You can sometimes
create a "fake" interrupt by
doing calls in the code's main-line, but it can be tough to ensure calls come
often enough in all operating modes to keep the displays
flashing fast enough.
In general I'd
take a timer over any other peripheral. With a timer you can do wondrous things:
generate accurate bit patterns, run a
preemptive real time operating system, and the like. A timer can help make
up for a lot of deficiencies in the hardware, but it's awfully
hard to make the software run well in the absence of a timer.
As an aside...
no matter how small your embedded system is, seriously consider putting at
least a simple real time operating system in. A
tiny RTOS uses practically no resources (other than a timer interrupt and
a bit of memory). An RTOS is ideal for responding to real time
events. However, far too many embedded systems start off with no RTOS only
to have one shoehorned in, in desperation, late in the
development cycle. It's a lot easier to start off with an RTOS and use only
a little of its power than to rewrite the code to adapt to one
later.
To resume: on
each timer interrupt simply change the digit port to select the next display.
Put the appropriate segment code in the other
port. Then return. The interrupt service routine will be short and fast, demanding
little of the processor. The 12 chips we saved earlier
cost little in CPU overhead.
Of course, use
a sane approach to handling the ports. Rule 1 of interrupt handling is to
keep the service routine short and simple! Too
many applications force the ISR to convert an ASCII or numeric code to the
segment selection values on every interrupt. This is foolish.
Build a little
table with one entry per digit (8 in the case we've been discussing). The
table is global to both the interrupt service
routine and to a driver called every time the firmware wishes to change the
displayed value.
The driver most
likely will accept an 8 digit string of character or integer data from the
calling routine. It converts this to 8 segment
values, one per digit, and places these in the table. It's short and sweet.
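A minimal sketch of such a driver in C might look like the following. The names segment_table and display_write are hypothetical, and the segment encoding (bit 0 = segment a through bit 6 = segment g, 1 = lit) is an assumption; a real board's wiring may differ.

```c
#include <stdint.h>

#define NUM_DIGITS 8

/* Table shared by the driver and the refresh ISR: one segment code per digit. */
volatile uint8_t segment_table[NUM_DIGITS];

/* Assumed wiring: bit 0 = segment a ... bit 6 = segment g, 1 = lit. */
static const uint8_t DIGIT_TO_SEGMENTS[10] = {
    0x3F, 0x06, 0x5B, 0x4F, 0x66, 0x6D, 0x7D, 0x07, 0x7F, 0x6F
};

/* Convert up to NUM_DIGITS characters to segment codes, once, outside the
   ISR. Anything that is not '0'..'9', and any position past the end of the
   string, is left blank. */
void display_write(const char *text)
{
    int ended = 0;
    unsigned i;

    for (i = 0; i < NUM_DIGITS; i++) {
        char c = ended ? ' ' : text[i];
        if (c == '\0') {
            ended = 1;  /* stop reading the string; blank the rest */
        }
        segment_table[i] = (c >= '0' && c <= '9')
                               ? DIGIT_TO_SEGMENTS[c - '0']
                               : 0x00;
    }
}
```

Because each table entry is a single byte, the ISR can never see a half-written entry, although it may briefly show a mix of old and new digits while the driver is partway through an update.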
The ISR looks
like:
Push registers
Put a zero to digit port
Load pointer to table
Load value from table[pointer]
Put value to segment port
Increment pointer (modulo table) and save
Load digit byte
Rotate left (so the single one wraps around) and save it
Put to digit port
Restore registers
Return
On some processors
the ISR will take few more instructions than the 11 steps shown.
Don't forget
step 2. While not strictly needed (depending on the system's speed), if it is left
out the new segment value will appear on the previously selected
digit for a few microseconds, perhaps creating a ghost image.
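The same routine can be sketched in C. Everything here is illustrative: DIGIT_PORT and SEGMENT_PORT are plain variables standing in for the two latches (on real hardware they would be registers at board-specific addresses), and the register save, restore and return are left to the compiler's interrupt mechanism. Unlike the outline above, this version advances the digit select after writing it, so the select byte and the table index always stay paired.

```c
#include <stdint.h>

#define NUM_DIGITS 8

/* Stand-ins for the two output latches (assumed names). */
volatile uint8_t DIGIT_PORT;
volatile uint8_t SEGMENT_PORT;

/* Segment codes, one per digit, filled in elsewhere by the display driver. */
volatile uint8_t segment_table[NUM_DIGITS];

static uint8_t table_index = 0;    /* digit to light on the next interrupt */
static uint8_t select_bits = 0x01; /* one-hot select matching table_index  */

/* Called on every timer interrupt. */
void display_refresh_isr(void)
{
    DIGIT_PORT = 0x00;                         /* blank first: avoids ghosting */
    SEGMENT_PORT = segment_table[table_index]; /* pattern for this digit       */
    DIGIT_PORT = select_bits;                  /* light just this digit        */

    /* Advance to the next digit, wrapping after the last one. */
    table_index = (uint8_t)((table_index + 1u) % NUM_DIGITS);
    select_bits = (uint8_t)((select_bits << 1) | (select_bits >> 7));
}
```

After eight interrupts the index and the select byte are back where they started, so the cycle repeats indefinitely with no further bookkeeping.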
The refresh rate
is a function of the number of displays (more displays need a faster update)
and the persistence of the eye. For 10 or so
digits I find a 1 millisecond update rate more than adequate. An ISR invoked
every 1 msec that takes, say, 15 microseconds to run requires only 1.5% of
the CPU's time.
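The arithmetic is easy to check for other configurations. The sketch below uses hypothetical helper names and assumes one digit is serviced per interrupt; it works in integers, so the load is returned in tenths of a percent.

```c
#include <assert.h>
#include <stdint.h>

/* CPU load of a periodic ISR, in tenths of a percent (15 means 1.5%). */
uint32_t isr_load_tenths_percent(uint32_t isr_us, uint32_t period_us)
{
    assert(period_us > 0);
    return isr_us * 1000u / period_us;
}

/* How many times per second the whole array is repainted when one digit
   is serviced on each interrupt. */
uint32_t array_refresh_hz(uint32_t period_us, uint32_t digits)
{
    assert(period_us > 0 && digits > 0);
    return 1000000u / (period_us * digits);
}
```

With a 15 microsecond ISR every 1000 microseconds the load is 15 tenths of a percent, and a 10 digit array is repainted 100 times a second (125 times for 8 digits).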
Though I've focussed
on LED displays, the same technique works on Liquid Crystal Displays (LCDs).
However, a lot of big LCDs with
multiple ASCII characters include on-board refresh, removing any need for
software support.
DRAM Refresh
Dynamic RAMs
(DRAMs, pronounced "Dee RAM") are the cheapest form of high
speed rewritable data storage. They are composed of a
single transistor per bit. Each "gate", or transistor input,
is insulated from the substrate by a tiny non-conductive deposit.
This forms a capacitor which memorizes the last value written to the transistor.
Obviously, there
are no perfect insulators. The capacitance of the junction is so tiny that
the charge bleeds off within a few
milliseconds. In other words, without help, all of the cells in the DRAM forget
in the blink of an eye.
Like LED displays,
DRAM cells are arranged in an X-Y matrix. A simple read from every row (X
line) once every few milliseconds suffices to
recharge the capacitors and keep the contents of the device intact. This refresh
cycle is crucial to proper operation of any DRAM,
although it adds a layer of complexity to the hardware.
If it seems that
DRAMs are a tenuous affair, remember that there is good reason for the approach.
A DRAM cell needs only a single
transistor, three fewer than the simplest static RAM. As a result, DRAMs always
offer much higher memory density. The technologies always
move more or less in lockstep, with static densities about 4 years behind
that of dynamics.
Conventional
refresh controller ICs include a counter that generates all row addresses
and feeds these to the DRAM chips as required.
Several chips are used, as modern 1 Mb DRAM chips need a 9 bit refresh cycle
(512 row addresses). Most 1 Mb DRAM chips need all 512 row
addresses every 8 milliseconds to guarantee data retention.
A lot of embedded
systems eliminate the need for a distinct DRAM controller by using a DMA channel
to manage the refresh. The original PC
works this way. It's interesting to look through the BIOS listings. RAM is
just not available until the BIOS programs the DMA controller
to start refresh cycles going.
DMA is the perfect
solution to the refresh problem. Generate null DMA reads from sequential addresses.
Program the controller to run over
and over, without computer intervention. This is especially attractive on
modern high integration controllers like the 80186 with built-in
DMA channels.
Still, some systems
might not have a spare DMA channel. It is possible to generate refresh completely
under software control, but pay
careful attention to the firmware's timing. Though I've never built a system
around software refresh, I've seen several successful
implementations.
The trick is
to write a really tight interrupt service routine that does little more than
a read from incrementing addresses - fast.
A timer invokes
the refresh interrupt service routine. The interrupt time is dictated by the
specifications of the DRAM chips. Take the 1
Mb Hitachi HM511000 for example. It requires 512 refresh cycles, all of which
must be completed in 8 milliseconds. This works out to one
complete interrupt service every 15.6 microseconds. While blazingly fast,
it is not (quite) impossible. Be wary of other interrupting
devices that could create untenable latency problems.
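The interrupt interval follows directly from the two data sheet numbers. A small sketch, with hypothetical function names and integer arithmetic, assuming one row is refreshed per interrupt:

```c
#include <assert.h>
#include <stdint.h>

/* Longest allowed time between refresh interrupts, in nanoseconds. */
uint32_t refresh_interval_ns(uint32_t retention_us, uint32_t rows)
{
    assert(rows > 0);
    return retention_us * 1000u / rows;
}

/* The same requirement expressed as interrupts per second. */
uint32_t refresh_interrupts_per_second(uint32_t retention_us, uint32_t rows)
{
    assert(retention_us > 0);
    return rows * 1000000u / retention_us;
}
```

For 512 rows in 8 milliseconds (8000 microseconds) this gives 15625 nanoseconds between interrupts, or 64,000 interrupts per second.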
The ISR must
be highly optimized to present minimal CPU overhead. Typically, it should
contain the following steps:
save processor state
load next refresh address
do a read from that address
increment and store refresh address
restore processor state
return
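In C, the body of such a routine might be sketched as below. This is a simulation-friendly illustration, not production code: the DRAM is represented by an ordinary array, ROW_STRIDE (the address step that moves from one row to the next) is assumed to be 1, and all names are hypothetical. Saving and restoring the processor state is left to the compiler's interrupt support.

```c
#include <stddef.h>
#include <stdint.h>

#define DRAM_ROWS  512u /* rows that must be visited per retention period */
#define ROW_STRIDE 1u   /* assumed address step between rows              */

/* Stand-in for the DRAM array; on real hardware dram_base would point at
   the memory's base address instead. */
static volatile uint8_t simulated_dram[DRAM_ROWS * ROW_STRIDE];
volatile uint8_t *dram_base = simulated_dram;

uint16_t refresh_row = 0; /* next row to refresh */

/* Called on every refresh timer interrupt. */
void dram_refresh_isr(void)
{
    /* The read itself refreshes the row; the value is discarded. The
       volatile qualifier keeps the compiler from removing the access. */
    (void)dram_base[(size_t)refresh_row * ROW_STRIDE];

    /* Increment and store the refresh address, wrapping after the last row. */
    refresh_row = (uint16_t)((refresh_row + 1u) % DRAM_ROWS);
}
```

A compiled routine like this will almost certainly be slower than hand-written assembly, which matters a great deal at these interrupt rates.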
If your entire
application is in assembly language you can greatly shrink the ISR by dedicating
a register to the refresh address. This
removes the load in step 2 and the store in step 4. In Z80 assembly language, the ISR could look like:
isr:
        push af         ; save processor state
        ld   a,(bc)     ; read and refresh
        inc  bc         ; next refresh address
        pop  af         ; restore processor state
        ei              ; re-enable interrupts (the Z80 disables them on entry)
        reti            ; return from interrupt
Register pair
BC is the refresh address. Though the DRAMs really only need a 9 bit counter,
it is much faster to just let BC wrap through
16 bits.
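To see how much of the processor this routine consumes, its T states can be added up by hand. The sketch below does so in C. The per-instruction counts are the commonly published Zilog figures; the table also includes an EI, which a Z80 handler for maskable interrupts needs before RETI, and it assumes 13 T states for interrupt acceptance (interrupt mode 1). The clock rates in the note that follows are assumptions, and all figures should be checked against the data sheet for the actual part.

```c
#include <stddef.h>
#include <stdint.h>

struct z80_op {
    const char *mnemonic;
    unsigned t_states;
};

/* The refresh ISR, one entry per instruction. */
static const struct z80_op REFRESH_ISR[] = {
    { "push af",   11 },
    { "ld a,(bc)",  7 },
    { "inc bc",     6 },
    { "pop af",    10 },
    { "ei",         4 },
    { "reti",      14 },
};

/* Assumed cost of accepting the interrupt (interrupt mode 1). */
#define INT_ACCEPT_T_STATES 13u

/* Total T states per interrupt, including acceptance overhead. */
unsigned isr_t_states(void)
{
    unsigned total = INT_ACCEPT_T_STATES;
    size_t i;

    for (i = 0; i < sizeof REFRESH_ISR / sizeof REFRESH_ISR[0]; i++) {
        total += REFRESH_ISR[i].t_states;
    }
    return total;
}

/* Execution time in nanoseconds for a given clock in kHz. */
uint32_t isr_time_ns(unsigned t_states, uint32_t clock_khz)
{
    return (uint32_t)t_states * 1000000u / clock_khz;
}
```

Under these assumptions the routine costs 65 T states. At 8 MHz that is 8.125 microseconds, about 52% of each 15.6 microsecond interval; at 4 MHz it is 16.25 microseconds, which does not fit at all.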
An assembler
that counts T states is really handy in this sort of application to ease figuring
how long the ISR takes to run. The old SLR
assembler had this feature, but I don't know of a modern product that supports
it.
Conclusion
Don't get me
wrong. I am a firm believer in using complex I/O controllers in most applications.
However, where appropriate, software can
and in some cases should replace the external hardware.
Actually, my
biggest objection to these big I/O chips is the seemingly hundreds of control
registers some of these monsters sport. We
programmers can spend weeks trying to convert a cryptic 50 page data sheet
into working code. Someday, the vendors will recognize that
their job is not to make chips, but to provide value to the customer. Then,
they'll give us useable canned code packages along with the
raw hardware.