What
Goes In Must Come Out
Copyright 1993,
Jack G. Ganssle
Abstract
FIFOs are hardware
analogs of a sort of reverse stack. Here's how they work.
Published in
Embedded Systems Programming, July 1993
It seems no computer
text is complete without a comprehensive (read "yawn") discussion
of simple data structures. Favorite subjects include stacks, queues and dequeues
(sometimes spelled "deque"). Though these are vitally important
subjects, the chasm that separates hardware and software design often makes
us forget that hardware versions of these concepts have been around for years.
Few cigar-chomping,
whisky guzzling, grizzled old TTL veterans may recognize the name "queue",
but FIFOs, functionally the same concept, have been a prime part of data communications
devices for decades. FIFO is yet another example of the wonderful way we techies
torture the English language, converting the adjective phrase "First
In First Out" to the noun FIFO. Drifting around the Embedded Systems
Conference, overhearing scraps of conversation, I could only think of poor
Professor Higgins; our nounized, acronym- ridden speech could make him abandon
his Pygmalion quest in frustration. "70% chance precip in plains of Spain".
LIFOs, Last In
First Out buffers, seem to be rarely used in embedded hardware. Actually,
I can't think of a single example, as the concept seems to violate the nature
of real time systems where you want to respond right now to an event. I'm
sure someone will read this and prove me wrong. Great!
Of course, we
all use LIFOs as software stacks, though without any unique embedded twist.
All modern processors give us some sort of hardware to deal with stacks easily.
Did you know RCA's ancient 1802 did not? The 1802 was the first ultra-low
power micro, which had no inherent subroutine mechanism. The user was forced
to write a bit of nasty code to handle calls and returns. I rather like the
system pioneered by the PDP-11, and later adopted by Motorola, of providing
an auto increment addressing mode so you can use any register as the stack
pointer.
In some cases
this is a significant advantage. A single stack, as provided by the 80x88
and other families, is not always enough. Years ago I wrote a
multitasking compiler targeted at embedded applications (it was a flop); the
doubly recursive expression parser used 3 stacks to track operands and their
modes. The 68000 version was a breeze to code; others were much more trying.
Hardware FIFOs
A surprising
number of embedded systems use FIFOs built of hardware, as opposed to a software
implementation. The most common application seems to be data synchronization.
If the processor cannot handle the incoming data stream, a FIFO may be a logical
addition to queue up the bytes till the CPU is ready for them.
An example is
the Compact Disk player. Musical data must be played at a predefined, exact
rate, yet the error correction algorithm, platter speed variations, etc.,
all conspire against this. The FIFOs keep enough data backlogged to insure
that despite all of these problems, time domain fidelity is maintained.
Specialty chip
vendors, like IDT and Cyprus, make a wide variety of hardware FIFOs in chip
form. Interestingly enough, most are 9 bit wide, reflecting their
wide usage in the data communications industry (a byte plus parity). Hardware
FIFOs are defined by their width (typically 9 bits), depth (how many 9 bit
"words" can be queued - typically 1k to 4k), and speed.
Conceptually,
a FIFO is nothing more than dual port RAM with whose addresses are created
by internal counters. One counter places bytes sequentially in the array,
while another tracks removal. Unlike a conventional RAM, you can read and
write from most FIFOs at the same time.
The devices are
quite simple, consisting of data in and out busses (permitting simultaneous
reads and writes), a write line that clocks data in, a read line to clock
data out, and one or more flag bits indicating FIFO Full, Empty or sometimes
even Half Full.
Each data byte
is strobed into the device with the write line. If the FIFO is 1k deep, than,
assuming no reads take place, after 1024 write cycles the FIFO Full flag is
asserted and each additional write generally deletes the oldest data and clocks
in the new. Thus, just like a software FIFO, it keeps only the most recent
1024 data items.
Any read pulls
the oldest saved data from the chip. If the device was full, a single read
will clear the FIFO Full flag. In most applications data is being
written and read from the device asynchronously, the Full, Empty, and Half-Full
flags may be toggling erratically to reflect the chaotic nature of the
synchronization between input and output devices.
Different devices
behave in startling different ways. I've used the IDT 7100 family extensively,
and have found that once you exceed the chip's depth, it does in fact not
start to save the newest data only. In fact, it stops accepting new data entirely.
A close examination of the data sheet shows (always, closely examine the data
sheets - it's surprising what you'll find) that once the FIFO fills you must
read it before writing to it - the read clears the oldest data out, making
room for the new byte. In effect, they've put part of the burden of making
a real FIFO on the user. It's not too difficult to implement, but requires
close attention to the timing specs.
How do you decide
to use a hardware FIFO, as opposed to a purely software implementation handled
in an interrupt service routine? I generally prefer software solutions, as
the recurring costs (that is, manufacturing costs) are essentially zero, whereas
every piece of hardware detracts from the company's bottom line. I know of
only two situations where a hardware FIFO makes sense in an embedded system.
The first is dealing with high-speed bursts of data. The second is precise
synchronizing of data to time.
Burst Data
High speed data
bursts can exceed any CPU's processing capability. A hardware FIFO is a natural
choice if the data arrives (or leaves) so fast that an interrupt service routine
cannot keep up, or would use too much of the total compute resources available.
But beware! FIFOs
are no magic bullet. They are primarily for dealing with bursts of high speed
data. If the processor cannot keep up with a data stream, than a FIFO will
help only if the line goes idle from time to time. In effect, the FIFO spreads
the data burst out over a longer time frame, giving the CPU a chance to catch
up. Sustained, high speed data, will defeat the FIFO every time.
No doubt there
is some clever way to compute the required depth of a FIFO as a function of
data rate, available processor time to read the data from the FIFO, and size
of the data burst. I'm reminded of a great example from elementary calculus:
the perfect Martini is 1 part Vermouth to 5 parts Gin. If a frat house starts
pouring x gallons of Gin per minute into a bathtub with a 1 inch hole in it,
and 3 minutes later starts adding Vermouth at y gallons per minute, when will
the mixture be perfect?
A thorough analysis
is probably impossible (of the FIFO problem - the Martini is left as a student
exercise), as it will also be a function of the other
interrupts in the system. If the CPU is shut down periodically handling lots
of high priority external devices, be sure that the FIFO is deep enough to
catch
all data arriving during the interrupts-off period.
Certainly, if
the maximum number of bytes per second is greater than the number the processor
can handle, no FIFO will be deep enough to guarantee that every data byte
is read.
I've never resorted
to a sophisticated mathematical treatment of this, as it's awfully hard to
quantify the behavior of the software. Instead, I weigh the
parameters and try to get a gut feel for likely depth of the queued data.
I know - it's crude, but so far has been effective. It's important, though,
to take some measurements after the system is complete to see just how close
your estimates are. Look at the Half-Full flag on an oscilloscope - is it
always
asserted? This would be a sure sign of marginal design. Time Synchronizing
Another important
application is precision time synchronizing of data. Processors cannot measure
the exact time of arrival of data items, as even the fastest interrupt structure
has some latency. In the best of cases, with interrupts always enabled and
the data interrupt given the highest priority, just the uncertainty due to
varying instruction execution times will cause perhaps a microsecond or more
of dither. If the data could arrive during the servicing of another interrupt,
with interrupts off, then this uncertainty could be much, much longer.
By using multiple
FIFOs the processor can use an external time base to get exact arrival times
of the data. Put the data in one FIFO, and the time base in
another. The devices will be organized logically as one very wide FIFO. The
time will be queued exactly with the data.
The reverse is
true as well. In the example of playing a Compact Disk, any time skew in the
data provided to the digital to analog converters will result in signal distortion.
You just cannot predict with any certainty the exact timing of a CPU, which
can resemble the harried executive responding to many different requests,
from interrupts to DMA to refresh generation to irregular instruction timing.
So, the processor can write data to FIFOs, while a precision time base clocks
data out of them. The FIFO then becomes a metronome-like conductor, taking
erratically timed data and playing it back with a regular beat.
Design Considerations
It's easy to
connect a FIFO to a microprocessor, but be careful to properly use the FIFO's
status flags.
Most designs
seem to connect FIFO FULL to an interrupt line. Conceptually simple, this
results in the processor being interrupted only after the entire buffer is
full. If a little extra latency causes a short delay before the CPU reads
the FIFO, then an extra data byte arriving before the FIFO is read will be
lost.
An alternative
is using the inverse of FIFO EMPTY. A single byte arriving will cause the
micro to read the FIFO. This has the advantage of keeping the FIFOs relatively
empty, minimizing the chance of losing data. It also makes the biggest demand
on CPU time, generating interrupts with practically every byte received.
I think it makes
more sense to connect FIFO HALF-FULL, if the signal exists on the FIFOs you've
selected, to the interrupt line. HALF-FULL is a nice
compromise, deferring processor cycles until a reasonable hunk of data is
received, yet leaving free buffer space for more data during the ISR cycles.
Some processors
do amazing things to service an interrupt, stacking addresses and vectoring
indirectly all over memory. The ISR itself no doubt pushes lots of registers,
perhaps also preserving other machine information. If the HALF-FULL line generates
the interrupt, than you have the a priori information that lots of other data
is already queued and waiting for processor time. Save overhead by making
the ISR read the FIFOs until the EMPTY flag is set. You'll have to connect
the EMPTY flag to a parallel port, so the software can read it, but the increase
in performance is well worth it.
In mission-critical
systems it might also make sense to design a simple circuit that latches the
combination of FIFO-FULL and an incoming new data item. This overflow condition
could be disastrous, and should be signaled to the processor.
Noise
I hate to get
back on my noise soapbox, but FIFOs seem to be particularly sensitive to electronic
noise and glitches. Some older FIFOs had no hysteresis on the write line,
so the smallest glitch could cause erratic extra writes. This was particularly
apparent with wide systems, where several FIFOs were ganged in parallel to
achieve an 18 bit or larger word. A little noise could cause the FIFOs to
get out of sequence, with each one at different fill states.
The 7202 1k by
9 bit FIFOs were particularly subject to input noise, though newer designs
have largely cured these problems. In one case, a company I know observed
a 1 nsec glitch that caused strange clocking. 1 nsec - that's so fast that
most scopes will not see it. Even though the problem has been largely eliminated
with new silicon, most vendors recommend the use of high quality multilayer
circuit boards and careful transmission line design practices around the write
line.
Conclusion
It's tempting
to resort to a hardware FIFO as a alternative to careful software design.
Resist this impulse unless you have a compelling reason! It's
surprising just how often designers add hardware to make up for either lazy
or poor software design.
If you elect
to use a software FIFOing scheme, though, model it like a hardware device.
You never know what might get changed later. Access it via device drivers,
perhaps even using a software interrupt or OS call to queue data in and out.
One mantra of the 90's, a fundamental part of object oriented programming,
is surely data abstraction.