Debugging ISRs - Part 1
Copyright 1996, Jack G. Ganssle
Abstract
This is part 1 of a two part series on debugging interrupt service routines.
Published in Embedded Systems Programming, May, 1996
Few embedded
systems are so simple they can work without at least a few interrupt sources.
Few designers manage to get their product to
market without suffering metaphorical scars from battling interrupt service
routines (ISRs).
There's no science
to debugging these beasts, which are often the most complex part of any real
time system. Few college courses address
ISRs at all, let alone debugging scenarios. Too many of us become experts
at ISRs the same way we picked up the secrets of the birds and
the bees - from quick conversations in the halls and on the streets with our
pals. There's got to be a better way!
Vector Overview
One common complaint against interrupts is that they are difficult to understand.
There is an element of truth to this,
especially for first time users. However, just as we all somehow shattered
our parents' nerves and learned to drive a stick-shift, we can
overcome inexperience to be competent at interrupt-based design.
Fortunately there
are only a few ways that interrupts are commonly handled. By far the most
prevalent is the Vectored scheme. A hardware
device, either external to the chip or an internal I/O port (as on a high
integration CPU like the 188 or 68332) asserts the CPU's
interrupt input.
If interrupts
are enabled (via an instruction like STI or EI), and if that particular interrupt
is not masked off (high integration
processors almost always have some provision to selectively enable interrupts
from each device), then the processor responds to the
interrupt request with some sort of acknowledge cycle.
The requesting
device then supplies a Vector, typically a single byte pointer to a table
maintained in memory. The table contains at the
very least a pointer to the ISR.
The CPU pushes
the program counter so at the conclusion of the interrupt the ISR can return
to where the program was running. Some CPUs
push other data as well, like the flag register. It then uses the vector to
look up the ISR address and branches to the routine.
At first glance
the vectoring seems unnecessarily complicated. Its great advantage is support
for many varied interrupt sources. Each
device inserts a different vector; each vector invokes a different ISR. Your
UART Data_Ready ISR is called independently of the UART
Transmit_Buffer_Full interrupt.
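The lookup-and-branch sequence can be modeled in plain C. This is only a host-side sketch under stated assumptions, not code for any particular CPU: the 256-entry table, the two vector numbers, and the names vector_table, install_vectors and dispatch_interrupt are all invented for illustration. On real hardware the processor performs the lookup itself.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_VECTORS 256u               /* assumed: a one-byte vector */

typedef void (*isr_t)(void);

/* Hypothetical vector numbers for two UART interrupt sources. */
enum { VEC_UART_DATA_READY = 0x20, VEC_UART_TX_BUFFER_FULL = 0x21 };

static isr_t vector_table[NUM_VECTORS];    /* zero-initialized: all NULL */

static volatile unsigned rx_events;
static volatile unsigned tx_events;

static void uart_data_ready_isr(void)     { rx_events++; }
static void uart_tx_buffer_full_isr(void) { tx_events++; }

/* Each source gets its own table slot, so each gets its own ISR. */
static void install_vectors(void)
{
    vector_table[VEC_UART_DATA_READY]     = uart_data_ready_isr;
    vector_table[VEC_UART_TX_BUFFER_FULL] = uart_tx_buffer_full_isr;
}

/* Models what the CPU does after the acknowledge cycle: use the vector
 * supplied by the device to look up the ISR address, then branch to it.
 * Returns 0 if no handler is installed for that vector, 1 otherwise. */
static int dispatch_interrupt(uint8_t vector)
{
    isr_t handler = vector_table[vector];
    if (handler == NULL)
        return 0;
    handler();
    return 1;
}
```

Calling install_vectors() once and then dispatch_interrupt(VEC_UART_DATA_READY) runs only the receive handler; the transmit handler is untouched until its own vector arrives.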
Simple CPUs sometimes
avoid vectoring to directly invoke the ISR. This greatly simplifies the code,
but, unless you add a lot of manual
processing, limits the number of interrupt sources a program can conveniently
handle.
General Design Guidelines
Crummy code is hard to debug. Crummy ISRs are virtually undebuggable.
The software community knows it's just as
easy to write good code as it is to write bad. Give yourself a break and design
hardware and software that eases the debugging process.
Poorly coded
interrupt service routines are the bane of our industry. Most ISRs are hastily
thrown together, tuned at debug time to work,
and tossed in the "oh my god it works" pile and forgotten. A few
simple rules can alleviate many of the common problems.
First, don't
even consider writing a line of code for your new embedded system until you
lay out an interrupt map. List each one, and give
an English description of what the routine should do. Include your estimate
of the interrupt's frequency. Figure out the maximum, worst
case time available to service each. This is your guide: exceed this number,
and the system stands no chance of functioning properly.
Approximate the
complexity of each ISR. Given the interrupt rate, with some idea of how long
it'll take to service each, you can assign
priorities (assuming your hardware includes some sort of interrupt controller).
Some developers assign the highest priority to things that
must get done; remember that in any embedded system every interrupt must be
serviced sooner or later. Give the highest priority to things
that must be done in staggeringly short times to satisfy the hardware or the
system's mission (like, to accept data coming in from a 1
Mb/sec source).
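One way to keep the interrupt map honest is to hold it as a table and let a few lines of code do the arithmetic. The sketch below is an assumption about how such a map might look; the three interrupt sources, every number in the table, and all of the names are invented for illustration. It totals the ISR time consumed per second and picks out the source with the shortest deadline, which is the candidate for highest priority.

```c
#include <stddef.h>
#include <stdint.h>

struct irq_map_entry {
    const char *name;
    const char *purpose;          /* the English description */
    uint32_t    max_rate_hz;      /* estimated worst-case frequency */
    uint32_t    deadline_us;      /* maximum time available to service it */
    uint32_t    est_service_us;   /* estimated ISR execution time */
};

/* All figures below are made up for the example. */
static const struct irq_map_entry irq_map[] = {
    { "TIMER_TICK", "increment the time-of-day counter",  1000u, 1000u,  5u },
    { "UART_RX",    "queue one received byte",           11520u,   86u, 10u },
    { "ADC_DONE",   "store one conversion result",       20000u,   50u, 15u },
};
#define IRQ_MAP_LEN (sizeof irq_map / sizeof irq_map[0])

/* Microseconds of ISR time consumed in each second at worst-case rates. */
static uint64_t isr_load_us_per_second(const struct irq_map_entry *map, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += (uint64_t)map[i].max_rate_hz * map[i].est_service_us;
    return total;
}

/* Index of the entry with the shortest deadline. */
static size_t most_urgent(const struct irq_map_entry *map, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (map[i].deadline_us < map[best].deadline_us)
            best = i;
    return best;
}

/* 1 if every estimated service time fits inside its deadline, else 0. */
static int all_within_deadline(const struct irq_map_entry *map, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (map[i].est_service_us > map[i].deadline_us)
            return 0;
    return 1;
}
```

With these invented numbers the handlers use 420,200 microseconds of every second, about 42% of the CPU, and ADC_DONE has the tightest deadline. The check is deliberately crude: it ignores interrupt latency and the time one ISR spends waiting behind another.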
The cardinal
rule of interrupt handling is to keep the handlers short. A long ISR simply
reduces the odds you'll be able to handle all
time-critical events in a timely fashion. If the interrupt starts something
truly complex, have the ISR spawn off a task that can run
independently. This is an area where an RTOS is a real asset, as task management
requires nothing more than a call from the application
code.
Short, of course,
is measured in time, not in code size. Avoid loops. Avoid long complex instructions
(repeating moves, hideous math and
the like). Think like an optimizing compiler: does this code really need to
be in the ISR? Can you move it out of the ISR into some less
critical section of code?
For example,
if an interrupt source maintains a time-of-day clock, simply accept the interrupt
and increment a counter. Then return. Let
some other chunk of code - perhaps a non-real time task spawned from the ISR
- worry about converting counts to time and day of the week.
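A minimal sketch of that split, assuming one interrupt per second and a counter that starts at zero on an arbitrary "day 0"; the names tick_seconds, clock_tick_isr and counts_to_time are invented for the example.

```c
#include <stdint.h>

static volatile uint32_t tick_seconds;    /* written only by the ISR */

/* The whole ISR: count the interrupt and return. */
static void clock_tick_isr(void)
{
    tick_seconds++;
}

struct time_of_day {
    unsigned day_of_week;   /* 0..6, counted from the day the counter started */
    unsigned hour;
    unsigned minute;
    unsigned second;
};

/* Runs in a background task, never in the ISR. The divisions and
 * modulo operations here are exactly the slow math the handler avoids. */
static struct time_of_day counts_to_time(uint32_t seconds)
{
    struct time_of_day t;
    t.second      = seconds % 60u;
    t.minute      = (seconds / 60u) % 60u;
    t.hour        = (seconds / 3600u) % 24u;
    t.day_of_week = (seconds / 86400u) % 7u;
    return t;
}
```

On a CPU that cannot read a 32-bit value in one instruction, the background task should copy tick_seconds with interrupts briefly disabled so the ISR cannot change it halfway through the read.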
Ditto for command
processing. I see lots of systems where an ISR receives a stream of serial
data, queues it to RAM, and then executes
commands or otherwise processes the data. Bad idea! Simplify the code by having
the ISR simply queue the data. If time is really pressing
(i.e., you need real time response to the data), consider using another task
or ISR, one driven via a timer which interrupts at the rate
you consider "real time", to process the queued data.
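The queue-only ISR might look something like the sketch below. It assumes a single producer (the ISR) and a single consumer (a task or timer-driven routine); the 64-byte queue size and all of the names are choices made for illustration. The function serial_rx_isr takes the received byte as a parameter so the example runs on a host; a real handler would read it from the UART data register.

```c
#include <stdint.h>

#define RX_QUEUE_SIZE 64u                    /* assumed; a power of two */
#define RX_QUEUE_MASK (RX_QUEUE_SIZE - 1u)

static volatile uint8_t rx_queue[RX_QUEUE_SIZE];
static volatile uint8_t rx_head;       /* written only by the ISR */
static volatile uint8_t rx_tail;       /* written only by the consumer */
static volatile uint8_t rx_dropped;    /* bytes lost because the queue was full */

/* The serial ISR: queue the byte and get out. No command parsing here. */
static void serial_rx_isr(uint8_t byte)
{
    uint8_t next = (uint8_t)((rx_head + 1u) & RX_QUEUE_MASK);
    if (next == rx_tail) {             /* full: record the loss and return */
        rx_dropped++;
        return;
    }
    rx_queue[rx_head] = byte;
    rx_head = next;
}

/* Called from the task that processes commands. Stores one byte in *out
 * and returns 1 if data was waiting; returns 0 if the queue is empty. */
static int serial_get_byte(uint8_t *out)
{
    if (rx_tail == rx_head)
        return 0;
    *out = rx_queue[rx_tail];
    rx_tail = (uint8_t)((rx_tail + 1u) & RX_QUEUE_MASK);
    return 1;
}
```

One slot is always left empty to tell "full" from "empty", so this queue holds 63 bytes. Because each index has exactly one writer and fits in a single byte, neither side needs to disable interrupts to use it.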
An old rule of
software design is to use one function (in this case the serial ISR) to do
one thing. A real time analogy is to do things
only when they need to get done, not at some arbitrary rate (like, if you
processed commands in the serial ISR).
Reenable interrupts
as soon as practical in the ISR. Do the hardware-critical and non-reentrant
things up front, then execute the
interrupt enable instruction. Give other ISRs a fighting chance to do their
thing.
Use reentrant
code! Write your ISRs in C if at all possible, and use C's wonderful local
variable scoping. Globals are an abomination in
any programming environment; never more so than in interrupt handlers. Reentrant
C code is orders of magnitude easier to write than
reentrant assembly code.
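To make the reentrancy point concrete, here is a small contrast between a routine that keeps its result in shared static storage and one that works only on storage owned by its caller. Both functions are invented for illustration.

```c
#include <stdint.h>

static const char hex_digits[] = "0123456789ABCDEF";

/* NOT reentrant: every caller shares the same static buffer, so a second
 * call (say, from an ISR that interrupts the first caller) overwrites
 * the result the first caller is still using. */
static char *byte_to_hex_shared(uint8_t value)
{
    static char buf[3];
    buf[0] = hex_digits[value >> 4];
    buf[1] = hex_digits[value & 0x0Fu];
    buf[2] = '\0';
    return buf;
}

/* Reentrant: all working storage belongs to the caller, typically a
 * local array on the caller's own stack. */
static void byte_to_hex(uint8_t value, char out[3])
{
    out[0] = hex_digits[value >> 4];
    out[1] = hex_digits[value & 0x0Fu];
    out[2] = '\0';
}
```

Two back-to-back calls to byte_to_hex_shared return the same pointer, and the first result is gone after the second call. With byte_to_hex each caller passes its own local array and nothing is shared.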
Don't use NMI
for anything other than catastrophic events. Power-fail, system shutdown,
interrupt loss, and the apocalypse are all good
things to monitor with NMI. Timer or UART interrupts are not.
When I see an
embedded system with the timer tied to NMI, I know, for sure, that the developers
found themselves missing interrupts. NMI
may alleviate the symptoms, but only masks deeper problems in the code that
most certainly should be cured.
NMI will break
a reentrant interrupt handler, since most ISRs are non-reentrant during the
first few lines of code where the hardware is
serviced. NMI will thwart your stack management efforts as well.
Fill all of your
unused interrupt vectors with a pointer to a null routine. During debug, always
set a breakpoint on this routine. Any
spurious interrupt, due to hardware problems or misprogrammed peripherals,
will then stop the code cleanly and immediately, giving you a
prayer of finding the problem in minutes instead of weeks.
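Filling the table takes only a loop. The sketch below assumes a 256-entry table of function pointers and a hypothetical timer vector number; the counter inside the null routine is an addition of this example, there so that stray interrupts leave evidence even when no debugger is attached.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_VECTORS 256u     /* assumed table size */
#define VEC_TIMER   0x08u    /* hypothetical vector number */

typedef void (*isr_t)(void);

static isr_t vector_table[NUM_VECTORS];
static volatile uint32_t spurious_count;
static volatile uint32_t timer_ticks;

/* The null routine. During debug, set the breakpoint here. */
static void spurious_isr(void)
{
    spurious_count++;
}

static void timer_isr(void)
{
    timer_ticks++;
}

static void init_vector_table(void)
{
    /* Point every vector at the null routine first... */
    for (size_t i = 0; i < NUM_VECTORS; i++)
        vector_table[i] = spurious_isr;

    /* ...then overwrite only the ones the system really uses. */
    vector_table[VEC_TIMER] = timer_isr;
}
```

Doing the fill first and the real assignments second means a vector that someone forgets to install still lands in spurious_isr instead of at a random address.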
Lousy hardware
design is just as deadly as crummy software. Modern high integration CPUs
like the 68332, 80186 and Z180 all include a
wealth of internal peripherals - serial ports, timers, DMA controllers, etc.
Interrupts from these sources pose no hardware design issues,
since the chip vendors take care of this for you. All of these chips, though,
do permit the use of external interrupt sources. There's
trouble in them thar external interrupts!
The biggest source
of trouble comes from the generation of the INTR signal itself. Don't simply
pulse an interrupt input and assume the
CPU will detect it. Though some chips do permit edge-triggered inputs, the
vast majority of them require you to assert INTR until the
processor acknowledges it. An interrupt ACK pin provides this acknowledgment.
Sometimes it's a signal to drop the vector on the bus;
sometimes it's nothing more than a "hey, I got the interrupt - you can
release INTR now".
As always, be
wary of timing. A slight slip in asserting the vector can make the chip wander
to an erroneous address. If the INTR must be
externally synchronized to the clock, follow the letter of the spec sheet and
do what it requires.
If your system
handles a really fast stream of data consider adding hardware to supplement
the code. We had a design here recently that
accepted data points 20 microseconds apart. Each generated an interrupt, causing
the code to stop what it was doing, vector to the ISR,
push registers like wild, and then reverse the process at the end of the sequence.
If the system was busy servicing another request, it
could miss the interrupt altogether.
Since the data
was bursty we eliminated all of the speed issues by inserting a cheap 8 bit
FIFO. The hardware filled the FIFO without CPU
intervention. It generated an interrupt at the half-full point (modern FIFOs
often have Empty, Half-Full, and Full bits), at which time
the ISR read data from the FIFO until it was sucked dry. During this process
additional data might come along and be written to the FIFO,
but this happened transparently to the code.
If we interrupted
on the FIFO getting the first data point (i.e., going not-empty), little would
have been gained in performance. Using
half-full gave us enough time to finish servicing other activities (during
which time more data could come, but as the FIFO had plenty of
empty space it was not lost), and massively reduced ISR overhead.
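The half-full scheme can be sketched with a software model standing in for the FIFO chip. Everything here is an assumption made so the example runs on a host: the 16-entry depth, the function names, and the way fifo_hw_write reports the half-full flag in place of a real interrupt line. Only fifo_half_full_isr corresponds to code that would run on the target.

```c
#include <stdint.h>

#define FIFO_DEPTH 16u      /* assumed depth of the external FIFO */

/* Software model of the FIFO hardware. */
static uint8_t  fifo_mem[FIFO_DEPTH];
static unsigned fifo_count, fifo_rd, fifo_wr;

static int fifo_empty(void)     { return fifo_count == 0u; }
static int fifo_half_full(void) { return fifo_count >= FIFO_DEPTH / 2u; }

/* Hardware side: a data point arrives with no CPU involvement.
 * Returns the state of the half-full flag, which is what would drive
 * the interrupt input. A write to a full FIFO is simply lost. */
static int fifo_hw_write(uint8_t value)
{
    if (fifo_count < FIFO_DEPTH) {
        fifo_mem[fifo_wr] = value;
        fifo_wr = (fifo_wr + 1u) % FIFO_DEPTH;
        fifo_count++;
    }
    return fifo_half_full();
}

static uint8_t fifo_read(void)
{
    uint8_t value = fifo_mem[fifo_rd];
    fifo_rd = (fifo_rd + 1u) % FIFO_DEPTH;
    fifo_count--;
    return value;
}

/* Software side. */
static uint8_t  samples[256];
static unsigned sample_count;
static unsigned isr_entries;

/* One interrupt services a whole batch: read until the FIFO is dry. */
static void fifo_half_full_isr(void)
{
    isr_entries++;
    while (!fifo_empty() && sample_count < sizeof samples)
        samples[sample_count++] = fifo_read();
}
```

Feeding 24 data points through fifo_hw_write, and calling the ISR whenever the flag comes back set, collects all 24 points in 3 ISR entries. An interrupt on every data point would have cost 24 trips through the vectoring and register-saving overhead.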
A few bucks invested
in a FIFO may allow you to use a much slower, and cheaper, CPU. Total system
cost is the only price issue in embedded
design. If a $5 8 bit chip with a $6 FIFO does the work of a $20 sixteen-bitter
with double the RAM/ROM chips, it's foolish to not add the
extra part.
C or Assembly?
If you've followed my suggestions you have a complete interrupt map with an
estimated maximum execution time for each ISR.
You're ready to start coding... right?
If the routine
will be in assembly language, convert the time to a rough number of instructions.
If an average instruction takes x
microseconds (depending on clock rate, wait states and the like), then it's
easy to get this critical estimate of the code's allowable
complexity.
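The conversion is simple division, sketched here in integer math. The function names, and the idea of deriving the average instruction time from a clock rate and an average clocks-per-instruction figure (wait states included), are assumptions of this example rather than a prescribed method.

```c
#include <stdint.h>

/* Average instruction time in nanoseconds, given the CPU clock and an
 * estimated average number of clocks per instruction. Returns 0 if
 * clock_hz is 0. */
static uint32_t avg_instruction_ns(uint32_t clock_hz, uint32_t clocks_per_instr)
{
    if (clock_hz == 0u)
        return 0u;
    return (uint32_t)(((uint64_t)clocks_per_instr * 1000000000u) / clock_hz);
}

/* Rough number of instructions that fit in an ISR's time budget.
 * Returns 0 if instr_ns is 0. */
static uint32_t instruction_budget(uint32_t budget_us, uint32_t instr_ns)
{
    if (instr_ns == 0u)
        return 0u;
    return (uint32_t)(((uint64_t)budget_us * 1000u) / instr_ns);
}
```

As a worked case with invented numbers: at 8 MHz and an average of 10 clocks per instruction, each instruction takes 1250 ns, so a 50 microsecond budget allows roughly 40 instructions, including the ones that save and restore registers.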
C is more problematic.
In fact, there's no way to scientifically write an interrupt handler in C!
You have no idea how long a line of C
will take. You can't even develop an estimate as each line's time varies wildly.
A string compare may result in a runtime library call
with totally unpredictable results. A FOR loop may require a few simple integer
comparisons or a vast amount of processing overhead.
And so, we write
our C functions in a fuzz of ignorance, having no concept of execution times
until we actually run the code. If it's too
slow, well, just change something and try again!
I'm not recommending against
coding ISRs in C. Rather, this is more a rant against the current state
of compiler technology. Years ago
assemblers often produced t-state counts on the listing files, so you could
easily figure how long a routine ran. Why don't compilers do
the same for us? Though there are lots of variables (that string compare will
take a varying amount of time depending on the data supplied
to it), certainly many C operations will give deterministic results. It's
time to create a feedback loop that tells us the cost, in time
and bytes, for each line of code we write, before burning ROMs and starting
test.
Till compilers
improve, use C if possible, but look at the code generated for a typical routine.
Any call to a runtime routine should be
immediately suspect, as that routine may be slow or non-reentrant, two deadly
sins for ISRs. Look at the processing overhead - how much
pushing and popping takes place? Does the compiler spend a lot of time manipulating
the stack frame? You may find one compiler pitifully
slow at interrupt handling. Either try another, or switch to assembly.
Be especially
wary of using complex data structures in ISRs. Watch what the compiler generates.
You may gain an enormous amount of
performance by sizing an array at an even power of two, wasting some memory,
but avoiding the need for the compiler to generate
complicated and slow indexing code.
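Here is one reading of the power-of-two advice, sketched for a two-dimensional array. The application is assumed to need 24 samples per channel; padding each row to 32 lets the compiler find an element with a shift and an add instead of a multiply. The array, its dimensions, and the helper functions are invented for illustration. The helpers only spell out the arithmetic the compiler would generate for itself.

```c
#include <stddef.h>
#include <stdint.h>

#define CHANNELS      6u
#define SAMPLES_USED  24u                  /* what the application needs */
#define ROW_SHIFT     5u
#define ROW_LEN       (1u << ROW_SHIFT)    /* each row padded to 32 */

static uint8_t history[CHANNELS][ROW_LEN];

/* Offset of history[ch][i] with padded rows: a shift and an add. */
static size_t padded_offset(size_t ch, size_t i)
{
    return (ch << ROW_SHIFT) + i;
}

/* Offset if the rows were exactly 24 long: needs a multiply, which is
 * slow on many small CPUs that lack a hardware multiplier. */
static size_t unpadded_offset(size_t ch, size_t i)
{
    return ch * SAMPLES_USED + i;
}

/* The price of the padding, in bytes. */
static size_t wasted_bytes(void)
{
    return CHANNELS * (ROW_LEN - SAMPLES_USED) * sizeof history[0][0];
}
```

In this example the padding costs 48 bytes. Whether that trade is worth it depends on the CPU and the compiler, so look at the generated code before and after.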
An old software
adage recommends coding for functionality first, and speed second. Since 80%
of the speed problems are usually in 20% of
the code, it makes sense to get the system working and then determine where
the bottlenecks are. Unfortunately, real time systems by their
nature usually don't work at all if things are slow. You've often got to code
for speed up front.
If the interrupts
are coming fast - a term that is purposely vague and qualitative, measured
by experience and gut feel - then I usually
just take the plunge and code the silly thing in assembly. Why cripple the
entire system due to a little bit of interrupt code? If you have
broken the ISRs into small chunks, so the real time part is small, then little
assembly will be needed.
Conclusion
The wide use
of C makes assembly-competent developers a scarce resource. Embedded systems
are the last bastion of assembly, and will
probably always require some amount of it. Become an expert; like learning
Latin, it's a skill that has many unexpected benefits. Only
folks who know assembly really seem to grasp performance tradeoffs.
(OK - call me
an old fart. Flame away!)