Built-in Debuggers
Copyright 1993, Jack G. Ganssle
Abstract
More and more
processors have built-in debugging resources. Here's a look at what features
they offer.
Published in Embedded Systems Programming, September 1993
Take pity on
the poor embedded programmer. Too many "save money" by relying
on only the crudest of tools, but even those with
the largest budgets and best desires are often forced into the same trap when
using a leading-edge part that beats the tools to market.
Finally our wails
of anguish are being heard by the chip makers. We're starting to see on-board
debugging resources on quite a few new
CPUs. It's hard to dedicate processor pins to debugging, as they contribute
nothing to the end-product. However, development is such a
huge part of the cost of most embedded products that there is often little
choice but to add recurring costs to mitigate NRE.
Internal Registers
Intel addressed
debugging problems early on with the 386 microprocessor. Evidently they recognized
that the speed of the processor was
such that traditional debuggers would be prohibitively expensive. It's hard
to push electrons through a cable at 33 MHz (or at 66 MHz for
the 486... soon to be 99 MHz if IBM's rumored clock tripler part comes out).
The 386/486 has
a very complex addressing mechanism. Logical addresses get transformed to
linear addresses via a wondrously sophisticated
segmentation system. A paging unit then translates linear addresses to physical.
As a result, it's all but impossible to know what the
program is doing simply by looking at the processor's pins with, say, a logic
analyzer. Physical address 10145A0 (on the pins) could
correspond to any of thousands of addresses generated by the program, depending
on the settings of the CS selector, corresponding
descriptor, and paging setup.
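To make the point concrete, here is a minimal Python sketch of the two translation steps. It is a simplification under stated assumptions: a segment is reduced to a base address, the two-level page tables are collapsed into a single dictionary, and the function names and every task setup value are invented for illustration. Only the target address 10145A0 comes from the text above.

```python
# Simplified model of 386 address translation: logical -> linear -> physical.
# Limit checks, privilege checks, and the real two-level page tables are omitted.

PAGE_SIZE = 4096  # the 386 pages memory in 4 KB units


def to_linear(segment_base, offset):
    """Segmentation: add the offset to the base held in the segment descriptor."""
    return (segment_base + offset) & 0xFFFFFFFF


def to_physical(linear, page_map):
    """Paging: replace the linear page number with a physical frame number."""
    page, offset_in_page = divmod(linear, PAGE_SIZE)
    return page_map[page] * PAGE_SIZE + offset_in_page


# Two hypothetical tasks with unrelated segment bases and page mappings.
task_a = {"cs_base": 0x00400000, "page_map": {0x00401: 0x01014}}
task_b = {"cs_base": 0x08000000, "page_map": {0x08123: 0x01014}}

phys_a = to_physical(to_linear(task_a["cs_base"], 0x0015A0), task_a["page_map"])
phys_b = to_physical(to_linear(task_b["cs_base"], 0x1235A0), task_b["page_map"])

# Both print 0x10145a0: the pins alone cannot tell the two fetches apart.
print(hex(phys_a), hex(phys_b))
```

The two tasks use different offsets, different segment bases, and different linear addresses, yet a logic analyzer sees the same value on the bus in both cases.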
I have to give
Intel credit. They dedicated a substantial number of transistors on the part
to debugging back when transistors were still
relatively expensive. This foresight has paid off for a generation of developers
(hey - I figure a generation lasts about 5 years in this
business), as nearly all debuggers, from Turbo-Debugger to many of the hardware
tools, make use of these debugging resources to set
breakpoints.
The 386/486 implements
4 hardware breakpoints using six internal registers. Four of the registers
simply hold the break address, which is
a linear address - it is the post-segmented, but pre-page translated address
generated by the program.
One register
controls the mode of each of the four breakpoints. Intel went to extremes
to make these useful debugging resources, so that
each can be an instruction breakpoint or a data break. For example, you could
set one to break on a data write to a specific address,
another to work on instruction fetches only, and a third to break on any data
read or write.
The sixth register
contains status information so the debug exception handler can determine the
source of the breakpoint.
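The control and status registers are easier to picture with the bit fields spelled out. The Python sketch below packs the three example breakpoints into a control word and decodes a status word. The field positions follow Intel's published layout for DR7 and DR6 as I understand it (enable bits in the low byte, a four-bit mode field per breakpoint starting at bit 16, hit flags in the low four bits of DR6); the function names are my own, and a real debugger would load the registers with privileged MOV instructions rather than compute them in Python.

```python
# Sketch of packing the 386 debug control register (DR7) and decoding the
# debug status register (DR6). Helper names are illustrative, not Intel's.

# Two-bit R/W field: what kind of access triggers the breakpoint.
BREAK_ON_EXECUTE = 0b00     # instruction fetch
BREAK_ON_WRITE = 0b01       # data write only
BREAK_ON_READ_WRITE = 0b11  # data read or write

# Two-bit LEN field: size in bytes of the watched location.
LEN_CODES = {1: 0b00, 2: 0b01, 4: 0b11}


def dr7_with_breakpoint(dr7, index, kind, length=1):
    """Return dr7 with breakpoint `index` (0-3) configured and locally enabled."""
    if not 0 <= index <= 3:
        raise ValueError("the 386 has only four breakpoint registers")
    if kind == BREAK_ON_EXECUTE and length != 1:
        raise ValueError("instruction breakpoints must use a length of 1")
    shift = 16 + 4 * index               # each breakpoint owns a 4-bit field
    dr7 &= ~(0xF << shift)               # clear any previous mode
    dr7 |= ((LEN_CODES[length] << 2) | kind) << shift
    dr7 |= 1 << (2 * index)              # L<index>, the local enable bit
    return dr7


def breakpoints_hit(dr6):
    """List which of the four breakpoints the status register reports."""
    return [i for i in range(4) if dr6 & (1 << i)]


dr7 = 0
dr7 = dr7_with_breakpoint(dr7, 0, BREAK_ON_WRITE, length=4)       # data write
dr7 = dr7_with_breakpoint(dr7, 1, BREAK_ON_EXECUTE)               # fetch only
dr7 = dr7_with_breakpoint(dr7, 2, BREAK_ON_READ_WRITE, length=2)  # any data access
print(hex(dr7))                 # 0x70d0015
print(breakpoints_hit(0b0101))  # [0, 2]
```

The sketch sets only the local enable bits. The part also provides global enable bits one position higher, which survive a task switch; which to use depends on how the debugger and operating system share the registers.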
Since the breakpoints
are handled as hardware comparators, they will work in code that resides in
ROM or in RAM, an important benefit for
debugging embedded systems.
I have yet to
see if the Pentium includes any sort of enhanced debugging capability beyond
the 386-type debug registers. Presumably its
superscalar architecture will present yet another range of complexity in tracking
down bugs.
Background Mode
Motorola has
been very innovative in their approach to both processor technologies and
on-board debugging tools. The 683xx family is a
series of processors mostly based on the 68020 core. Each part offers a tuned
I/O mix. Ideally, the family will be so large that you'll be
able to buy exactly the processor you need. I suspect the family will become
as pervasive as the 8051 and its 50 or so variants.
Motorola implements
the family as a core CPU and numerous standard I/O modules - timers, DMA,
and the like. Each module is on their CAD
system. It's easy to design a new microprocessor by using the Betty Crocker
method of extracting standard stuff from the library, shaking
it up, and letting the CAD system generate photomasks. They tell me that one
part took but a single day to design... and was correct in
its first silicon release.
Having a huge
family of slightly different parts is both a blessing and a curse. Again,
look at the 8051 family for comparison. Sure, any
part you'll need is probably there, but each time you change CPUs you'll have
to buy, at the very least, a new pod for your ICE. It's hard
to use leading edge components when the tools may lag by months.
The solution
- an on-board debugger that is standard across the entire family (it even
carries into the 68HC16 family). Each processor
dedicates 3 wires to a serial interface for debugging purposes. The entire
port is called the CPU's Background Debug Mode (BDM).
Given some simple
hardware to connect the serial lines to a PC, you can establish a communications
path to the processor that bypasses all
of its normal operation. The CPU will process a wide range of commands sent
over this port, all without altering the processor's status -
the registers, PC, and the like stay intact unless you explicitly issue a
command to modify them.
The command set
resembles that of a ROM monitor. You can read and write memory and registers,
start a program executing, and issue resets.
Normally, the
BDM is disabled. You'd hate to have your embedded system toggle to a debugging
state in the field, when no debugger is
connected! A special reset sequence enables background mode, essentially turning
on the serial port and altering the function of the
Background (BGND) instruction.
Normally, BGND
is an illegal instruction. If BDM is enabled BGND stops execution of the program
and throws the CPU into background mode,
where it services the serial commands. What could be better than this for
a breakpoint? The BDM expects you to substitute a BGND
instruction for the instruction you'd like to breakpoint on. This does imply
that you cannot break on data accesses or instructions in ROM
unless substantial extra hardware is added.
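The substitution trick, and the reason it fails in ROM, can be sketched in a few lines of Python. This is a toy model, not a BDM driver: the Target and BreakpointManager classes, the memory map, and the addresses are all hypothetical. The BGND encoding shown (4AFA hex) is the CPU32 value as I recall it, so check it against Motorola's manual before relying on it.

```python
# Toy model of BGND-substitution breakpoints on a 683xx-style target.
# Class names, memory map, and addresses are invented for illustration.

BGND_OPCODE = 0x4AFA  # assumed CPU32 encoding of BGND; verify in the manual


class Target:
    """Sixteen-bit words at arbitrary addresses, with one read-only region."""

    def __init__(self, rom_range, bdm_enabled):
        self.rom_range = rom_range
        self.bdm_enabled = bdm_enabled
        self.memory = {}

    def load(self, address, word):
        """Place a word anywhere; models burning ROM or downloading to RAM."""
        self.memory[address] = word

    def write_word(self, address, word):
        """A debugger memory write; writes aimed at ROM simply do not stick."""
        if address not in self.rom_range:
            self.memory[address] = word

    def read_word(self, address):
        return self.memory.get(address, 0)

    def execute_at(self, address):
        """Report what happens when the CPU fetches the word at `address`."""
        if self.read_word(address) != BGND_OPCODE:
            return "runs instruction"
        if self.bdm_enabled:
            return "background mode"
        return "illegal instruction exception"


class BreakpointManager:
    """Swaps BGND in for the original opcode and puts it back on request."""

    def __init__(self, target):
        self.target = target
        self.saved = {}

    def set(self, address):
        original = self.target.read_word(address)
        self.target.write_word(address, BGND_OPCODE)
        if self.target.read_word(address) != BGND_OPCODE:
            raise RuntimeError("cannot patch %#06x: not writable" % address)
        self.saved[address] = original

    def clear(self, address):
        self.target.write_word(address, self.saved.pop(address))


target = Target(rom_range=range(0x0000, 0x8000), bdm_enabled=True)
target.load(0x0100, 0x4E75)  # an RTS sitting in ROM
target.load(0x9000, 0x4E71)  # a NOP sitting in RAM

breakpoints = BreakpointManager(target)
breakpoints.set(0x9000)
print(target.execute_at(0x9000))  # background mode
```

Calling breakpoints.set(0x0100) raises an error in this model because the write never lands, which is the software-only view of why ROM breakpoints need the extra hardware mentioned above.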
The CPUs also
have a breakpoint input, which drives the processor into background mode when
BDM is enabled. This is essential for stopping
a runaway program or adding more sophisticated external breakpoint hardware.
Numerous suppliers
make debuggers that connect the CPU's BDM pins to a PC's parallel or serial
ports. A single BDM debugger will work with
any of the Motorola processors with this resource. If you use these processors,
be sure to include the Motorola standard Berg connector in
your hardware, to make the BDM port available to a commercial BDM debugger,
no matter what your plans are for debugging strategies.
Unfortunately,
Motorola defined two different "standard" connections, one
using a 10 pin connector and the other an 8 pin
version, with quite different pinouts. The 10 pin connector offers a bit more
control of the target hardware, so is probably the preferred
connection.
Since most vendors
provide a source debugger with their BDM tools, C and assembly are both viable
prospects for BDM debugging. However,
BDM debuggers make the most sense when total code size is relatively small,
and when real time constraints are minimal. I'd be cautious
about relying on a simple BDM in any interrupt-intensive application, since
more powerful full-scale emulators (or, at the very least,
logic analyzers) are essential for tracking these asynchronous events.
Embedded Systems
Technology is the exception to this rule; they make an optional trace board,
which, while not cheap, does cleverly
communicate to the BDM through an unused register in the processor. As always,
compare prices and features to get the tool that suits your
needs and budget.
SMT
It seems the
embedded world is stampeding to surface-mounted components (SMT). In the good
old days each IC, resistor, and capacitor had
long leads that fit through holes in the circuit board, providing a solid
mechanical connection prior to soldering. SMT parts solder
directly to the face of the board, tenaciously holding on by virtue of the
solder alone. The benefit of this technology is reduced size:
SMT components are tiny (so small they're hard for these caffeine-shaky
hands to manage). In addition, since there are no holes needed
to mount the parts, clever designers can smear both sides of a board with
them, further reducing the size of the system.
Surface mounted
CPUs create all sorts of new challenges for debugging. Most have leads on
all four sides of their small, squarish
packages. Sometimes the "pitch" of these leads (their spacing)
is a paltry 0.020 inch.
Traditional emulation
techniques just don't work well in this environment. You cannot simply unplug
the CPU and cram an emulator's pod in
- the processor is soldered directly to the board. One option is to dedicate
one prototype system to development, and install a special
conversion device in place of the CPU. Emulation Technology, EDI, and others
make these adapters which solder to the processor's footprint
and provide a socket for an appropriate emulator. Be aware, though, that adapters
cost $500 to $1000, and are wispy, delicate parts that
require a magician's hand to solder in place. Don't try this at home, kids!
If the processor
is soldered in place, why not design an emulator whose pod clips over the
entire chip? That is, use a sort of inverted
female socket on the pod, and snap it onto the CPU, providing an electrical
connection to each of the processor's pins.
Emulators work
by taking massive control of all processor functions. They must be able to
run short segments of emulator code on the
target microprocessor, which means the CPU must be isolated from the target
system by a buffer, so the emulator's code doesn't spuriously
affect target I/O and memory. Since there is no physical way to place a buffer
between the surface-mounted CPU and its target resources,
the emulator must somehow disable the target CPU, replacing all of its functionality
with a processor inside of the emulator itself.
Most of the microprocessor
vendors recognized this, and provide some method of tri-stating the target
chip. The part is driven to an
inactive state, where all of its pins are non-functional. In effect, the processor
on the target system becomes a dead hunk of plastic
that is completely replaced by the emulator's own CPU.
Zilog's Z182,
for example, is a 100 pin quad flat pack (QFP) device based on the Z180 core.
Two pins are dedicated to selecting a debug
mode. Usually your system leaves these pins open and the processor enters
normal operation on power up. If an emulator is connected, it
drives the pins into one of two debugging modes.
Mode 1 forces
almost all of the Z182's pins to a tri-state condition. A Z180 emulator, with
a special adapter, clips over the Z182 in the
target and provides all of the address, data, and other signals to the target.
Only a few lines stay active - those related to peripherals
inside of the Z182 that the Z180 does not have. So, the Z182 stays semi-active:
its core processor is disabled. The internal I/O that is
identical to that on a Z180 is disabled. Just the new Z182 superset I/O is
alive, intercepting I/O commands sent to the processor's pins
via the clip-on plug.
This is a nice
approach, since dozens of vendors sell Z180 tools. Creating a new emulator
for the Z182 would be prohibitively expensive,
as illustrated by the chip's Mode 2, which tri-states everything on the part,
including all of the I/O. No vendor supports debugging in
this mode today.
Intel uses a
similar approach on their 80186EC microprocessor, a surface mounted variant
of the popular 186 family. It is also a 100 pin
QFP device. Instead of dedicating pins to debugging, Intel elected to share
an address line (A19) with the emulation mode selection.
Grounding A19 during reset drives the part into a tri-state condition Intel
calls ONCE mode (apparently pronounced "AHNCE").
Though the 186EC
is a lot like other members of the 186 family, it is sufficiently different
that you cannot make an adapter to convert,
say, a 186 pod to the 186EC. A new pod is needed (at the very least). Thus,
unlike the Z182, going to ONCE mode tri-states everything - the part is
just an expensive piece of plastic during debugging, with the emulator's CPU
assuming all processor and I/O functions.
The Future
The driving force
behind electronics is an implicit guarantee that the cost of silicon always
follows a downward spiral. Transistors are
cheap; so cheap, it seems chip vendors have a hard time deciding what to do
with them. It's clear that a percentage of the transistor
budget on many new microprocessors will be dedicated to on-chip debugging
resources, to make the parts truly usable by developers.
One technology
that has been lurking for a number of years is boundary scan. Boundary scan
is an IEEE standard (IEEE 1149.1) that defines
a way to design chips for in-circuit testability. Its thrust is towards the
production test and repair end of the business.
A chip designed
to the IEEE standard will include 4 pins that implement a serial link for
communications to an outside test device.
Typically, a number of chips, all implementing boundary scan, will be daisy
chained together so the tester can send commands to any part
on a circuit board.
ICs with boundary
scan capabilities can sense the signals on each pin, so the tester can completely
probe the board purely by sending
serial commands between chips.
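A rough Python model shows how a daisy chain lets one serial stream report every pin on the board. It is a sketch under heavy assumptions: the standard's four pins are TCK, TMS, TDI, and TDO, but the TAP state machine that TMS steers is left out entirely, only the capture and shift steps are modeled, and the device names and pin lists are made up.

```python
# Minimal model of a boundary scan chain: capture pin levels, then shift them
# out serially. The TAP controller and instruction register are omitted.


class ScanDevice:
    def __init__(self, name, pin_names):
        self.name = name
        self.pin_names = list(pin_names)
        self.pins = {pin: 0 for pin in self.pin_names}  # live pin levels
        self.cells = [0] * len(self.pin_names)          # boundary register

    def capture(self):
        """Latch the current level of every pin into its boundary cell."""
        self.cells = [self.pins[pin] for pin in self.pin_names]

    def shift(self, bit_in):
        """Clock one bit in at TDI and return the bit that leaves at TDO."""
        bit_out = self.cells[-1]
        self.cells = [bit_in] + self.cells[:-1]
        return bit_out


def read_board(chain):
    """Capture every device, shift the whole chain out, and label the bits."""
    for device in chain:
        device.capture()
    total_bits = sum(len(device.cells) for device in chain)
    stream = []
    for _ in range(total_bits):
        bit = 0                        # the tester feeds zeros into the first TDI
        for device in chain:           # each TDO drives the next device's TDI
            bit = device.shift(bit)
        stream.append(bit)             # what the tester sees on the final TDO
    bits = stream[::-1]                # the last cell in the chain emerges first
    levels = {}
    position = 0
    for device in chain:
        for pin in device.pin_names:
            levels[(device.name, pin)] = bits[position]
            position += 1
    return levels


cpu = ScanDevice("cpu", ["A0", "A1", "RD"])
ram = ScanDevice("ram", ["CS", "OE"])
cpu.pins["A0"] = 1
cpu.pins["A1"] = 1
ram.pins["CS"] = 1

levels = read_board([cpu, ram])
print(levels[("cpu", "A1")], levels[("ram", "OE")])  # 1 0
```

The tester never touches a pin directly. It only needs to know the order of the devices and the length of each boundary register to turn the bit stream back into named signals.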
I've heard rumors
that some vendors are exploring expanding the technology to include debugging
assets, somewhat like the breakpoint
registers on the 386. After all, serial pins are already dedicated to test
functions; it makes sense to add debug logic, perhaps
implemented somewhat like Motorola's Background Mode. Then, the production
test logic can do double duty as a software development
platform.
Corelis Inc.
(Cerritos, CA, (310) 926-6727) just announced a boundary scan-based development
tool for the AM29200 and 29030. It's cheap;
it lets you view target resources like memory and I/O, and it supports software
breakpoints. I see this as an interesting alternative to
the extremely high priced development tools used for fast 32 bit CPUs.
Boundary scan
offers promise for the future, but it will never offer a complete solution
to the debugging process. Programmer time is
expensive. Tools that improve productivity are therefore, by definition, cheap.
Some resources, like real time trace and performance
analysis, offer lots of benefits to the developer, but are far too complex
to ever put in the silicon itself. However, built-in debugging
hardware does bring at least a minimal development system to a huge audience,
and simplifies high-powered tools.