3 Gordon Drive, P.O. Box 1347, Rockland, Maine 04841 U.S.A.
© 2004 Avocet Systems, Inc.
Avocet Systems, Inc.: The Complete Solution for Embedded Systems Development Tools
Hints
Prefetchers
Abstract
Most modern CPUs (like the 80188, 68xxx, PIC and others) have prefetchers on-chip
to increase performance. They can cause no end of debugging trouble, though...
Most 16-bit CPUs are really a combination of two tightly integrated processors.
An Execution Unit (EU) runs the code. A Bus Interface Unit manages the processor's
pins, always trying to fill a small prefetch queue with the next instruction
to execute. Memory is slow; if the instruction is already on-chip, quite a bit
of time will be saved.
The Bus Interface Unit (BIU) sits between the Execution Unit and the device's
pins. If the CPU were a simple Z80, the BIU would just pass memory requests
from the Execution Unit to the outside world. Instead, prefetching CPUs exploit
idle bus times by adding intelligence to the BIU.
Code generally executes from lower addresses to higher ones. Sure, jumps and
calls break the monotonically increasing fetch sequence, but even in a short
loop, more often than not the next instruction byte is located right after the
one just fetched. The 80188's BIU uses this fact to keep the CPU-to-memory interface
busy (i.e., to maximize the bus bandwidth).
The BIU is really rather stupid. It just blindly keeps fetching bytes from ROM,
storing them in a little FIFO between it and the Execution Unit. When the EU
is ready for the next instruction, it might already be on-chip in the FIFO, so
the memory fetch delay is avoided. The FIFO is small, but most of the time the next
byte is there when needed.
Once in a while the CPU will decode and execute some sort of branch operation.
Presumably the BIU will have at least partially filled the FIFO with bytes located
sequentially beyond the jump; bytes that will never be needed. Jumps and calls
flush the FIFO, erasing these unneeded entries. The BIU then starts fetching
from the new execution address. Program transfers therefore essentially stall
the CPU; the processor must wait for the first instruction at the jump destination
before proceeding, just like any simple non-prefetching computer would. Soon,
however, the BIU will again fill the FIFO, keeping a bit ahead of the EU's needs.
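The fill-and-flush behavior described above can be sketched in a few lines of Python. This is only a functional toy, not a cycle-accurate model of any real part: the queue depth of 4, the one-byte instructions, and the names `simulate`, `jumps` and `QUEUE_DEPTH` are all assumptions made for illustration.

```python
from collections import deque

QUEUE_DEPTH = 4  # assumed FIFO size, chosen only for illustration


def simulate(jumps, start, steps):
    """Functional (not cycle-accurate) model of a BIU feeding an EU.

    jumps maps an instruction address to its jump target; every other
    address is treated as a one-byte straight-line instruction.
    """
    fifo = deque()
    fetch_addr = start
    fetched, executed, flushes = [], [], 0
    for _ in range(steps):
        # BIU: blindly top up the queue with sequential addresses.
        while len(fifo) < QUEUE_DEPTH:
            fifo.append(fetch_addr)
            fetched.append(fetch_addr)
            fetch_addr += 1
        # EU: take the next instruction from the queue and run it.
        pc = fifo.popleft()
        executed.append(pc)
        if pc in jumps:
            fifo.clear()            # queued bytes past the jump are useless
            fetch_addr = jumps[pc]  # BIU restarts at the jump destination
            flushes += 1
    return fetched, executed, flushes


# A four-instruction loop: 0x100..0x102 fall through, 0x103 jumps back.
fetched, executed, flushes = simulate({0x103: 0x100}, start=0x100, steps=8)
print(len(fetched), len(executed), flushes)  # 14 8 2
```

In this run the bus sees 14 fetches for only 8 executed instructions; the other 6 are bytes past the jump that were thrown away by the two flushes.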
Some instructions read data from or write data to memory. A load or store operation
causes a momentary disruption in the normal prefetching sequence. Loads and stores
work around the FIFO; they temporarily suspend prefetching, transfer the data,
and then resume without corrupting the FIFO's contents.
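As a rough illustration of how data transfers bypass the queue, here is a small Python sketch. The class `BusInterfaceUnit`, its methods, and the depth of 4 are hypothetical names and values invented for this example; a real BIU also has to arbitrate timing, which is ignored here.

```python
from collections import deque


class BusInterfaceUnit:
    """Toy BIU: a prefetch FIFO plus data paths that go around it."""

    def __init__(self, memory, start, depth=4):
        self.memory = memory
        self.fifo = deque()
        self.next_fetch = start
        self.depth = depth   # assumed queue size
        self.bus_log = []    # every bus cycle, in the order it happened

    def prefetch(self):
        """Fetch one more code byte if the queue has room."""
        if len(self.fifo) < self.depth:
            self.bus_log.append(("fetch", self.next_fetch))
            self.fifo.append(self.memory[self.next_fetch])
            self.next_fetch += 1

    def load(self, addr):
        """Data read: uses the bus but never touches the FIFO."""
        self.bus_log.append(("read", addr))
        return self.memory[addr]

    def store(self, addr, value):
        """Data write: likewise leaves the FIFO alone."""
        self.bus_log.append(("write", addr))
        self.memory[addr] = value


memory = bytearray(range(16))
biu = BusInterfaceUnit(memory, start=0)
biu.prefetch()
biu.prefetch()
queued = list(biu.fifo)               # queue contents before the data cycles
value = biu.load(10)
biu.store(11, 0xAA)
unchanged = list(biu.fifo) == queued  # True: the data cycles went around it
biu.prefetch()                        # prefetching picks up where it left off
```

The bus log interleaves fetch, read and write cycles, which is exactly what an emulator's raw trace captures.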
Prefetcher Perils
Prefetchers cause two sorts of emulation problems. They often hopelessly confuse
the real-time trace data, and they sometimes cause incorrect breakpoint operation.
The CPU's Bus Interface Unit is fairly unintelligent. It constantly issues requests
for the next sequential instruction until a jump or other program transfer invalidates
the prefetched but not yet executed data. The real-time trace in some emulators
cannot deal with these erratic and sometimes incomplete instruction fetches.
While the processor is prefetching an instruction, a jump that is pending in the
internal prefetch queue could stop the fetch even before the entire instruction
is read. If the trace system doesn't model the processor's internal operations,
the displayed data will be meaningless. Softaid spent almost two years developing
an algorithm that models the processor's operation and then correctly displays
the trace data.
If you look at the trace data collected by our emulators you'll see that the
"index", or position of a line in the trace data, is not monotonically
increasing. This is due to the algorithm, which must move data around to properly
disassemble the trace data. If a move-from-memory instruction is found, for
example, the algorithm will look ahead in the trace data to find the bus cycles
representing the data transfers and align them with the instruction that did
the deed, making the programmer's life much simpler.
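To see why the index stops increasing monotonically, consider this minimal Python sketch of the look-ahead idea. It is not Softaid's algorithm; the function `align_data_cycles`, the trace tuples, and the `mem_ops` set (standing in for real opcode decoding) are assumptions made purely for illustration.

```python
def align_data_cycles(trace, mem_ops):
    """Move each data cycle up next to the instruction that caused it.

    trace   : list of (kind, addr) bus cycles, kind is "fetch"/"read"/"write"
    mem_ops : set of instruction addresses assumed to transfer data
    Returns a list of (raw_index, kind, addr) in display order.
    """
    claimed = set()
    display = []
    for i, (kind, addr) in enumerate(trace):
        if i in claimed:
            continue  # already shown beside its instruction
        display.append((i, kind, addr))
        if kind == "fetch" and addr in mem_ops:
            # Look ahead for the first data cycle nobody has claimed yet.
            for j in range(i + 1, len(trace)):
                if trace[j][0] != "fetch" and j not in claimed:
                    claimed.add(j)
                    display.append((j, *trace[j]))
                    break
    return display


raw = [
    ("fetch", 0x100),   # a move-from-memory instruction
    ("fetch", 0x101),   # prefetched while 0x100 is still executing
    ("fetch", 0x102),
    ("read", 0x2000),   # the data cycle that belongs to 0x100
    ("fetch", 0x103),
]
display = align_data_cycles(raw, mem_ops={0x100})
print([index for index, _, _ in display])  # [0, 3, 1, 2, 4]
```

The read at raw index 3 is displayed directly under the instruction at index 0, so the index column reads 0, 3, 1, 2, 4.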
In addition, the algorithm deletes prefetched-but-unexecuted instructions from
the trace display, since these instructions were never executed and would only confuse
the programmer. (Note that in "Raw" display mode the algorithm is
disabled, so you can see every bus transaction just as it took place.)
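A heavily simplified Python sketch of the deletion step follows. It assumes one-byte instructions, knows in advance which addresses hold taken jumps (the hypothetical `taken_jumps` set), and treats any run of sequential fetches after a jump as discarded prefetch. A real algorithm has to decode opcodes and cope with conditional jumps and with targets that happen to follow the discarded run, so take this only as an illustration of the idea.

```python
def drop_unexecuted(raw_fetches, taken_jumps):
    """Filter a raw fetch trace down to the instructions that ran.

    raw_fetches : fetch addresses in bus order (the "Raw" view)
    taken_jumps : set of addresses holding jumps that were taken
    Returns a list of (raw_index, addr) for executed instructions.
    """
    shown = []
    i = 0
    while i < len(raw_fetches):
        addr = raw_fetches[i]
        shown.append((i, addr))
        i += 1
        if addr in taken_jumps:
            # Everything fetched sequentially past the jump was flushed
            # from the queue; skip it until the address sequence breaks.
            expected = addr + 1
            while i < len(raw_fetches) and raw_fetches[i] == expected:
                i += 1
                expected += 1
    return shown


raw = [0x100, 0x101, 0x102, 0x103, 0x104, 0x105, 0x106,
       0x100, 0x101, 0x102, 0x103, 0x104, 0x105, 0x106]
cleaned = drop_unexecuted(raw, taken_jumps={0x103})
print([hex(addr) for _, addr in cleaned])
```

Here the fetches at 0x104 through 0x106 disappear from the cleaned view each time around the loop, leaving only 0x100 through 0x103.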
Since the Bus Interface Unit fetches ahead, regardless of what instruction is
actually being executed, it may fetch an instruction that is never or only rarely
executed, causing problems with breakpoints. Consider the loop:
    label:  <loop code>
            jnz label
            <more code>
If you place a breakpoint on <more code>, every loop iteration might trigger
it, even when the jump is taken. The emulator only sees the fetch and
can't distinguish a fetch that will be executed from one that will not be. Softaid's
breakpoint circuits have a state machine that models bus cycles in real time
to see if the breakpoint really should be taken. The breakpoint will occur only
if <more code> is really executed.
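The difference between a fetch-triggered breakpoint and an execution-qualified one can be shown with a toy Python model of the loop above. The addresses, the queue depth of 4, and the function `run_loop` are assumptions for illustration only, and the model merely counts how often each kind of breakpoint would fire rather than halting; it says nothing about how the actual breakpoint hardware is built.

```python
from collections import deque

QUEUE_DEPTH = 4  # assumed FIFO size, for illustration only


def run_loop(iterations, breakpoint_addr):
    """Run `label: <loop code> / jnz label / <more code>` in a toy CPU.

    Returns (naive_hits, qualified_hits): how often a breakpoint on
    breakpoint_addr would fire if it triggered on every fetch versus
    only when the instruction is actually executed.
    """
    LABEL, JNZ, MORE = 0x100, 0x101, 0x102  # hypothetical addresses
    fifo = deque()
    fetch_addr = LABEL
    counter = iterations
    naive_hits = 0
    qualified_hits = 0
    while True:
        # BIU: prefetch sequentially, regardless of what will execute.
        while len(fifo) < QUEUE_DEPTH:
            if fetch_addr == breakpoint_addr:
                naive_hits += 1      # a fetch-only comparator fires here
            fifo.append(fetch_addr)
            fetch_addr += 1
        # EU: execute the next queued instruction.
        pc = fifo.popleft()
        if pc == breakpoint_addr:
            qualified_hits += 1      # fires only on real execution
        if pc == LABEL:
            counter -= 1             # stands in for <loop code>
        elif pc == JNZ and counter != 0:
            fifo.clear()             # jump taken: flush and refetch
            fetch_addr = LABEL
        elif pc == MORE:
            return naive_hits, qualified_hits


print(run_loop(3, breakpoint_addr=0x102))  # (3, 1)
```

With three loop iterations the fetch-only breakpoint would fire three times, once per pass, while the qualified one fires once, when <more code> finally runs.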