The bus timings were designed around the following timing parameters:

Parameter | Value | Description |
---|---|---|
tPD_CPLD | 10ns | asynchronous propagation delay |
tCO_CPLD | 6ns | clock-to-output delay |
tSU_CPLD | 6ns | global clock setup time |
tRAS_DRAM | 60ns | RAS pulse width / access time |
tASR_DRAM | 0ns | row address setup time before RAS |
tRAH_DRAM | 10ns | row address hold time after RAS |
tRCD_DRAM | 20ns | minimum RAS-to-CAS delay |
tASC_DRAM | 0ns | column address setup time before CAS |
tCAH_DRAM | 10ns | column address hold time after CAS |
tCAS_DRAM | 20ns | CAS pulse width / access time |
tRP_DRAM | 40ns | RAS precharge time |
tCP_DRAM | 10ns | CAS precharge time |
tRC_DRAM | 120ns | minimum RAS cycle time |
tACC_ROM | 70ns | ROM access time |
tOE_ROM | 40ns | ROM OE access time |
tPD_573 | 20ns | 74AHCT573 propagation delay after LE or D |
tSU_573 | 5ns | 74AHCT573 setup time before LE |
tH_573 | 2ns | 74AHCT573 hold time after LE |

The timing diagrams below show the relevant signals for several interesting bus cycle cases.
We begin with the timing of the accelerated processor bus, or front-side bus (FSB), and then proceed to the timing of the master port on the Mac SE bus, or I/O bus (IOB).
The timing diagrams are scaled for a 25 MHz FSB clock frequency and the standard 7.8336 MHz Mac SE bus.
For starters, it is instructive to look at a generic MC68000 bus cycle.
There are some details of the MC68000 bus cycle that complicate the synchronization of a state machine to the bus cycle.
Primarily, /AS falls after a rising edge of the clock but rises after a falling edge.
Since the worst-case clock-to-output delay of MC68000 is equal to half of one clock cycle,
attempting to detect bus activity by registering /AS strictly on the rising or falling edge
might result in entrance into a metastable state.
Therefore we introduce the "bus active" signal, BACT. BACT is the logical OR of two terms: /AS asserted as seen asynchronously, and /AS asserted as registered on the previous falling edge of the FSB clock (FCLK).
The key useful feature of the BACT signal is that it is always valid at the rising edge of FCLK.
Given BACT, the FSB controller asserts either /DTACK or /VPA when BACT is true and removes both /DTACK and /VPA when BACT is false.
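
The BACT definition above can be sketched in a few lines of Python. This is only an illustrative model, not the CPLD source: the function names and the per-cycle sample format are assumptions made for the example.

```python
def bact(as_n_now: bool, as_n_registered: bool) -> bool:
    """Return True when the bus is considered active.

    as_n_now:        current level of the active-low /AS pin (False = asserted)
    as_n_registered: /AS as captured on the previous falling edge of FCLK
    """
    return (not as_n_now) or (not as_n_registered)


def bact_at_rising_edges(samples):
    """Walk (as_n_at_falling_edge, as_n_at_rising_edge) pairs, one pair per
    FCLK cycle, and return BACT as seen at each rising edge."""
    result = []
    for at_fall, at_rise in samples:
        registered = at_fall  # models the flip-flop clocked on the falling edge
        result.append(bact(at_rise, registered))
    return result
```

In this model, a cycle in which /AS was still low at the falling edge but has already risen by the rising edge still reports BACT true, which is the property that makes BACT safe to sample at the rising edge of FCLK.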
Because the Warp-SE is a variable-wait-state system, we also must introduce the ready signals which are input to the FSB controller.
In the Warp-SE, three functional units control the data flow on the FSB. These are the DRAM controller, the IOB slave port, and the sound rate limiter.
Therefore there are three ready signals, Ready0, Ready1, and Ready2. The three Ready signals are functionally equivalent and interchangeable.
Each ready signal is produced by one of the three functional units. The Ready signals must be valid at the rising edge of each clock where BACT is asserted.
For the FSB controller to assert /DTACK or /VPA, each of the ready signals must be active on at least one clock of the given BACT cycle.
Because all of the Ready signals are sampled by the FSB controller during each BACT cycle, functional units must gate their Ready outputs with their own select signals.
/DTACK is a registered output that changes strictly following the rising edge of FCLK.
Note that MC68000's "/AS inactive-to-/DTACK inactive" parameter of two clock cycles minus 5 nanoseconds is met here. /DTACK is negated approximately 1.5 clock cycles after /AS rises.
As discussed, the Ready signals do not each need to be active all at once. The FSB controller "remembers" that each Ready signal has been asserted.
This is represented by the Ready signal. Once all three individual Ready signals have been asserted, Ready becomes true and /DTACK or /VPA is asserted.
Ready is cleared along with /DTACK and /VPA once BACT is false.
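
A minimal sketch of this "remembering" behavior follows. The class name and the choice to return the combined Ready on the same clock are assumptions for illustration; the real controller may register the result one clock later.

```python
class ReadyMemory:
    """Remembers which Ready inputs have been seen during the current BACT cycle."""

    def __init__(self, count=3):
        self.seen = [False] * count

    def clock(self, bact, ready_inputs):
        """Model one rising edge of FCLK and return the combined Ready."""
        if not bact:
            # Bus idle: forget everything along with /DTACK and /VPA.
            self.seen = [False] * len(self.seen)
            return False
        # Each Ready input only needs to be active on one clock of the cycle.
        self.seen = [s or r for s, r in zip(self.seen, ready_inputs)]
        return all(self.seen)
```

For example, Ready0 and Ready2 may be active on one clock and Ready1 on the next; the combined Ready becomes true on the second clock.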
When the address is in the range $FFFFXX, the FSB controller asserts /VPA instead of /DTACK.
The "/AS inactive-to-/VPA inactive" parameter of MC68000 is more stringent than the "/AS inactive-to-/DTACK inactive" parameter,
so /VPA is additionally gated by /AS, whereas /DTACK is not.
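
The choice between /VPA and /DTACK can be sketched as a simple address decode. The function name is hypothetical, and the sketch assumes the 24-bit MC68000 address with $FFFFXX meaning that the upper 16 address bits are all ones.

```python
def termination_signal(addr: int) -> str:
    """Return which signal terminates a cycle at the given 24-bit address."""
    addr &= 0xFFFFFF  # MC68000 has a 24-bit address bus
    if (addr & 0xFFFF00) == 0xFFFF00:
        return "VPA"
    return "DTACK"
```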
This diagram introduces the simplest memory access type, a read from or write to ROM. ROM control is completely asynchronous.
The ROM /CS signal is implemented as a decode of the address bus.
Similarly, the /OE signal is an asynchronous function of /LDS, /UDS, and /WE.
The /OE signal is shared by the RAM and ROM, so it is critical that the ROM /CS signal not be tied low;
otherwise bus contention will occur during RAM reads.
The Ready signals are always high during ROM access so all ROM accesses complete with the fastest 4-cycle timing.
This diagram introduces the DRAM access timing.
At 25 MHz for a 4-clock read cycle, there are only 2.5 clock cycles (100 ns) between
the MC68k's assertion of /AS and when it latches data from the bus.
Subtracting the 25ns /AS clock-to-output delay (tCO) and the 5ns data-in setup time (tSU) leaves only 70ns in which to initiate and complete a DRAM access,
not accounting for any RAS delay in the CPLD.
Therefore to minimize RAM access latency, RAS is implemented not as a registered output
but as an asynchronous decode of the address, /AS, and the internal RAS enable signal.
With 10ns delay in the CPLD, 25 MHz operation with 60ns DRAM is just possible.
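
The arithmetic behind this budget can be written out as follows. The numbers come from the text and the parameter table; the function and key names are made up for the example.

```python
T_AS_CO_NS = 25.0     # MC68k /AS clock-to-output delay (from the text)
T_DATA_SU_NS = 5.0    # MC68k data-in setup time (from the text)
T_PD_CPLD_NS = 10.0   # tPD_CPLD from the table
T_RAS_DRAM_NS = 60.0  # tRAS_DRAM from the table


def dram_read_budget(fclk_mhz=25.0):
    """Return the timing budget for a 4-clock DRAM read, in nanoseconds."""
    t_clk = 1000.0 / fclk_mhz
    window = 2.5 * t_clk                          # /AS assertion to data latch
    usable = window - T_AS_CO_NS - T_DATA_SU_NS   # left for the DRAM access
    after_cpld = usable - T_PD_CPLD_NS            # after asynchronous /RAS decode
    return {
        "window_ns": window,
        "usable_ns": usable,
        "after_cpld_ns": after_cpld,
        "margin_ns": after_cpld - T_RAS_DRAM_NS,
    }
```

At 25 MHz the margin comes out to zero, which is why operation with 60ns DRAM is only just possible.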
Similarly, the multiplexed DRAM address bus, RA, is driven by an asynchronous multiplexer controlled by the RASEL signal.
The multiplexer outputs row addresses to the DRAM array when RASEL is low and column addresses when RASEL is high.
The /CAS signal is a function of RASEL. RASEL changes after FCLK rises. If RASEL is high at the next falling edge, /CAS is asserted.
Otherwise if RASEL is low, /CAS is deasserted at the next falling edge.
"RS" is the RAM state. The RS state changes after the rising edge of the clock
and can take on values 0-7.
In RS0, the RAM is considered to be idle.
At the rising edge of the clock in RS0, a RAM cycle begins if /AS is asserted,
a RAM address is present, and a RAM cycle has not already occurred for this /AS cycle.
In this case, we know that /RAS has been active for at least 10 nanoseconds, so RASEL is brought high.
This switches the RA bus from row to column addresses and RS0 transitions to RS5.
At the falling edge in the middle of RS5, /CAS is brought low. RS5 always transitions to RS6.
At the end of RS6, RASEL is brought low again, switching the RA multiplexers back to row addresses
in preparation for the next DRAM access cycle. RS6 always transitions to RS7.
RS7 is the state in which a RAM access or refresh is concluded. At the falling edge in the middle of RS7, /CAS is brought high.
RS7 transitions to RS2 if a refresh request is pending, otherwise RS7 transitions to RS0.
The states RS1 and RS2-RS4 will be discussed in association with the subsequent refresh cycle diagrams.
The RS and RAMCS signals are used to generate the Ready0 ready signal input to the FSB.
Ready0 is high if and only if RS==0 and RAMCS is active.
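
The access path through the RAM states can be traced with a small model. This is a sketch under simplifying assumptions: the refresh states RS2-RS4 are left out, output delays are ignored, and the function names are invented for the example.

```python
def rising_edge(rs, start, refresh_pending):
    """Return (next_rs, next_rasel) for the access path only."""
    if rs == 0:
        return (5, True) if start else (0, False)
    if rs == 5:
        return 6, True
    if rs == 6:
        return 7, False   # RA mux switched back to row addresses
    if rs == 7:
        return (2 if refresh_pending else 0), False
    raise ValueError("refresh states are not modeled in this sketch")


def cas_n_at_falling_edge(rasel):
    """/CAS (active low) is asserted at a falling edge when RASEL is high."""
    return not rasel


def trace_access():
    """Return (rs, rasel, cas_n) after the falling edge in each state."""
    rs, start, trace = 0, True, []
    for _ in range(4):
        rs, rasel = rising_edge(rs, start, refresh_pending=False)
        start = False
        trace.append((rs, rasel, cas_n_at_falling_edge(rasel)))
    return trace
```

The trace shows /CAS going low in the middle of RS5 and high again in the middle of RS7, as described above.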
Also notice how, during write cycles,
it is undefined whether the cycle is conducted as an "early write" or an "OE-controlled write" cycle.
/OE is held high at all times during write cycles,
but /LWE and /UWE are asynchronous functions of MC68k's /LDS and /UDS signals.
It is undefined during a write cycle whether /LWE and /UWE will go low before or after /CAS falls.
Since /OE is held high during write cycles, the order of the /WE signals and /CAS is of no consequence.
This diagram shows the timing for a long-running RAM access,
in which the RAM read or write completes before MC68k removes /AS.
There are cases in which a DRAM access completes in time for termination of a 4-clock bus cycle,
but the bus cycle is lengthened because not all of the Ready signals to the FSB controller have gone high.
If RS0 is returned to after a DRAM access but /AS remains asserted,
then the DRAM controller must not re-enter RS5-RS7 and thus must not initiate any additional /CAS cycles.
Notice how /CAS goes high in the middle of RS7 but /RAS stays low until the end of the /AS cycle.
Using EDO DRAM allows the data bus output to be maintained while /RAS is low.
However, if FPM DRAM is used or if a refresh cycle occurs before /AS rises,
then maintenance of read data on the data bus falls to the bus capacitance and the bus hold resistors.
Therefore it is best not to prolong DRAM read cycles, even when using EDO DRAM, so that there is no possibility of
an intervening DRAM refresh cycle causing the data outputs to tristate.
Fortunately, although DRAM write cycles shadowed to main sound and video memory need to be extended
when the posted write FIFO is full, there is no need to extend DRAM read cycles.
Therefore we do not attempt to fix this problem by extending the /CAS pulse until /AS rises, since the /CAS pulse
could be interrupted by a refresh cycle anyway.
A complete fix would be to extend the /CAS pulse until /AS is high and have the
DRAM controller conform to the DRAM "hidden refresh" protocol, but this is not necessary.
This diagram shows the timing of a refresh occurring after the bus and DRAM have been idle for at least one clock cycle.
RAM states RS2, RS3, RS4, and RS7 are used for refresh.
RS2-RS4 implement the main refresh behavior.
When a refresh request is pending at the rising edge ending RS0 or RS7 while /RAS is inactive,
RASEN is brought low and RS2 is entered.
With RASEN low, /AS activity does not cause a /RAS pulse and the DRAM controller uses the registered /RRAS signal
to initiate refresh cycles.
At the falling edge in the middle of RS2, /CAS is activated. Then at the rising edge concluding RS2, /RAS is activated
and RS2 transitions to RS3.
In RS3, /RAS and /CAS remain active, and RS3 transitions to RS4.
RS3 and RS4 serve to implement the requisite /RAS pulse width for a refresh.
At the falling edge in the middle of RS4, /CAS is deactivated. Then at the rising edge concluding RS4, /RAS is deactivated
and RS4 transitions to RS7.
RREQ is cleared after the first rising edge on which RefRAS is active.
In RS7, /RAS and /CAS remain inactive. RS7 serves to implement the requisite RAS precharge time between DRAM cycles.
RASEN is brought high again after the rising edge concluding RS7 and RS7 transitions to RS0 and the DRAM is considered idle again.
Also notice how RASEN can only be disabled if /RAS is high or if a DRAM cycle is complete; otherwise there may be a tRAS timing violation. This constrains the timing of a refresh.
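
The refresh sequence in RS2-RS4 can be checked with a short timing sketch. It ignores CPLD output delays and assumes ideal edges; the event names and the function are hypothetical.

```python
def refresh_events(t_clk_ns=40.0):
    """Return edge times (ns) for one refresh, measured from entry into RS2."""
    events = {}
    t = 0.0
    for rs in (2, 3, 4):
        mid, end = t + t_clk_ns / 2, t + t_clk_ns
        if rs == 2:
            events["cas_fall"] = mid   # /CAS activated mid-RS2
            events["ras_fall"] = end   # /RAS activated at the end of RS2
        if rs == 4:
            events["cas_rise"] = mid   # /CAS deactivated mid-RS4
            events["ras_rise"] = end   # /RAS deactivated at the end of RS4
        t = end
    return events
```

At 25 MHz this gives a /RAS pulse of two clocks (80ns), which exceeds the 60ns tRAS_DRAM, with /CAS leading /RAS by half a clock (20ns). RS7 then provides one further clock of precharge.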
This diagram shows the timing of a refresh occurring immediately after a RAM access cycle.
Recall that a refresh cannot begin while a DRAM access is ongoing, or else an improperly-short /RAS pulse could occur.
Imagine, however, that MC68k performs many back-to-back DRAM accesses.
In this case, there would never be an RS0 in which a /RAS pulse has not already begun.
Therefore the DRAM controller must be able to begin a refresh during RS7,
immediately after a RAM access is completed but before MC68k brings /AS low again.
The timing for this case starts out slightly differently but ends the same as the refresh during idle.
Therefore the timing is only shown through RS4.
The purpose of this diagram is mainly to demonstrate that adequate /RAS and /CAS precharge time exists
after the previous DRAM access is terminated before /RAS is pulsed for refresh.
This diagram shows the case where a refresh request occurs during a long-running DRAM access and the /AS cycle terminates before the refresh ends.
It is possible for a DRAM access cycle to be extended for a long time, during which the DRAM may be deprived of refresh.
Therefore we must provide for the case where a DRAM access completes and a refresh begins before /AS ever goes high.
In this case, the rising edge of RASEN causes /RAS to go inactive, as opposed to the rising edge of /AS.
Therefore, the /RAS precharge pulse width in this case is much shorter than
a refresh occurring during idle or immediately following a DRAM access.
At 25 MHz, the /RAS precharge width is only 40ns. This is the minimum tRP for 60ns DRAM and is the tightest timing parameter in the Warp-SE.
We could use RS1 to add additional precharge time if necessary.
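
A quick way to see how tight this is, and what an extra state would buy, is sketched below. It assumes that /RAS is high for exactly one FCLK period in this case (consistent with the 40ns figure at 25 MHz) and that each added state contributes one more period; both are assumptions for illustration.

```python
T_RP_DRAM_NS = 40.0  # tRP_DRAM from the table


def ras_precharge_margin_ns(fclk_mhz, extra_states=0):
    """Return precharge time minus tRP, in nanoseconds (negative = violation)."""
    t_clk = 1000.0 / fclk_mhz
    return (1 + extra_states) * t_clk - T_RP_DRAM_NS
```

At 25 MHz the margin is zero; with one extra state it would be 40ns, and at a faster clock the single-period case goes negative.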
This diagram shows the case where a refresh request occurs during a long-running DRAM access and the /AS cycle does not terminate before the refresh ends.
This case is similar to the previous but there is a key difference.
/AS does not rise until after the refresh cycle completes.
Therefore if RASEN were brought high upon exit from RS7 into RS0, there may be an improperly-short /RAS pulse
terminated by the rising edge of the /AS.
Consequently, re-enabling RASEN is held off until the first rising edge at which BACT is low.
This diagram shows the case where a refresh request occurs in the "middle" of a long-running DRAM access.
The remainder of the timing is given by diagrams 9 or 10.
This diagram shows the timing of a refresh starting concurrently with the beginning of a RAM access cycle.
In this case, there is a slight race condition.
RASEN and /AS both fall following the rising edge of FCLK. /AS causes /RAS activation asynchronously,
but RASEN gates this from occurring.
Therefore the internal RASEN feedback in the CPLD must occur sooner than /AS transitions,
otherwise an erroneous /RAS pulse will be generated.
Fortunately the CPLDs intended to be used (ispMACH4000, XC9500XL) are some 10 years newer than MC68HC000,
so their speed advantage mitigates the problem.
The negation of Ready0 causes /DTACK generation and termination of the bus cycle
to be delayed until completion of the refresh.
Before showing the timing for the I/O bus slave port on the FSB, it's instructive to understand the timing of the I/O bus master controller.
This diagram shows the I/O bus VMA and "ETACK" timing.
Although most I/O bus accesses are terminated by /DTACK,
accesses to the VIA and interrupt acknowledge areas of memory are terminated by /VPA.
With MC68k having granted the bus to the accelerator,
it will no longer generate the /VMA chip select signal in response to /VPA.
Therefore, for /VPA-terminated cycles, we must generate /VMA with the correct timing ourselves.
In order to do this, an internal counter, the ES or "E state" is synchronized to MC68k's E clock cycle.
Synchronization of a state machine running from the C16M clock to the E clock cycle
is complicated by clock skew between the C16M, C8M, and E clocks.
The E clock changes following the falling edge of C8M, so E is registered at the falling edge of C8M as Er.
Then Er is registered at the rising edge of C16M as Er2.
Er and Er2 both have adequate setup and hold time to be used at the rising edge of C16M.
Er and Er2 are then used to synchronize the ES counter to the E clock phase.
In ES7, if the IO bus is active, as signified by IOACT, and /VPA has been asserted, the IO bus controller asserts /VMA
in preparation for the E clock high pulse.
Then in ES17, if /VMA is low, i.e. a /VPA cycle is ongoing, ETACK is asserted.
ETACK is analogous to /DTACK and signals the I/O controller to
terminate the /AS cycle in synchronization with the E clock going low.
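
The two decisions made in ES7 and ES17 can be sketched as follows. The sketch assumes that ES counts 20 states per E cycle (the MC68000 E clock period is ten CPU clocks, or twenty C16M clocks) and does not model when /VMA and ETACK are released; the function names are hypothetical.

```python
ES_STATES = 20  # assumption: one E period = 10 C8M clocks = 20 C16M clocks


def es_decisions(es, ioact, vpa_n, vma_n):
    """Return (assert_vma, assert_etack) for the given E state.

    vpa_n and vma_n are active-low levels (False = asserted).
    """
    assert_vma = (es == 7) and ioact and (not vpa_n)
    assert_etack = (es == 17) and (not vma_n)
    return assert_vma, assert_etack


def walk_e_cycle(ioact, vpa_n):
    """Step through one E cycle and list the states where outputs assert."""
    vma_n, events = True, []
    for es in range(ES_STATES):
        do_vma, do_etack = es_decisions(es, ioact, vpa_n, vma_n)
        if do_vma:
            vma_n = False
            events.append((es, "VMA"))
        if do_etack:
            events.append((es, "ETACK"))
    return events
```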
This diagram shows the timing of two I/O bus cycles, first a 4-clock cycle terminated by /DTACK, then a longer cycle terminated by either /DTACK or /VPA.
The I/O bus master controller initiates a cycle when the IOREQ signal originating from the FSB domain (discussed subsequently)
is high and there is no ongoing bus cycle.
The IOS state counter tracks the progress through a M68k bus master transaction.
In IOS0, the bus is considered to be idle. In IOS0 if C8M is low and IOREQ is high,
then IOACT goes high and IOS1 is entered. Entrance into IOS1 is delayed by one clock if C8M is high.
IOS counts from 1-5 and then pauses in IOS5, only transitioning to IOS6 when C8M is high
and one of /DTACK, ETACK, /BERR, or /RESET is active.
For /DTACK, /BERR, and /RESET termination, the termination signals must be low not only at the rising edge concluding IOS5
but also at the previous falling edge and rising edge, otherwise cycle termination is held off.
In order to best match M68k's timing and meet the timing constraints of BBU, /AS is output on the falling edge of C16M.
/AS is active following the falling edge in the middle of IOS1 until the falling edge in the middle of IOS6.
The timing for /LDS and /UDS is a similarly straightforward function of IOS, R/W, and the FSB /LDS and /UDS signals.
As mentioned before, IOS5 is maintained until C8M is high and one of the cycle termination signals is active.
Once this occurs, IOS6 is entered and IOACT goes low.
IOS6 transitions to IOS7 and then around to IOS0, which is maintained until another I/O request comes in.
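
The IOS sequencing described above can be summarized as a transition function evaluated at each rising edge of C16M. This is a sketch only: `terminate` stands for an already-qualified termination condition (the multi-edge qualification of /DTACK, /BERR, and /RESET is not modeled), and holding IOS0 with IOACT high while waiting for C8M low is an inference from the text.

```python
def ios_step(ios, ioact, c8m, ioreq, terminate):
    """Return (next_ios, next_ioact) for one C16M rising edge."""
    if ios == 0:
        if ioreq or ioact:
            # IOACT goes high immediately; IOS1 is entered only when C8M is low.
            return (1, True) if not c8m else (0, True)
        return 0, False
    if 1 <= ios <= 4:
        return ios + 1, ioact
    if ios == 5:
        # Pause here until C8M is high and a termination signal is active.
        return (6, False) if (c8m and terminate) else (5, ioact)
    if ios == 6:
        return 7, ioact
    if ios == 7:
        return 0, ioact
    raise ValueError("IOS is a 3-bit state")
```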
It is the responsibility of the FSB controller to deassert IOREQ after IOACT goes high
in order to prevent the bus transaction from occurring twice.
However, IOREQ can be maintained high through IOACT going high, low, then high again
in order to ensure two back-to-back bus transactions occur.
Notice the ADoutLE and DinLE signals.
ADoutLE is the latch enable for address and write data going from the FSB to the IOB.
DinLE is the latch enable for read data going from the IOB to the FSB.
ADoutLE is high only during IOS0 and is low during IOS1-7.
Therefore address and write data are latched for the entirety of the bus cycle.
ADoutLE0 is additionally gated by the ADLEEN signal from the FSB clock domain.
DinLE is high following the falling edges in the middle of IOS4 and IOS5, thus the input latch captures the read data.
This diagram shows the timing of an I/O bus cycle beginning with C8M high.
This case is basically the same as the start of the previous one, except that IOREQ arrives one C16M clock earlier.
Therefore, although IOACT goes high and ADoutLE goes low immediately following IOREQ detection,
entrance into IOS1 is delayed by one clock.
This diagram shows the synchronization of IOREQ from the FSB clock domain to the I/O clock domain.
Because the C16M clock speed is low and because latency between the FSB and IOB is critical,
a single-stage synchronizer triggered on the C16M falling edge is used.
On XC9500XL and ispMACH4000, the metastability recovery time tMET is only a few nanoseconds
for MTBF in the trillions of years.
With 30ns between the falling and rising edges of C16M, a single-stage synchronizer is adequate.
Given this arrangement, in IOS0, the delay between the FSB asserting IOREQ and the IOB responding with IOACT high
is 1.5 C16M clock cycles plus one tSU and two tCO, or approximately 110 nanoseconds.
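
These figures can be reproduced from the table values. The sketch assumes that C16M runs at twice the 7.8336 MHz C8M frequency and uses tSU_CPLD and tCO_CPLD for the setup and clock-to-output terms; the function names are made up for the example.

```python
C8M_MHZ = 7.8336
T_SU_CPLD_NS = 6.0
T_CO_CPLD_NS = 6.0


def c16m_half_period_ns(c8m_mhz=C8M_MHZ):
    """Time between a falling edge of C16M and the next rising edge."""
    return 1000.0 / (2.0 * c8m_mhz) / 2.0


def ioreq_to_ioact_ns(c8m_mhz=C8M_MHZ):
    """Worst-case IOREQ-to-IOACT delay: 1.5 C16M cycles + tSU + 2 * tCO."""
    t_c16m = 1000.0 / (2.0 * c8m_mhz)
    return 1.5 * t_c16m + T_SU_CPLD_NS + 2.0 * T_CO_CPLD_NS
```

With these inputs the half period is just under 32ns and the total delay is a little under 114ns, in line with the approximate figures quoted above.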
This diagram shows two consecutive posted writes to the I/O bus.
In order to enhance video performance, the ability to "post" up to two consecutive writes to the I/O bus is desirable.
Three such posted writes are shown here.
During IOB space write cycles, the Ready1 signal (input to the FSB controller)
is high when the FSB-to-IOB interface can accept a posted write.
Because three writes were performed consecutively here before the first had the opportunity to complete,
Ready1 goes low because the FSB-to-IOB FIFO is full and completion of the third write is delayed until the FIFO is not full.
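
The relationship between the posted write FIFO and Ready1 can be modeled roughly as below. The depth of two follows from "up to two consecutive writes"; the class and method names are hypothetical, and the real FIFO is built from latches rather than a queue.

```python
from collections import deque


class PostedWriteFifo:
    """Toy model of the FSB-to-IOB posted write FIFO."""

    def __init__(self, depth=2):
        self.depth = depth
        self.entries = deque()

    @property
    def ready1(self):
        """High while another posted write can be accepted."""
        return len(self.entries) < self.depth

    def post(self, addr, data):
        """Try to post a write; return False if the FSB cycle must wait."""
        if not self.ready1:
            return False
        self.entries.append((addr, data))
        return True

    def complete_one(self):
        """The I/O bus master finished the oldest write."""
        return self.entries.popleft() if self.entries else None
```

In this model the third back-to-back write is refused until `complete_one()` frees an entry, mirroring the delayed third write in the diagram.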
This diagram shows two consecutive posted writes to video/sound RAM.
When writing to video/sound RAM, the data written must be written to the I/O bus
as well as shadowed in the accelerator's onboard RAM.
Therefore a DRAM write cycle occurs concurrently with an I/O bus write.
Here we have the case where the I/O bus FIFO starts out empty and then accepts two writes in four clock cycles each.
In this case, the acceptance of the posted write by the I/O bus slave port and the DRAM write occur simultaneously.
Of course, were the posted write FIFO full or the RAM in refresh, either unit could
delay completion of the /AS cycle via their respective Ready signals.
This diagram shows a read from the I/O bus.
From the perspective of the FSB controller, the case where data is read from the I/O bus is fairly simple.
The IOB slave port holds Ready1 low until the I/O bus transaction is completed.
This diagram shows the behavior of the I/O bus slave port controller under a single read/write request.
Here we are just showing the signals relevant to the I/O bus slave port controller rather than all of the M68k FSB signals.
IOSTART is true when the I/O bus space is selected, BACT is low, and the I/O request has not been recognized during the current BACT cycle;
that is, when a request has been submitted but not yet accepted by the I/O bus slave port controller.
If the posted write FIFO is empty then the IOB slave port controller can submit
a new access request to the master controller.
In this case the posted write FIFO is empty and IOA is active,
so the IOB slave controller enters PS2 and asserts IOREQ.
In addition, at this time, IORW0 is latched from the FSB's R/W line.
This tells the IOB master controller whether the current request is a read or a write.
At the end of the first PS2 state, ADLEEN0 is lowered in order to latch the address and write data into the IOB interface latches.
IOLU0[1:0] is also latched from the FSB /LDS and /UDS signals.
Similar to IORW0, IOLU0 encodes which of the two bytes of the data bus are to be accessed by the IOB master controller.
ADLEEN0 merits some additional explanation.
Since the IOB slave controller supports a 4-clock posted write, following the first PS2 state of a posted write,
M68k will remove /AS and terminate the cycle.
Because of synchronization overhead between the FSB and IOB clock domains, the IOB master controller may not latch the address and write data into the latches between the FSB and IOB before the cycle terminates.
Therefore the ALE0 output is additionally gated by the FSB clock domain signal ADLEEN0.
ADLEEN0 stays low until a receipt of the IOB request is confirmed by the IOB master controller.
Following the first PS2 state, the IOB slave controller waits in PS2 until the IOB master controller signals IOACT,
indicating that it has received the IOB request.
Once IOACT is received high then the IOB slave controller removes IOREQ and ADLEEN0 and enters PS1.
In PS1, the IO bus controller waits for IOACT low, indicating that the cycle has completed, and then returns to PS0. Additionally, once IOACT is low, if IORW0 indicates a read was performed, IORDRDY is brought high for one cycle.
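
The PS0, PS2, PS1 handshake can be condensed into a transition function. This is a simplified sketch: ADLEEN0, IORW0, and IOLU0 latching are omitted, and the function signature is invented for the example.

```python
def ps_step(ps, iostart, fifo_empty, ioact, is_read):
    """Return (next_ps, ioreq, iordrdy) for one FCLK rising edge."""
    if ps == 0:
        if iostart and fifo_empty:
            return 2, True, False      # submit the request to the IOB master
        return 0, False, False
    if ps == 2:
        if ioact:
            return 1, False, False     # master has accepted; drop IOREQ
        return 2, True, False          # keep requesting until IOACT is seen
    if ps == 1:
        if not ioact:
            return 0, False, is_read   # cycle done; pulse IORDRDY for reads
        return 1, False, False
    raise ValueError("only PS0, PS1 and PS2 are modeled")
```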
The actual Ready1 output signal is a combination of IORDRDY and IOWRRDY which selects the correct one depending on the address range accessed.
This diagram shows two posted writes. In this case, the posted writes are spaced out such that the FIFO is never fully utilized.
Here we have the case where two posted writes occur close enough in time that the FIFO is fully utilized.
Similar to the previous case but the writes are even closer in time.
Similar to the previous case (again) but here the second write has come in before the IOBS has received indication from the IOBM that the previous write has begun.
Similar to the previous case (again). This is the closest write timing allowed, even faster than MC68k can do.