This is the third post in a series of four on Partial Reconfiguration, or Dynamic Function eXchange (DFX) with Xilinx' Vivado. While the previous two show how to set up the FPGA design, this one discusses how the static logic may cope with the reconfigurable logic vanishing and reappearing.
The process of loading a partial bitstream is similar to hot swapping of physical hardware: A certain part is removed abruptly, replaced by another, and powered on. This post discusses the means necessary to ensure that this transition is made smoothly and reliably.
This post assumes that the Partial Reconfiguration is carried out by static logic on the FPGA itself. To ensure smooth operation, the process should go through these stages, in this order (explanations follow):
- Bring the reconfigurable logic to a safe state, so resetting it doesn't confuse anything affected by its outputs.
- Assert the reset signal(s) for the reconfigurable logic.
- Initiate decoupling.
- Load the partial bitstream into the ICAP.
- Wait for the configuration's STARTUP sequence to finish.
- Disable decoupling.
- Deassert the reset signal(s) for the reconfigurable logic.
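The stages above can be sketched as a small state machine in the static logic. This is only an illustration under stated assumptions, not a reference implementation: the module and all of its port names (rm_quiesce, rm_idle, bitstream_done, startup_done and so on) are hypothetical, and it assumes that bitstream_done is a single-cycle pulse issued by whatever logic feeds the ICAP, and that startup_done is a signal synchronous to clk that is high only once the STARTUP sequence has finished.

```verilog
// Hypothetical sequencer for the stages listed above. All names are made up.
module dfx_sequencer (
  input  wire clk,
  input  wire start,           // pulse: request a reconfiguration
  input  wire rm_idle,         // reconfigurable logic reports a safe state
  input  wire bitstream_done,  // pulse: last word was written to the ICAP
  input  wire startup_done,    // high when the STARTUP sequence has finished
  output reg  rm_quiesce = 1'b0, // asks the reconfigurable logic to wind down
  output reg  rm_reset   = 1'b0, // reset for the reconfigurable logic
  output reg  decouple   = 1'b0, // high while outputs must be ignored
  output reg  load       = 1'b0, // enables the logic that feeds the ICAP
  output wire busy
);
  localparam [2:0] IDLE         = 3'd0,
                   QUIESCE      = 3'd1,
                   DECOUPLE     = 3'd2,
                   LOAD         = 3'd3,
                   WAIT_STARTUP = 3'd4,
                   RELEASE      = 3'd5;

  reg [2:0] state = IDLE;
  assign busy = (state != IDLE);

  always @(posedge clk)
    case (state)
      IDLE:         // Stage 1: ask for a safe state
        if (start) begin rm_quiesce <= 1'b1; state <= QUIESCE; end
      QUIESCE:      // Stage 2: assert reset once the safe state is reached
        if (rm_idle) begin rm_reset <= 1'b1; state <= DECOUPLE; end
      DECOUPLE:     // Stage 3: decouple, then start feeding the ICAP
        begin decouple <= 1'b1; load <= 1'b1; state <= LOAD; end
      LOAD:         // Stage 4: wait for the bitstream to be fully loaded
        if (bitstream_done) begin load <= 1'b0; state <= WAIT_STARTUP; end
      WAIT_STARTUP: // Stages 5 and 6: wait for STARTUP, then recouple
        if (startup_done) begin decouple <= 1'b0; state <= RELEASE; end
      RELEASE:      // Stage 7: release the reset one cycle after recoupling
        begin rm_reset <= 1'b0; rm_quiesce <= 1'b0; state <= IDLE; end
      default:
        state <= IDLE;
    endcase
endmodule
```

Recoupling and releasing the reset happen in separate states so that the static logic already sees valid (reset-state) outputs when the reconfigurable logic starts running.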
The need for decoupling
During the process of loading the partial bitstream, the connections between the static and reconfigurable logic dangle. As a result, the reconfigurable module's output ports may generate random patterns, or rest at otherwise illegal combinations. This doesn't necessarily occur on every port all the time, but odds are that some kind of odd behavior will be seen.
As the static logic continues to function regardless of loading the partial bitstream, it's necessary to make sure that the random signals that (may) arrive from the reconfigurable module are ignored, to avoid any adverse effects. Xilinx' user guide, UG909, refers to this as decoupling.
How to implement decoupling depends on the nature of the reconfigurable module's output ports. To get this right, the influence of each of these ports on the static logic should be analyzed, aiming for corrective actions as necessary. This could for example involve multiplexing the signals with neutral values during the reconfiguration, adding clock-enable signals to logic that should not respond, or holding some parts of the static logic in reset.
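As a minimal sketch of two of these techniques, the following module forces a "valid" flag to its neutral value while decoupling is active, and uses the result as a clock enable for a register in the static logic. The module, the decouple signal and the rm_valid / rm_data ports are hypothetical names chosen for illustration; the right neutral value depends on the actual interface.

```verilog
// Hypothetical decoupling of a simple data/valid interface.
module decouple_example (
  input  wire       clk,
  input  wire       decouple,   // high while the partial bitstream is loaded
  input  wire       rm_valid,   // output of the reconfigurable module
  input  wire [7:0] rm_data,    // output of the reconfigurable module
  output wire       valid_safe, // rm_valid, forced low while decoupled
  output reg  [7:0] last_data = 8'd0
);
  // Multiplex with a neutral value: "no valid data" during reconfiguration
  assign valid_safe = decouple ? 1'b0 : rm_valid;

  // Clock enable: this static register ignores rm_data while decoupled,
  // so random values on the data lines never reach it
  always @(posedge clk)
    if (valid_safe)
      last_data <= rm_data;
endmodule
```

Only the qualifier needs gating here, since the data lines have no effect on the static logic unless valid_safe is high.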
If reconfigurable logic is connected directly to I/O pads, it might be necessary to put these in high-Z mode or deassert the I/O logic's clock enable (if that I/O logic involves an output register). Also, if the reconfigurable logic is connected to external components, it may be necessary to bring these components to a consistent state before detaching.
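A minimal sketch of the high-Z approach follows, assuming that the output buffer itself belongs to the static logic and that only the data and output-enable signals come from the reconfigurable module. The module and signal names are hypothetical.

```verilog
// Hypothetical pad driver in the static logic: the pad is released
// (high-Z) for as long as decoupling is active.
module pad_decouple (
  input  wire decouple, // high while the partial bitstream is loaded
  input  wire rm_out,   // value driven by the reconfigurable module
  input  wire rm_oe,    // output enable from the reconfigurable module
  inout  wire pad
);
  // Drive the pad only when the module asks for it and is not decoupled
  assign pad = (rm_oe && !decouple) ? rm_out : 1'bz;
endmodule
```

Whether high-Z is an acceptable level during reconfiguration depends on the external component, for example on whether the board has a pull-up or pull-down resistor on that line.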
A Partial Reconfiguration Decoupler IP is available in Vivado's IP Catalog for decoupling AXI connections.
The reconfigurable module's input ports need no such treatment — the static logic doesn't care that the signals it drives are not consumed.
Regarding Ultrascale devices, decoupling is required before loading "clearing bitstreams", since they effectively shut down the reconfigurable logic.
A few words on the STARTUP sequence
One important part in the bitstream (full as well as partial) is the START configuration command, which kicks off the STARTUP sequence. This sequence involves a couple of mechanisms that bring up the logic in a fairly consistent way.
The first is that the synchronous elements are assigned their default values at the end of the configuration. This is always true for Ultrascale FPGAs and later. On 7-series FPGAs, it's true for a full configuration, and for Partial Reconfiguration if RESET_AFTER_RECONFIG is set.
The second is GWE: Global Write Enable (not to be confused with synchronous elements' enable or write-enable inputs). This signal allows flip-flops and RAMs to change values. It's held low globally during full configuration, and is asserted at some stage during the configuration startup sequence. When loading a partial bitstream, only the reconfigured logic is affected.
Naturally, the assertion of GWE is asynchronous with respect to any user-provided clock, so the timing between this assertion and the first valid clock can't be assured on any synchronous element.
In a Partial Reconfiguration scenario, just like with a full configuration, this means that all synchronous elements will have their wakeup value immediately after the process has finished (assuming Ultrascale and later, or that RESET_AFTER_RECONFIG is set). However, there is a random possibility that some synchronous elements will respond to the first user clock cycle and others won't, depending on when this first clock arrives relative to the assertion of GWE. It's therefore important to properly reset logic that is sensitive to such disparity.
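To illustrate what kind of logic is sensitive to this, consider a one-hot state register inside the reconfigurable module. The sketch below is a made-up example (module and port names are hypothetical). If advance happens to be high on the first clock edge and only some of the flip-flops respond to it, the register may end up with two bits set, for example 4'b0011 instead of 4'b0010. Holding rst high across the assertion of GWE avoids this, because every flip-flop then loads the same value whether or not it responds to that first edge.

```verilog
// Hypothetical one-hot register in the reconfigurable module.
module rm_onehot (
  input  wire       clk,
  input  wire       rst,     // from the static logic, synchronous to clk
  input  wire       advance,
  output reg  [3:0] state = 4'b0001 // wakeup value after configuration
);
  always @(posedge clk)
    if (rst)
      state <= 4'b0001;                 // same value as the wakeup value
    else if (advance)
      state <= {state[2:0], state[3]};  // rotate the hot bit
endmodule
```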
The need to reset reconfigurable logic after loading the partial bitstream is the same as after a full configuration of the FPGA. It's however more intuitive that resetting is required in the latter case, in particular because the reset is often held until some elements have stabilized (MMCMs or PLLs locked, external hardware ready etc.).
To summarize, there is no single answer on whether to reset the reconfigurable logic, and which parts of it. Just like after a full configuration, the synchronous elements are assigned their default values and begin responding to clocks at nearly the same time, though not necessarily on the same clock cycle. This is good enough in some situations, and a user-defined reset is required in others.
Detecting the end of STARTUP
In the stages listed above, the only part that is out of the user logic's control is the STARTUP sequence. It's nevertheless important to know when it's finished.
There's a description of the STARTUP sequence in the Configuration Guide for each FPGA family, but to make a long story short, the time this sequence takes depends very much on the bitstream options. For example, the sequence can be configured to wait for MMCMs to lock or DCIs to match.
The FPGA supplies a signal, End Of Startup (EOS), which goes high (logic '1') at the last stage of the STARTUP sequence, or simply put, when this sequence is finished. Relying on EOS is the formally correct way to tell when to recouple the reconfigurable logic and release its reset.
The EOS signal is available only from inside the logic fabric, by instantiating a STARTUPE2 primitive, possibly as follows:
wire eos;

STARTUPE2 #(.PROG_USR("FALSE")) startup_ins
  (
   .CLK(1'b0),
   .GSR(1'b0),
   .GTS(1'b0),
   .KEYCLEARB(1'b1),
   .PACK(1'b0),
   .USRCCLKO(1'b0),
   .USRCCLKTS(1'b0),
   .USRDONEO(1'b1),
   .USRDONETS(1'b1),
   .CFGCLK(),
   .CFGMCLK(),
   .EOS(eos),
   .PREQ()
   );
So when the bitstream has finished loading into the ICAP, wait for EOS to become high, and then recouple the reconfigurable logic and release its reset.
Ultrascale FPGAs have a STARTUPE3 primitive instead. However, Vivado accepts STARTUPE2 primitives for these devices and translates them correctly into STARTUPE3, so the code snippet above covers all FPGA families.
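Since EOS originates in the configuration logic, it's probably safest to treat it as asynchronous to the user clock before using it in a state machine. The following is a minimal sketch of a two-stage synchronizer for this purpose. The module name and the startup_done output are hypothetical, and whether synchronization is strictly necessary for EOS is an assumption made here, not something stated in the documentation referred to in this post.

```verilog
// Hypothetical synchronizer for the EOS output of STARTUPE2.
module eos_sync (
  input  wire clk,          // user clock of the static logic
  input  wire eos,          // connected to the EOS port of STARTUPE2
  output wire startup_done  // EOS, synchronized to clk
);
  // ASYNC_REG tells Vivado to treat these as synchronizer registers
  (* ASYNC_REG = "TRUE" *) reg [1:0] sync = 2'b00;

  always @(posedge clk)
    sync <= {sync[0], eos};

  assign startup_done = sync[1];
endmodule
```

The synchronizer delays the indication by two clock cycles, which is harmless here: it only postpones recoupling slightly.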
In an anecdotal test with Kintex-7, with default bitstream settings, it took 26 clock cycles (at 100 MHz, hence ~260 ns) from the moment the bitstream's START command arrived at the ICAP until EOS was asserted. Since the bitstream contained additional data, NOPs among others, it's quite possible that EOS was asserted very close to the last word of the bitstream being fed into the ICAP.
Another anecdotal test with a Kintex Ultrascale device yielded completely different results: It randomly took EOS somewhere between 0.8 ms and 4.5 ms to go high after the START command.
Even though it's quite easy to instantiate the STARTUPE2 primitive as shown above, it's also possible to kick off the recoupling sequence a fixed time after the bitstream finishes loading. For example, it's quite unthinkable that the STARTUP sequence would take as long as 100 ms, and yet this is a practically unnoticeable delay for a human.
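A fixed delay of this kind can be sketched as a simple counter. The module below is a hypothetical example: the names, the 100 MHz clock frequency and the assumption that bitstream_done is a single-cycle pulse are all made up for illustration.

```verilog
// Hypothetical fixed-delay substitute for waiting on EOS.
module recouple_timer #(
  parameter integer CLK_HZ   = 100_000_000, // assumed clock frequency
  parameter integer DELAY_MS = 100          // delay before recoupling
) (
  input  wire clk,
  input  wire bitstream_done,      // pulse: last word written to the ICAP
  output reg  startup_done = 1'b0  // goes high DELAY_MS after the pulse
);
  localparam integer TICKS = (CLK_HZ / 1000) * DELAY_MS;

  reg [31:0] count   = 32'd0;
  reg        running = 1'b0;

  always @(posedge clk)
    if (bitstream_done) begin
      // Restart the delay and withdraw the previous indication
      running      <= 1'b1;
      count        <= 32'd0;
      startup_done <= 1'b0;
    end else if (running) begin
      if (count == TICKS - 1) begin
        running      <= 1'b0;
        startup_done <= 1'b1;
      end else begin
        count <= count + 1'b1;
      end
    end
endmodule
```

With the default parameters, TICKS is 10,000,000 clock cycles, which fits easily in the 32-bit counter.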
As for Ultrascale devices, EOS' behavior after loading a clearing bitstream is apparently not documented, but in an anecdotal test it remained low after loading the clearing bitstream, and went high after the partial bitstream that followed.
The third post ends here. The last post goes behind the scenes of how Vivado handles the relationship between the static and reconfigurable logic by virtue of OOCs and DCPs, and how understanding this opens the way to a reliable method for producing partial bitstreams for the Remote Update scenario.