01signal.com

The logic for starting off and resetting an FPGA properly

This is the third and last page in a series about resets in FPGAs. It's recommended to read the first two before this one.

Overview

Setting up proper logic for creating resets is time well spent. Even if there's no apparent problem with the design as is, skipping this design phase is likely to bite back: You might spend days attempting to solve some instability problem, not realizing it's because the logic wasn't reset properly, and eventually work around it with some ugly hack. More on this page regarding weird instabilities.

Thinking about the reset signals and designing them from the project's beginning is also important in order to ensure that the project grows properly as new functionalities are added. When an FPGA project is started from scratch, its core functionality is often implemented first, and features are then added as time goes by. As different modules often require separate clocks and resets, it's quite easy to fall into the trap of hacking together something quickly for each part separately, so the project becomes more and more chaotic as it progresses.

Having a central module for resets and clocks, that is properly written from day one, makes it easier to avoid this. And as further explained below, the reset controller and the clock resources (PLLs and clock buffers as necessary) influence each other, so putting them in the same module pays off. I usually name this module clkrst.v.

However it's often not possible to concentrate all of this in one module, as some IP cores, subsystems or design blocks may generate their own set of resets and clocks. When this is the case, it requires careful thinking about what depends on what, and how the overall system should react to reset requests that arrive from different sources. For example, the reset to a PCIe block almost always arrives from the bus itself, and the block produces a reset signal for use by any logic that is connected to it. In situations like this it's necessary to consider, for example, how the system in general should respond, if at all, if a reset arrives from the PCIe bus.

Since each project has its own story, there's no single catch-all solution. This page describes concepts and suggests ideas and code snippets to compose your own reset controller from. That said, these aren't cut-and-paste code snippets for direct use, and should be treated as demonstrations.

For the sake of simplicity, I'll assume that no other block in the system generates clocks or resets. Extending the controller to the general case is however fairly straightforward.

The reset state machine

In most designs, there are two main scenarios that require resetting:

In addition, resetting may be required when a watchdog timer expires and / or some other means for detecting a major system fault goes off.

The expected response to all these scenarios is a fundamental restart, of the sort that ensures that no matter what went wrong, it's rectified. This is often best done with some kind of state machine, that ensures a repeatable and consistent reset sequence, no matter the reason it was triggered.

Having said that, there are scenarios where a local, possibly recurring reset makes sense. For example, logic that processes video image frames can be reset before the beginning of each frame. This typically calls for a lightweight mechanism, which boils down to assigning several registers with initial values when a local synchronous reset is asserted. This is a reset mechanism for all purposes, and is often a clean and simple way to ensure robust operation.
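
As a minimal sketch of such a local reset, assume a hypothetical module that accumulates statistics over each video frame, and that @frame_start is high for one clock cycle before each frame begins. All names here are invented for the illustration:

module frame_stats (
  input             clk,
  input             frame_start,  // Local synchronous reset, once per frame
  input             pixel_valid,
  input [7:0]       pixel,
  output reg [23:0] pixel_sum,
  output reg [15:0] pixel_count
);

  always @(posedge clk)
    if (frame_start)
      begin
        // Only registers that carry state between pixels need a reset
        pixel_sum <= 0;
        pixel_count <= 0;
      end
    else if (pixel_valid)
      begin
        pixel_sum <= pixel_sum + pixel;
        pixel_count <= pixel_count + 1;
      end
endmodule

Whatever went wrong during the previous frame, these registers start each new frame from a known state.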

But as there's not much more to add about this possibility, the rest of this page focuses on the fundamental, FPGA global restart kind of reset.

Counter based state machine

Since a fundamental start or restart is a fairly rare event, it doesn't matter if it takes a few microseconds more than necessary. In many cases even up to 100 ms or so is fine, and this can be taken advantage of. A plain counter is therefore a simple way to implement a sequence of events.

In its simplest form, it boils down to something like:

reg [4:0] reset_count;
reg       rst_src_pll, rst_src_shreg, rst_src_debounce;
reg       clear_counter;
reg       master_reset;

initial reset_count = 0;
initial master_reset = 1;
initial clear_counter = 1;

always @(posedge wakeup_clk)
  begin
    clear_counter <= rst_src_pll || rst_src_shreg || rst_src_debounce;

    master_reset <= (reset_count != 31);

    if (clear_counter)
      reset_count <= 0;
    else if (reset_count != 31)
      reset_count <= reset_count + 1;
  end

@rst_src_pll, @rst_src_shreg and @rst_src_debounce represent different reasons to reset the system, and are set by some other logic. I'll go through a few examples of such, but for now the important point is that they're clocked by @wakeup_clk (hence no clock domain crossing).

@clear_counter is the logic OR of these reasons for reset. This register zeroes @reset_count, which goes from 0 to 31 (in this example) and stops there.

Finally, @master_reset is asserted unless @reset_count has finished counting. This is the simplest form of a reset state machine, and counting to 31 is also pretty modest.

So if any of the @rst_src_N registers are asserted, even for a single clock cycle, the synchronous reset is asserted for 31 clock cycles.
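
One way to check this behavior is a small simulation. The following testbench is only a sketch: It repeats the logic from above with a single reset source (@rst_src, a name made up for this test), asserts it for one clock cycle after the initial reset has ended, and counts the clock cycles on which @master_reset is asserted afterwards. The clock frequency and the length of the wait are arbitrary choices.

`timescale 1ns / 1ps

module reset_count_tb;
  reg       wakeup_clk = 0;
  reg       rst_src = 0;     // Stands in for any of the rst_src_N registers
  reg [4:0] reset_count;
  reg       clear_counter;
  reg       master_reset;
  integer   cycles = 0;

  initial reset_count = 0;
  initial master_reset = 1;
  initial clear_counter = 1;

  always #20 wakeup_clk = !wakeup_clk;  // 25 MHz

  // Same logic as above, with only one reason for reset
  always @(posedge wakeup_clk)
    begin
      clear_counter <= rst_src;

      master_reset <= (reset_count != 31);

      if (clear_counter)
        reset_count <= 0;
      else if (reset_count != 31)
        reset_count <= reset_count + 1;
    end

  // Count the clock edges at which master_reset is sampled high
  always @(posedge wakeup_clk)
    if (master_reset)
      cycles <= cycles + 1;

  initial
    begin
      @(negedge master_reset);  // Wait for the initial reset to end
      @(negedge wakeup_clk);    // Change stimulus away from the rising edge
      cycles = 0;
      rst_src = 1;              // Asserted for exactly one clock cycle
      @(negedge wakeup_clk);
      rst_src = 0;
      repeat (50) @(negedge wakeup_clk);
      $display("master_reset was asserted for %0d clock cycles", cycles);
      $finish;
    end
endmodule

The printed number can be compared with the 31 clock cycles mentioned above.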

There are two advantages with a long reset pulse: First, if the related @rst_src_N goes on and off randomly (wobbling PLL lock detectors, push buttons, software that issues resets multiple times), these wobbles don't propagate to the visible synchronous reset. While such wobbles are usually harmless, they can result in unnecessary toggling of output pins, which can have a negative effect, for example confusing a person testing the electronics into thinking something is wrong.

In that sense, the example of 31 clocks is quite minimalistic: It's even better to count to the value corresponding to 10-100 ms, if such a delay is acceptable. By doing so, any wobbles that are shorter than this are contained by the reset controller.

The second reason for a long reset pulse is that the distribution of the master synchronous reset into local copies of it (as explained previously) involves one clock delay each time the reset tree is branched. A long reset pulse ensures that at some point in time, all logic is exposed to an asserted synchronous reset. It's still a bad idea to have uneven delays in the reset path, as the deassertion of the reset becomes uneven as well, but sometimes there's no problem with that. For this matter, the 31 clock pulse is way longer than probably necessary, however it doesn't hurt.
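
For a count that corresponds to 10-100 ms, the terminal value can be derived from the clock frequency instead of being written as a magic number. The following is a sketch only: The module name, the parameter names and their default values (a 25 MHz clock, 50 ms) are assumptions for the sake of the example, and $clog2 requires a tool that supports Verilog-2005.

module long_reset_count #(
  parameter CLK_HZ = 25000000,  // Frequency of wakeup_clk (assumed)
  parameter RESET_MS = 50       // Desired reset duration (assumed)
) (
  input      wakeup_clk,
  input      clear_counter,
  output reg master_reset
);
  // Number of clock cycles in RESET_MS milliseconds
  localparam COUNT_MAX = (CLK_HZ / 1000) * RESET_MS;

  // Just enough bits to hold COUNT_MAX
  reg [$clog2(COUNT_MAX + 1) - 1:0] reset_count;

  initial reset_count = 0;
  initial master_reset = 1;

  always @(posedge wakeup_clk)
    begin
      master_reset <= (reset_count != COUNT_MAX);

      if (clear_counter)
        reset_count <= 0;
      else if (reset_count != COUNT_MAX)
        reset_count <= reset_count + 1;
    end
endmodule

With the default values, @reset_count counts to 1250000, which takes 21 bits.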

Resets for other clock domains

@master_reset is a regular synchronous reset, but the clock it's synchronized with may be different from the ones used by the application logic. To produce synchronous resets for other clocks, use something like this for each clock:

reg reset_clk_pre1, reset_clk_pre2;
reg reset_clk;

always @(posedge clk)
  begin
    reset_clk <= reset_clk_pre2;
    reset_clk_pre2 <= reset_clk_pre1;
    reset_clk_pre1 <= master_reset;
  end

This is just a regular clock domain crossing with three stages, which produces @reset_clk, the synchronous reset for the clock @clk. Two stages are actually enough, but since it's an important signal, I've indulged it with an extra register, to be extra safe.

The reset controller's clock

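There are in principle three kinds of clocks that can be used as @wakeup_clk, i.e. for the reset state machine: An external clock that is fed directly to the FPGA (typically the reference clock), a clock that is generated by a PLL on the FPGA itself, or the FPGA's internal ring oscillator.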

The first option is easiest to work with. Since there's almost always a reference clock that drives the FPGA's PLLs, this reference clock can often be used directly as the wakeup clock. It's however important to verify that the clock is indeed stable when the FPGA wakes up, i.e. that the FPGA's bitstream load time is longer than the time it takes the external oscillator to produce a valid clock. Datasheet-wise this is usually the case with a large margin, but if the board's powerup sequence isn't planned properly, it may very well be that the FPGA is given a go-ahead to read its bitstream long before the supply voltage of the oscillator has reached its target.

The second option is to use a clock that is generated by a PLL on the FPGA itself. The obvious advantage is that this clock is possibly used for the application logic as well, so it's more efficient with clock resources as well as power. This setting requires holding the reset state machine in its initial stage until the clock is legit, which can be done with something like

reg rst_src_pll;
reg rst_src_pll_pre;

initial rst_src_pll = 1;
initial rst_src_pll_pre = 1;

always @(posedge wakeup_clk)
  begin
    rst_src_pll <= rst_src_pll_pre;
    rst_src_pll_pre <= !pll_locked;
  end

where @pll_locked is the active-high lock detector output of the PLL that generates @wakeup_clk. Since the lock detector is an asynchronous signal, it's first synchronized with @wakeup_clk, and then used as one of the reasons for asserting @clear_counter, as defined above.

Another advantage of this option is that if the reference clock is temporarily unstable or absent (in particular after powering on the board), there's a good chance that the lock detector will be unstable as well. Hence if a long count until deasserting @master_reset is chosen (surely much higher than 31), there's a chance that the FPGA will remain firmly in reset until the reference clock is good to work with. This can't be depended upon however.

In any case, the fact that @wakeup_clk is the output of a PLL necessarily means that it clocks logic before it has stabilized. It's therefore not clear if the reset machinery functions during that time period. One possible answer is that it doesn't matter, because at some point the clock will become good enough, and proper resets will be generated then. If everything is left in the past after a reset, who cares that it messed up a bit at first?

A more rigorous approach is that the resets must be held steadily asserted until the wakeup clock is locked, so the FPGA doesn't behave weirdly soon after its configuration. This requires paying attention to implementing only very simple logic with it, of the sort that doesn't fail significantly if the clock runs faster than expected.

To analyze what happens if @wakeup_clk runs temporarily with a frequency too high, note that the only register in the reset state machine that is a vector is @reset_count. This means that all other registers' flip-flops may at worst sample their calculated next value one clock later because of timing violations. In particular, because @pll_locked is deasserted while the PLL is unlocked, @rst_src_pll soon becomes steadily asserted, and hence @clear_counter asserted steadily as well. When the D input (next value) of a flip-flop doesn't change, it doesn't matter how fast the clock is.

The only possible problem is hence with @reset_count, which might count up incorrectly until its calculated next value is steadily held at zero because of @clear_counter. For example, if its current value is 3 (binary 011), its next calculated value is 4 (binary 100), but if the two LSBs aren't sampled because of timing, and the third bit is sampled nevertheless, the counter's value can jump to 7 (111 binary) instead.

To prevent this from happening, the initial values of the chain of registers from @pll_locked to @clear_counter are all assigned in favor of holding @reset_count steadily at zero. Hence if @pll_locked remains steadily unasserted until the @wakeup_clk is correct (as it should), @reset_count won't move away from zero, and @master_reset remains firmly asserted.

As for the last option, to use the FPGA's ring oscillator for the reset controller: I haven't tried it myself, so I'm not sure how good an idea it is. But if it saves someone who is stuck with no other option, here's more or less how to do it with Xilinx devices: Look up the Configuration User Guide for your device for a primitive called STARTUPE2 or something like that. It should have an output named CFGMCLK, which is a clock from an inaccurate ring oscillator on the FPGA itself. Its frequency is around 50-65 MHz. This clock is guaranteed to be stable when the FPGA wakes up, and I would set the timing constraint to a considerably higher frequency (say, 100 MHz).

But this is something I would do if there really was no other choice. For example, if the external reference clock isn't stable when the FPGA wakes up, so an extra delay needs to be implemented in logic.
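
For what it's worth, this is roughly what such an instantiation might look like on a 7-series device. It's an untested sketch: The port list follows Xilinx's instantiation template for STARTUPE2 as I recall it, and the constant values tied to the unused inputs are my assumptions. The module name and instance names are made up. Check all of this against the Configuration User Guide and the libraries guide for the specific device before relying on it.

module ring_osc_clk (
  output wakeup_clk
);
  wire cfgmclk;

  STARTUPE2 #(
    .PROG_USR("FALSE"),
    .SIM_CCLK_FREQ(0.0)
  ) startup_ins (
    .CFGCLK(),
    .CFGMCLK(cfgmclk),   // Clock from the internal ring oscillator
    .EOS(),
    .PREQ(),
    .CLK(1'b0),          // The inputs below aren't used in this sketch;
    .GSR(1'b0),          // their tie-off values are assumptions
    .GTS(1'b0),
    .KEYCLEARB(1'b1),
    .PACK(1'b0),
    .USRCCLKO(1'b0),
    .USRCCLKTS(1'b1),
    .USRDONEO(1'b1),
    .USRDONETS(1'b1)
  );

  // Route the ring oscillator's output to a global clock buffer
  BUFG wakeup_bufg (
    .I(cfgmclk),
    .O(wakeup_clk)
  );
endmodule

Note that only one STARTUPE2 can be instantiated in a design, so if some other block already uses it (e.g. for access to the configuration flash), CFGMCLK has to be taken from that instance.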

Resetting PLLs

It's generally a good idea to reset the PLLs as part of the reset sequence. This ensures that they get reset when their reference clock is known to be valid. Also, if reset is launched by the user (e.g. pressing a reset pushbutton), this might be in response to a problem that stems from a poorly locked PLL. Shouldn't happen, but if it does.

@clear_counter can't be used to reset the PLL, because it's asserted when any PLL is unlocked. If it were used, the PLL would remain in reset state, would never get locked, and the reset would never be released. For the same reason, the PLLs' resets can't be derived from @reset_count: It's held at zero when any PLL is unlocked.

The solution is to create a separate reset register for the PLL, which is similar to @clear_counter. So with the notation from above that unlocked PLLs are reflected by @rst_src_pll, it boils down to something like this:

reg clear_counter;
reg reset_plls;

initial clear_counter = 1;
initial reset_plls = 1;

assign reset_sources = rst_src_shreg || rst_src_debounce;

always @(posedge wakeup_clk)
  begin
    clear_counter <= rst_src_pll || reset_sources;
    reset_plls <= reset_sources;

[ ... ]

In this code snippet, @rst_src_pll has been excluded from @reset_sources, and is used only for @clear_counter. As a result, the PLLs are reset along with the entire FPGA, except when the reason for the reset is that the PLLs themselves aren't locked.

Note that if @reset_sources wobbles, so will @reset_plls as implemented above. This is usually harmless, since nothing bad happens to PLLs when their reset goes on and off like crazy. There's nevertheless a way to avoid this too, as explained next.

More complex startup sequences

The use of a plain counter (@reset_count) as the state variable makes it easy to implement more complicated startup sequences. For example, implementing proper asynchronous resets, which are asserted with the clocks gated, is just a matter of simple logic expressions that define the time slots when the clocks are turned off and when the resets are asserted.

This plain counter method is therefore a good starting point even for designs that appear to need a simple reset at the beginning of the design cycle: Should it turn out that a complex startup sequence is required, it's easy to expand the existing logic into one. Planning the startup sequence boils down to defining the time periods at which each phase in the sequence is in effect, and translating this into value ranges of @reset_count.

For example, if the design has two or more independent clocks, just synchronizing @master_reset into each clock domain results in the clock domains coming out of reset in effectively random order. This is usually not an issue, but if it is, each clock domain can have its scheduled reset release, based upon different values of @reset_count.
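
A possible sketch of such a scheduled release, for two hypothetical clock domains @clk_a and @clk_b, is shown below. The module name, signal names and the thresholds (15 and 31) are chosen for the example only. The order of release is meaningful only if the gap between the two thresholds is longer than the delay through the synchronizers, so with clocks that are much slower than @wakeup_clk, a wider counter and a larger gap are needed.

module staggered_resets (
  input       wakeup_clk,
  input [4:0] reset_count,  // The reset state machine's counter
  input       clk_a,
  input       clk_b,
  output reg  reset_a,      // Synchronous reset for clk_a, released first
  output reg  reset_b       // Synchronous reset for clk_b, released later
);
  reg hold_a, hold_b;
  reg reset_a_pre1, reset_a_pre2;
  reg reset_b_pre1, reset_b_pre2;

  initial hold_a = 1;
  initial hold_b = 1;
  initial reset_a = 1;
  initial reset_a_pre1 = 1;
  initial reset_a_pre2 = 1;
  initial reset_b = 1;
  initial reset_b_pre1 = 1;
  initial reset_b_pre2 = 1;

  // Each domain's reset is asserted during its own range of reset_count
  always @(posedge wakeup_clk)
    begin
      hold_a <= (reset_count < 15);
      hold_b <= (reset_count != 31);  // 16 clock cycles after hold_a
    end

  // Clock domain crossing, same as for master_reset
  always @(posedge clk_a)
    begin
      reset_a <= reset_a_pre2;
      reset_a_pre2 <= reset_a_pre1;
      reset_a_pre1 <= hold_a;
    end

  always @(posedge clk_b)
    begin
      reset_b <= reset_b_pre2;
      reset_b_pre2 <= reset_b_pre1;
      reset_b_pre1 <= hold_b;
    end
endmodule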

It's also possible to implement more than one counter, when the reset sequence involves waiting for certain conditions to be met. For example, if the reset sequence involves resetting PLLs, waiting for them to lock, and then continuing the reset sequence, it makes sense to maintain one counter that is held at zero until all PLLs are locked. The second counter is held at zero until the first counter finishes counting.
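
The two-counter arrangement could look something like the sketch below. It assumes that @wakeup_clk doesn't depend on the PLLs that are being reset (e.g. it's the external reference clock), and that @rst_src_pll is high when any PLL is unlocked, as before. The names @lock_count, @lock_done and @hold_lock_count, as well as the counters' widths, are made up for this example.

module two_phase_reset (
  input      wakeup_clk,       // Assumed independent of the PLLs
  input      rst_src_pll,      // High when any PLL is unlocked
  input      rst_src_shreg,
  input      rst_src_debounce,
  output reg reset_plls,
  output reg master_reset
);
  reg [9:0] lock_count;
  reg [4:0] reset_count;
  reg       hold_lock_count;

  wire      reset_sources = rst_src_shreg || rst_src_debounce;
  wire      lock_done = (lock_count == 1023);

  initial lock_count = 0;
  initial reset_count = 0;
  initial hold_lock_count = 1;
  initial reset_plls = 1;
  initial master_reset = 1;

  always @(posedge wakeup_clk)
    begin
      reset_plls <= reset_sources;
      hold_lock_count <= rst_src_pll || reset_sources;

      // First counter: Held at zero until all PLLs are locked
      if (hold_lock_count)
        lock_count <= 0;
      else if (!lock_done)
        lock_count <= lock_count + 1;

      // Second counter: Held at zero until the first counter finishes
      if (!lock_done)
        reset_count <= 0;
      else if (reset_count != 31)
        reset_count <= reset_count + 1;

      master_reset <= (reset_count != 31);
    end
endmodule

More phases can be added in the same way, each with its own counter that is held at zero until the previous phase is done.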

Note that if a PLL goes out of lock, the PLLs themselves will not be reset, but everything that depends on them will. This may not be the desired behavior, as a PLL losing lock is a serious fault in most designs. To request a full reset in the event of a loss of lock, add @pll_restart as defined below to the signals that are ORed to produce @reset_sources:

assign pll_restart = rst_src_pll && !master_reset;

This simply says: If the PLL went out of lock after master reset was released, reset everything again, including the PLLs. For this to work, the reset count must be long enough to contain the lock detectors' possible wobbles. In other words, the reset count must be longer than the time the PLL's lock detector can be high even though it's not locked yet. This is not to be confused with how long it takes for the PLL to lock. Some lock detectors don't wobble at all, and these wobbles are most likely significantly shorter than the PLL's specified lock time.
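
Put together with the reset state machine from above, this might look like the following sketch. The module name, the 20-bit width of @reset_count and the name COUNT_MAX are arbitrary choices for the example (at 25 MHz, this count takes about 42 ms). It's also assumed that @wakeup_clk keeps running while the PLLs are held in reset, e.g. because it's the external reference clock.

module reset_with_pll_restart (
  input      wakeup_clk,       // Assumed independent of the PLLs
  input      rst_src_pll,      // High when any PLL is unlocked
  input      rst_src_shreg,
  input      rst_src_debounce,
  output reg reset_plls,
  output reg master_reset
);
  localparam COUNT_MAX = 20'hfffff;

  reg [19:0] reset_count;
  reg        clear_counter;

  // Loss of lock after master_reset was released
  wire       pll_restart = rst_src_pll && !master_reset;
  wire       reset_sources = rst_src_shreg || rst_src_debounce || pll_restart;

  initial reset_count = 0;
  initial clear_counter = 1;
  initial reset_plls = 1;
  initial master_reset = 1;

  always @(posedge wakeup_clk)
    begin
      clear_counter <= rst_src_pll || reset_sources;
      reset_plls <= reset_sources;

      master_reset <= (reset_count != COUNT_MAX);

      if (clear_counter)
        reset_count <= 0;
      else if (reset_count != COUNT_MAX)
        reset_count <= reset_count + 1;
    end
endmodule

Note that @pll_restart is deasserted as soon as @master_reset is asserted, so the pulse on @reset_plls that results from a loss of lock is only a few clock cycles long.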

The wakeup shift register

It may be a bit of an overkill, but I usually add a wakeup shift register of this sort in my designs:

reg [15:0]   wakeup_shift;
reg          rst_src_shreg;

initial rst_src_shreg = 1;
initial wakeup_shift = 0;

always @(posedge wakeup_clk)
  begin
    rst_src_shreg <= !wakeup_shift[15];
    wakeup_shift <= { wakeup_shift[14:0], 1'b1 };
  end

Since most FPGAs implement @wakeup_shift as a hardware shift register, which consumes the equivalent of a LUT, it's cheap in resources, and offers another mechanism to make sure a reset takes place during powerup. This is required in a design without PLLs, as nothing else will assert @clear_counter, but even if there are PLLs, there's a possibility that they will already be locked when the FPGA wakes up, as this is a possible configuration option on some FPGAs.

So in any case, this is a recommended better-safe-than-sorry piece of logic.

External reset button

Reset buttons are quite common. What they do exactly differs from one design to another. One possibility is that the button that the user considers "reset" is connected to the FPGA pin that kicks off the reload of the bitstream into the FPGA. It may also be connected to a processor's reset pin, if such is present.

And it could be connected to just a general I/O pin on the FPGA, for the purpose of resetting the FPGA logic. In this case, it's yet another reason to assert @clear_counter.

If @reset_count counts to a value that corresponds to 10 ms or more, there's no need to debounce the signal from the input pin, as the wobbles will be absorbed by the long count. In this case, this is enough:

reg rst_src_debounce;
reg rst_src_debounce_pre;

initial rst_src_debounce = 1;
initial rst_src_debounce_pre = 1;

always @(posedge wakeup_clk)
  begin
    rst_src_debounce <= rst_src_debounce_pre;
    rst_src_debounce_pre <= reset_button_pin;
  end

However if the count is short (as shown in the example above, reaching just 31), the pushbutton input needs debouncing. There are several ways to implement this, for example:

reg [17:0] debounce_count;
reg        rst_src_debounce;
reg        reset_button_d, reset_button_d2, reset_button_d3;

wire       debounce_reached = (debounce_count == 250000);

initial debounce_count = 0;
initial rst_src_debounce = 1;

always @(posedge wakeup_clk)
  begin
    reset_button_d3 <= reset_button_d2;
    reset_button_d2 <= reset_button_d;
    reset_button_d <= reset_button_pin;
	
    if (reset_button_d2 != reset_button_d3)
      debounce_count <= 0;
    else if (!debounce_reached)
      debounce_count <= debounce_count + 1;

    if (debounce_reached)
      rst_src_debounce <= reset_button_d3;
  end

This is a relatively strict debouncer, which doesn't play ball with noisy input signals — which is an advantage or disadvantage, depending on your needs.

With a 25 MHz clock, the value of @reset_button_pin is copied into @rst_src_debounce once it has been stable for 10 ms (250000 clock cycles). Note that @debounce_count is zeroed on the same clock edge at which @reset_button_d3 changes value, so a new value of @reset_button_d3 is never copied into @rst_src_debounce immediately when it changes, but only after it has been stable long enough.

Summary

There are many things to consider when designing the initialization of an FPGA. This page has presented some concepts and ideas, but it's important to keep in mind that the real task is to correctly recognize the events that need to trigger action. It's also important to define the correct response to each such event in order to robustly bring the logic to a working state.

Copyright © 2021-2022. All rights reserved. (42e6e8c4)