This page is the first of four in a series about FIFOs.
The FPGA FIFO is a memory element with a simple concept: One piece of application logic writes data words on one side of the FIFO. On its other side, another piece of application logic reads these words from the FIFO, in the same order (FIFO = First In First Out).
This data is stored inside the FIFO. How many words of data it's capable of storing is referred to as the FIFO's depth, which is a configurable parameter. And so is the width of each word (i.e. the number of bits per word).
FIFOs are maybe the most commonly used IP in FPGA designs. Whenever one piece of logic generates data, and another consumes it, the immediate solution that comes to mind is putting a FIFO between them (not saying it's always the right one, of course...).
For those familiar with the UNIX / Linux command-line interface, the use of FIFOs can be compared with pipes between commands: One program's output becomes the other one's input, and some magic machinery between them takes care of the rest.
Because of their ubiquitous use, there's a de-facto agreement on how an FPGA FIFO should behave. Every FPGA development suite offers a way to generate a FIFO IP module for use by the application design. Not only that, this FIFO module is very likely to have a certain set of ports that behave just like those of any other FPGA FIFO.
The FPGA vendor's software allows generating FIFOs that fit your specific needs. It's just a matter of configuring its attributes in some GUI tool (width, depth and other attributes I'll discuss), and the tools take care of the rest. All that is left to you is to instantiate the module in your design. Unlike a lot of other things in the FPGA world, this one is really that simple.
As each FPGA vendor presents its own FIFO IP, it's of course important to read the documentation down to the fine print. That said, odds are that the default settings correspond to what I'll refer to as a "baseline FIFO". The terms used by different FPGA vendors differ somewhat, and so do the FIFO module's port names. Also, each vendor offers a slightly different set of extra features and configuration options. But there's definitely a set of base features that are always available.
However, the implementation of the FIFO in the logic fabric differs from one vendor to another, so understanding the meaning of the FIFO's attributes is important for making good use of the FPGA's resources.
All in all, understanding FPGA FIFOs is a one-time effort. Once you've got the hang of it on one FPGA, the rest will follow. Which is by itself a reason for their ubiquity.
The baseline FIFO
There is no written standard for FPGA FIFOs, but there's a wide agreement on their merits nevertheless.
All FIFOs have two interfaces, one for writing words and one for reading words. Let's look at the instantiation of what I'll call the "baseline FIFO". It has important variations, which I'll get to later.
myfifo myfifo_ins (
    .rst(rst),       // Asynchronous reset input
    // Write interface ports
    .wr_clk(wr_clk), // Write clock input
    .wr_en(wr_en),   // Write Enable input
    .din(din),       // Write word input
    .full(full),     // Full output
    // Read interface ports
    .rd_clk(rd_clk), // Read Clock input
    .rd_en(rd_en),   // Read Enable input
    .dout(dout),     // Read word output
    .empty(empty)    // Empty output
);
The names of the ports are those common with Xilinx' tools, but other FPGA vendors use similar names.
The FIFO module's ports are divided into three groups: A reset signal (@rst), which I shall get back to later, and as expected, a write interface and a read interface, each consisting of four ports.
The @din and @dout ports are vectors, carrying the data words going in and out from the FIFO. How wide these words are is something you decide when setting up the FIFO, using the relevant software tool. You'll also set up the depth of the FIFO, i.e. how many words it can contain. These two parameters influence how much FPGA memory resources the implementation of the FIFO will consume.
It's worth noting that each of these two interfaces has its own clock, @wr_clk and @rd_clk. Each of these is the clock that drives the other ports of the interface it's related to.
This gives rise to one of the most common uses of FIFOs: Crossing clock domains. If some logic in your design is clocked by clk_A, and another part is clocked by clk_B, how do you make them work together? The first thought of any FPGA engineer is to put a FIFO between them. This is mainly because crossing clock domains is a major headache, and using a FIFO solves the problem easily and reliably.
The write interface
The write interface is simple: @wr_clk, @wr_en and @din are inputs to the FIFO, and @full is an output.
When @wr_en is high on the rising edge of @wr_clk, the data in @din is pushed into the FIFO. The @full port is high when the FIFO is full.
For example, this is a waveform of writing five words to the FIFO:
In this waveform the application logic first writes the words D0 and D1. The FIFO raises its @full output to inform that it became full after the successful write of D1, to which the application logic responds by lowering @wr_en during the same clock cycle. After a couple of clock cycles, @full is brought low (by the FIFO) to indicate that it's fine to write to it again, most likely due to activity on the reading side.
The application logic could have begun writing on the same clock cycle that @full went low on. However, in this specific example it begins doing so slightly later, writing three additional words.
In the waveform, where @din is marked with the "Dx" value, it means that the value is ignored and therefore doesn't matter. For example, in the "Dx" segment between D1 and D2, @din could have remained on D1, switched to D2 earlier than shown, or something completely different. The result would have been the same.
For a simple coding example, suppose I want to fill the FIFO with words that count up whenever possible:
assign wr_en = !full;
always @(posedge wr_clk)
  if (wr_en) // Count up only when the word is actually written
    din <= din + 1;
This exemplifies the correct relation between @full and @wr_en: If @full is high, @wr_en must not be high on the same clock cycle. And what if it is? What if we ignore the @full signal? Odds are that the FIFO will ignore @wr_en in this case, as if it wrote only when @the_real_wr_en, defined as follows, was high:
assign the_real_wr_en = wr_en && !full;
However some FPGA tools allow configuring the FIFOs without this safety mechanism, in which case virtually anything can happen if the FIFO is written to, despite being full.
One way or another, @full should be respected, or else it will appear as if data has leaked away. Consider the example above: Had @wr_en been held high all the time, @din would have kept on incrementing whether the data was written to the FIFO or not. So when reading the data at the other end, the count would have been discontinuous.
Note that @full can go from low to high only as a result of a write cycle, i.e. following a rising clock edge with @wr_en driven high. Except for when the FIFO is reset, as discussed further below.
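To catch violations of this rule during simulation or hardware debugging, a sticky flag can be attached to the write interface. This is only a sketch: The module name @wr_overflow_monitor and the @wr_overflow output are made up for this example, and are not part of any vendor's FIFO.

```verilog
// Hypothetical debug aid, not a vendor module: latches a flag if the
// application logic ever asserts wr_en while the FIFO reports full.
module wr_overflow_monitor (
    input      wr_clk,      // Same clock as the FIFO's write interface
    input      wr_en,       // Write Enable, as driven by the application logic
    input      full,        // Full output of the FIFO
    output reg wr_overflow  // Sticky: remains high after the first violation
);
    initial wr_overflow = 0;

    always @(posedge wr_clk)
        if (wr_en && full)
            wr_overflow <= 1;
endmodule
```

If @wr_overflow ever goes high, the writing logic has a bug, regardless of whether the FIFO silently ignored the offending write.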
The read interface
The read interface is quite similar, but not exactly the same. @rd_clk and @rd_en are inputs to the FIFO, @dout and @empty are outputs.
When @rd_en is high on the rising edge of @rd_clk, a new word is read from the FIFO's memory, and @dout is updated with its value following that rising edge, i.e. on the next clock cycle. The @empty port is high when the FIFO is empty.
In this example waveform five words are read from the FIFO:
In this waveform the application logic begins with reading three words. In response to @empty going high along with the appearance of D2, the application logic brings @rd_en low in the same clock cycle. As before, it could have returned @rd_en back to high on the same clock cycle that @empty went low (due to data being written into the FIFO on the other end), but instead it waited for a few clock cycles, and then read two additional words.
Picky note alert: If you compare this waveform with the one above it, you may notice that five words were written, and five words were read. So why didn't @empty go high along with the appearance of D4? Well, because I wanted to show that one can stop reading even if the FIFO isn't empty. So for the sake of this imaginary example, there were additional words written to the FIFO, hence the FIFO didn't get empty after reading D4.
Note that @dout retains its value as long as no read cycle changes it. The application logic may rely on this: Except for after the FIFO has been reset, and until the first read cycle, @dout always contains the value of the last fetched word.
Even more important, note that the new value of @dout appears one clock after the clock cycle @rd_en is high. As if the Verilog code of the FIFO said
always @(posedge rd_clk)
  if (rd_en && !empty)
    dout <= next_word_to_show;
This bogus snippet also expresses the fact that most FIFOs ignore @rd_en if @empty happens to be high at the same clock cycle. As with the write interface, @rd_en should not be high if @empty is high on that clock cycle. Once again, sometimes the FIFO can be configured not to have this protection mechanism, so don't break this rule.
@empty can go from low to high only following a read cycle, i.e. when the clock is rising with @rd_en high. Except for when the FIFO is reset.
To give an example, here's bare-bone Verilog code (that is, without reset) that reads words from the FIFO, and calculates the cumulative sum of those.
assign rd_en = want_to_read_now && !empty;
always @(posedge rd_clk)
  begin
    rd_en_d <= rd_en; // High when @dout holds a freshly read word
    if (rd_en_d)
      sum <= sum + dout; // Don't try this at home: @sum is never reset.
  end
For the sake of demonstration, I've added a @want_to_read_now signal, which indicates that the logic wants to read. @rd_en is nevertheless high only if the FIFO isn't empty.
Note that @rd_en is delayed by one clock cycle into @rd_en_d, which is high at the same time as when there's a new and valid value in @dout. Accordingly, @rd_en_d is used as the condition for using @dout's value. This demonstrates the slight trickiness that the one-clock delay between @rd_en and @dout causes.
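For completeness, the same reader can be wrapped into a module that does have a reset. Take this as a sketch under assumptions rather than a reference design: The module name @fifo_sum, the @rst_sync input and the two width parameters are made up for this example, and @rst_sync is assumed to be synchronous to @rd_clk.

```verilog
// Hypothetical reader module: sums up the words read from a FIFO.
// Its empty and dout inputs connect to the FIFO's outputs of the same
// name, and its rd_en output drives the FIFO's rd_en input.
module fifo_sum #(
    parameter WIDTH = 8,       // Width of the FIFO's words (assumption)
    parameter SUM_WIDTH = 32   // Width of the accumulator (assumption)
) (
    input                      rd_clk,
    input                      rst_sync,          // Assumed synchronous to rd_clk
    input                      want_to_read_now,
    input                      empty,
    input      [WIDTH-1:0]     dout,
    output                     rd_en,
    output reg [SUM_WIDTH-1:0] sum
);
    reg rd_en_d;

    // Never request a read from an empty FIFO
    assign rd_en = want_to_read_now && !empty;

    always @(posedge rd_clk)
        if (rst_sync) begin
            rd_en_d <= 0;
            sum     <= 0;
        end else begin
            rd_en_d <= rd_en;       // dout is valid one clock after rd_en
            if (rd_en_d)
                sum <= sum + dout;  // dout is zero-extended to SUM_WIDTH
        end
endmodule
```

Clearing @rd_en_d on reset matters as much as clearing @sum: Otherwise a stale value of @dout could be added right after the reset is released.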
Synchronization and latency
Because I drew the example waveforms above separately for write and read, they miss an important point: It takes a few clock cycles from writing the first word into an empty FIFO until the @empty port goes low. Likewise, it takes a few clock cycles from reading the first word from a full FIFO until the @full port goes low.
This happens because the information about writing into the FIFO needs to cross clock domains before reaching the logic of the FIFO's reading half. Clock domains are crossed with synchronization logic, which takes a few clock cycles to get across. Hence the @empty port responds slightly later. And vice versa with the @full port.
How many clock cycles is this delay? It depends on a lot of things, among others the relation between the two involved clocks' edges at the specific moment. In short, it's difficult to tell.
Among the things that do affect this delay is the number of synchronization stages, which is often a parameter one can set for the FIFO. Two stages is a common choice, but a larger number can be selected for those believing it will increase the FIFO's reliability, at the expense of some extra logic resources and increased latency of the @empty and @full ports, as just discussed.
So if you really feel like indulging your FIFO, increase the synchronization stages to three, to feel absolutely super safe.
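To make the term "synchronization stages" more concrete, here is a minimal sketch of a synchronizer for a single bit. It is not the FIFO's actual internal logic, which typically passes multi-bit, Gray-coded pointers through stages of this kind. The module name @sync_bit and its port names are made up for this example.

```verilog
// Hypothetical single-bit synchronizer. Each stage is one flip-flop
// clocked by the destination clock, so each stage adds one clock cycle
// of latency. STAGES must be 2 or more for this code to be valid.
module sync_bit #(
    parameter STAGES = 2
) (
    input  dest_clk,   // Clock of the domain that receives the signal
    input  async_in,   // Signal arriving from the other clock domain
    output sync_out    // Version of async_in that is safe to use with dest_clk
);
    reg [STAGES-1:0] sync_regs = 0;

    always @(posedge dest_clk)
        sync_regs <= { sync_regs[STAGES-2:0], async_in };

    assign sync_out = sync_regs[STAGES-1];
endmodule
```

A change in @async_in reaches @sync_out only after passing through all of the flip-flops, which is why a higher number of stages means that @empty and @full respond later.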
The reset input
All FPGA FIFOs have a reset signal. Since the FIFO is driven by two clocks, this reset signal isn't expected to be clocked with any of them — it's asynchronous. The FIFO's internal logic makes sure to synchronize the reset internally for each of the two clock domains.
So what does the reset do? Well, to begin with it empties the FIFO and sets @empty high. If there was any data in the FIFO, it's lost.
As for the @full output, it's common (and recommended) that FIFOs drive this high as well on reset, until the FIFO's write logic is ready to receive data. However this should be checked against the FIFO's docs, as this behavior can be optional. After all, the FIFO isn't full after reset. And pulling @full high on reset breaks this other rule, saying that this port rises only following a write cycle.
It's important to be aware that it takes a few clock cycles between the assertion of the reset signal, and the @empty and @full ports going high. This is because of the FIFO's synchronization logic. So things are a bit fuzzy during the few clock cycles around asserting the reset. Make sure that the application logic attempts neither to write to the FIFO nor to read from it during the few clock cycles around resetting the FIFO.
Even though the reset signal is asynchronous, it should be driven by the output of some register (flip-flop) of the FPGA, and not through combinatorial logic, as glitches of the latter might trigger the reset undesirably.
Actually, many FPGA engineers incorrectly assume that connecting virtually anything to the reset port will work. The FPGA vendor might however have unexpected specifications on the reset signal. For example, this is taken from Xilinx' Product Guide for its FIFO (PG057):
If the asynchronous reset is one slowest clock wide and the assertion happens very close to the rising edge of slowest clock, then the reset detection may not happen properly causing unexpected behavior. To avoid such situations, it is always recommended to have the asynchronous reset asserted for at least 3 [ ... ] slowest clock cycles...
(Chapter 3, "Resets")
So by all means, read your vendor's FIFO user guide on how to properly generate this reset signal.
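As an illustration of the two points above (drive the reset from a register, and keep it asserted for several cycles of the slowest clock), this is a sketch of a reset stretcher. The module name @fifo_rst_gen, the @slow_clk and @rst_request ports and the default of 8 cycles are all assumptions made for this example. The required duration must be taken from your vendor's documentation.

```verilog
// Hypothetical reset generator: holds rst high for HOLD_CYCLES cycles of
// slow_clk, counted from the clock edge that samples rst_request high.
// rst is the direct output of a flip-flop, so it is free of glitches.
module fifo_rst_gen #(
    parameter HOLD_CYCLES = 8   // Arbitrary margin; must be between 1 and 256
) (
    input      slow_clk,     // Assumed to be the slower of wr_clk and rd_clk
    input      rst_request,  // Assumed synchronous to slow_clk
    output reg rst           // Connect to the FIFO's rst port
);
    reg [7:0] count = 0;

    initial rst = 0;

    always @(posedge slow_clk)
        if (rst_request) begin
            rst   <= 1;
            count <= HOLD_CYCLES - 1;
        end else if (count != 0)
            count <= count - 1;   // rst remains high while counting down
        else
            rst <= 0;
endmodule
```

Note that this only takes care of the reset pulse itself. Keeping the application logic away from the FIFO around the reset is still a separate task.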
The underlying implementation
Even though the vendor's software tools take care of everything to make the FIFO operate properly, it's a good idea to be aware of which FPGA resources are utilized, in particular for avoiding shortage of some resource type.
Each FPGA has its options, but I'll briefly mention a few common ones:
- Fully hardware implemented. This usually means that a block RAM is used for storage, and on top of that, the logic that implements the FIFO is implemented in silicon rather than in the logic fabric. This doesn't save much logic, but implementation in silicon is probably better for high frequencies. The main drawback is that the feature set of such a FIFO is limited to what is implemented in silicon, so the FIFO may be relatively limited in size and trivial features may be missing.
- Block RAM FIFOs. This is the most common sort. The FIFO consists of as many block RAMs as needed to obtain the FIFO's width and depth, along with the logic implemented in logic fabric.
- Distributed RAM FIFOs. This is like Block RAM FIFOs, but logic slices are used as RAM instead of block RAMs. Recall that most FPGAs have the capability of using the slice's LUTs as RAMs, so this is an economic option in particular when the FIFO is shallow. I'd say, 32 words or less, but the tradeoff depends on the design and FPGA family.
- Shift-register based FIFOs. This is the exotic version of distributed RAM FIFOs: Because slices can also behave as shift registers, it's possible to save some logic by taking advantage of this.