01signal.com

FPGA FIFOs: Different features and variants

Scope

This page is the second of four in a series about FIFOs. After presenting the basics of a FIFO in the previous page, it's time to discuss common variants and extra features. FIFOs commonly employ more than one of these, so they are by no means mutually exclusive.

Single clock FIFOs

Even though the "baseline" FIFO has two independent clock inputs, it often happens that the logic on both of its sides runs on the same clock. It's perfectly fine to connect this same clock to both @wr_clk and @rd_clk, but then the FIFO contains a fair amount of logic for crossing clock domains that is unnecessary.

So every FPGA vendor offers two categories: Dual-clock and single-clock FIFOs. Other names are Independent vs. Common Clock FIFOs or Asynchronous vs. Synchronous FIFOs. The "baseline" FIFO presented on the previous page is a dual-clock one.

Single-clock FIFOs don't have any synchronization logic, as everything is running on the same clock. Hence the reset input is synchronous as well, so it has to be clocked by the same clock as the other ports.

Apart from saving some FPGA logic, the other good reason to use single-clock FIFOs is clarity. It's a way to state loud and clear that there's no intention to have two clocks involved.

In short: If the FIFO doesn't cross clock domains, go for a single-clock FIFO.

FWFT FIFOs

As emphasized in the previous page, the procedure for reading data from a "baseline" FIFO is to drive @rd_en high and obtain the value on the FIFO's @dout output on the following clock cycle. That's somewhat counterintuitive: If the data is already in the FIFO, why do I have to ask for it? Why can't the FIFO just put it on the @dout port, and tell me that it's fine to use it?

So there's a common FIFO variant doing exactly that, and it's called a First Word Fall Through (FWFT, sometimes also called read-ahead, show-ahead or look-ahead) FIFO. A non-FWFT FIFO is often referred to as a "Standard FIFO" (can someone show me the standard?).

The idea is simple: When an FWFT FIFO goes from empty to non-empty, it presents the first word on @dout. The application logic then reads subsequent words by asserting @rd_en. The change is hence just regarding the first word.

It's however easier to understand an FWFT FIFO when realizing that the meaning of two of its ports has changed: @rd_en on an FWFT FIFO actually means "I have just consumed the data on @dout, it's fine to bring the next one" and @empty actually means "@dout is not valid".

What hasn't changed is that @rd_en should not be high if @empty is high. You can't say that you've consumed invalid data. So the rule remains the same, for a different reason.

The following waveform shows what reading from an FWFT FIFO can look like:

Example waveform for reading words from an FWFT FIFO

Note that the first valid value at @dout appears while @rd_en is low, and that @empty goes low along with that. As I just mentioned, @empty means "@dout not valid" on an FWFT FIFO, and the waveform reflects that.

Also note that the first @rd_en pulse didn't read a new value from the FIFO, but rather caused @empty to go high again, and the value of @dout became unknown again. In reality, @dout usually remains the same as @empty rises, but you can't rely on that.

After this, the FIFO once again puts a value on @dout and pulls @empty low. The application reads three words, and then drives @rd_en low. All in all, the application logic consumed four words from the FIFO. It might have used the fifth word D4 as well. Even if it did so, it didn't allow the FIFO to go on to the next one (by keeping @rd_en low).

On a different note, the following bare-bone code calculates the cumulative sum of everything that comes out of the FIFO:

assign rd_en = !empty; // If @dout's value is valid, it's consumed.

always @(posedge rd_clk)
  if (!empty) // FIFO is FWFT, so !empty means @dout contains valid data
    sum <= sum + dout; // Don't try this at home: @sum is never reset.
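
For contrast, this is a minimal sketch of how the same cumulative sum might look with a "standard" (non-FWFT) FIFO, where @dout is valid only on the clock cycle after @rd_en was high. The module name, the 32-bit width, the @dout_valid register and the initial block are assumptions made for the sake of illustration; none of them come from a vendor's FIFO.

```verilog
// Sketch only: cumulative sum from a "standard" FIFO (hypothetical names)
module sum_standard_fifo
  (
   input             rd_clk,
   input             empty,
   input [31:0]      dout,   // Assumed width, just an example
   output            rd_en,
   output reg [31:0] sum
   );

   reg dout_valid; // High when @dout holds a word that was asked for

   initial begin
      dout_valid = 0;
      sum = 0; // Assumes that the FPGA tools honor initial values
   end

   assign rd_en = !empty; // Ask for a word whenever there is one

   always @(posedge rd_clk)
     begin
        dout_valid <= rd_en; // The word arrives one clock after @rd_en

        if (dout_valid)
          sum <= sum + dout;
     end
endmodule
```

The extra @dout_valid register is the price of having to ask for the data: The consuming logic must keep track of when the answer arrives.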

I'd just like to wrap up the FWFT topic by pointing out that the difference between the "standard" and FWFT FIFO echoes a fundamental issue regarding the data flow between any two pieces of logic: Does the receiving side need to ask for the data, or does the sending side present it as soon as possible, and the receiving side only confirms it's OK to go on? Always ask yourself this question when module X passes data to module Y, and in particular ask yourself if these modules are on the same page on this matter.

Asymmetric FIFOs

It's commonly allowed to define the FIFO with different widths for the read and write ports. This is useful, for example, if data arrives at the FPGA in 32-bit words, but the application logic processes it as bytes, that is 8 bits per word. In this case, set the write width to 32 bits and the read width to 8 bits. Both sides behave completely as usual, except that it so happens that it takes four read cycles to consume a word that was inserted with a single write clock cycle.

When the read width is larger than the write width, it behaves as one would expect: The data written to the FIFO isn't available at the read side until it has filled a full read word.

As for the order of packing the words, I've only seen FIFOs doing Little Endian. In other words, for the 32 bit to 8 bit FIFO just suggested, the first read fetches bits [7:0], then [15:8], [23:16] and [31:24]. But check the documentation to be sure.
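
To make the Little Endian order concrete, here is a small sketch of which part of a 32-bit written word appears on each of the four 8-bit reads. It is only a model of the ordering, not how a FIFO is implemented, and the module and signal names are made up for this example.

```verilog
// Sketch only: models the Little Endian read order of a 32-to-8 bit FIFO
module le_unpack_model
  (
   input [31:0] word,     // One word, as written to the FIFO
   input [1:0]  idx,      // 0 for the first read, 3 for the fourth
   output [7:0] byte_out  // What the read side sees on that read
   );

   // idx=0 selects [7:0], idx=1 selects [15:8], and so on
   assign byte_out = word[8*idx +: 8];
endmodule
```

So if 32'h44332211 is written, the reads are expected to yield 8'h11, 8'h22, 8'h33 and 8'h44, in that order, assuming that the FIFO is indeed Little Endian.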

And of course, 8 and 32 bits were just an example.

A few words on the combinatoric dependency on @empty and @full

The @empty and @full ports have a similar drawback: The application logic has to respond to them on the same clock cycle. In other words, @rd_en must be a combinatoric function of @empty in order to ensure they aren't high on the same clock cycle (which is forbidden, as already mentioned). By the same token, @wr_en must be a combinatoric function of @full.

Combinatoric relationships of this sort may become an obstacle in meeting timing constraints, when the clock frequency is high (relative to the targeted FPGA). The main reason is that both @rd_en and @wr_en are often used in the logic that produces or consumes the data, in particular as a clock enable: If the FIFO stalls the data flow, so should the pipeline.

Well, to be completely accurate, there's a way to avoid that combinatoric relationship. For example, suppose @wr_en is declared as a register, and @want_to_write is some signal that represents the application logic's need to write at a given clock cycle. One can go:

always @(posedge wr_clk)
  wr_en <= want_to_write && !wr_en && !full;

This ensures that @wr_en and @full are never asserted on the same clock cycle, because @full can turn high only on the clock cycle following the one on which @wr_en was high. The !wr_en component in the expression ensures that @wr_en is never high for two consecutive clock cycles, so if @full goes high, @wr_en will go low because of !wr_en on the first clock cycle, and remain low on the next one because of @full itself.

But this comes with the penalty of 50% utilization of the FIFO's write bandwidth, as @wr_en is forced low half the clock cycles. This is usually unacceptable.

The same reasoning goes for @rd_en.
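
For completeness, this is a sketch of the read-side counterpart, wrapped in a module so it stands on its own. The module name and @want_to_read (the application logic's wish to read on a given clock cycle) are hypothetical, and the initial value assumes that the FPGA tools honor initial blocks.

```verilog
// Sketch only: registered @rd_en with no combinatoric path from @empty
module rd_en_no_comb
  (
   input      rd_clk,
   input      empty,
   input      want_to_read, // Hypothetical request from application logic
   output reg rd_en
   );

   initial rd_en = 0;

   // @empty can only turn high on the clock cycle after @rd_en was high,
   // and !rd_en ensures that @rd_en is low on that clock cycle.
   always @(posedge rd_clk)
     rd_en <= want_to_read && !rd_en && !empty;
endmodule
```

As on the write side, @rd_en is low on at least every second clock cycle, so no more than half of the read bandwidth is used.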

This discussion was intended to lead to the next section: The "almost" ports.

Almost full, almost empty and friends

It's often possible to request the @almost_full and/or @almost_empty ports from the software that configures the FIFO. Doing so adds ports with these or similar names to the FIFO's instantiation port list.

@almost_empty is clocked by @rd_clk, and is high when there's one word or none to read from the FIFO. Likewise, @almost_full is clocked with @wr_clk, and is high when there's room to write just one word to the FIFO, or it's full.
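
One way to picture these definitions is as comparisons against the number of words in the FIFO. The following is only a sketch for a single-clock FIFO with a hypothetical @fill word counter and an assumed depth of 512 words. Actual FIFOs generate these flags differently, and on a dual-clock FIFO each side has its own view of the fill level.

```verilog
// Sketch only: the meaning of the flags in terms of a hypothetical counter
module almost_flags_model #(parameter DEPTH = 512) // Assumed depth
  (
   input [$clog2(DEPTH+1)-1:0] fill, // Number of words stored, 0 to DEPTH
   output                      empty,
   output                      almost_empty,
   output                      full,
   output                      almost_full
   );

   assign empty        = (fill == 0);
   assign almost_empty = (fill <= 1);         // One word or none to read
   assign full         = (fill == DEPTH);
   assign almost_full  = (fill >= DEPTH - 1); // Room for one word or none
endmodule
```

Note that @almost_empty is high whenever @empty is, and @almost_full is high whenever @full is.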

How does this help? Well, because this is perfectly fine:

always @(posedge wr_clk)
  wr_en <= want_to_write && !almost_full;

No combinatoric relationship, and no need to skip 50% of the write cycles. Because @wr_en is a register, it may still be high on the clock cycle at which @almost_full goes high, in which case there will be one more write operation. But that's fine, as there's one write slot left.

Note that if @want_to_write is held high continuously as the FIFO gets filled, the last write fills it completely. Otherwise it's possible that it gets, well, almost filled: If @wr_en happens to be low because of @want_to_write when it could have filled the last word, it will not get that chance again until the FIFO is emptied on the other side. @almost_full will remain high, with the FIFO holding one word less than its maximum.

That rarely matters, but for the sake of discussion, this ensures that the last word is used:

always @(posedge wr_clk)
  wr_en <= want_to_write && (!almost_full || (!full && !wr_en));

However I seriously doubt if this is useful anywhere.

The story with @almost_empty is similar, so this is OK (but don't copy this for your code):

always @(posedge rd_clk)
  rd_en <= want_to_read && !almost_empty;

As with @almost_full, there's this issue with the last word: If @rd_en volunteers not to read the last word in a FIFO because of @want_to_read, it loses the chance until the FIFO gets filled with more data. Unlike the @almost_full case, this can definitely be a big deal in some scenarios, because it means that there's data in the FIFO that was intended for reading, but remains stuck there.

So this is actually the safe way to go:

always @(posedge rd_clk)
  rd_en <= want_to_read && (!almost_empty || (!empty && !rd_en));

Fill counters

Application logic often performs operations in chunks. For example, logic that transmits packets of fixed length of data across some physical wires. If the data is stored in a FIFO, the application logic needs to know that there's enough data to fill a packet before it starts reading.

Likewise, application logic often produces a fixed amount of data for storage in a FIFO, such as reading a burst of data from external memory. The operation shouldn't start unless there's enough room in the FIFO to complete it.

For these purposes, FIFOs usually support fill counters and/or programmable empty and full ports. The fill counters (sometimes called data counters) come in different forms and shapes, much depending on the FPGA vendor, so be sure to read the FIFO's documentation carefully. There are three main issues to pay attention to:

And then there's programmable empty and full, which are an extended version of @almost_empty and @almost_full. The idea is that since the use of fill counters is almost certainly something like

assign dont_start_reading = (rd_data_count < 64);

why not offer that signal directly, and call it prog_empty? Once again, read the FIFO's documentation carefully.

In particular, when it's important to read the last word in the FIFO, be sure to ask yourself if your logic will indeed do so, with a line of thought similar to the discussion on @almost_empty.

Almost needless to say, you'll have to request these extra ports when configuring the FIFO, if you want them.
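
As an illustration of the chunk-oriented use case, here is a sketch of logic that writes a burst of fixed length only if the FIFO has room for all of it. It assumes that @prog_full has been configured to go high when there is room for less than BURST_LEN words, with whatever margin the FIFO's documentation requires for the flag's latency. The module name, @burst_request and BURST_LEN are made up for this example.

```verilog
// Sketch only: start a burst of writes only when @prog_full is low
module burst_gate #(parameter BURST_LEN = 64) // Hypothetical burst length
  (
   input      wr_clk,
   input      prog_full,     // Assumed threshold: Room for < BURST_LEN words
   input      burst_request, // Hypothetical: A burst is waiting to start
   output reg wr_en
   );

   reg [$clog2(BURST_LEN+1)-1:0] remaining; // Write cycles left in burst

   initial begin
      wr_en = 0;
      remaining = 0;
   end

   always @(posedge wr_clk)
     if (remaining != 0) // In the middle of a burst: Ignore @prog_full
       begin
          remaining <= remaining - 1;
          wr_en <= (remaining != 1); // Low again after BURST_LEN writes
       end
     else if (burst_request && !prog_full)
       begin
          remaining <= BURST_LEN;
          wr_en <= 1;
       end
endmodule
```

@prog_full is checked only before the burst begins. Once the burst has started, @wr_en stays high for exactly BURST_LEN clock cycles, so the room for the entire burst must be guaranteed in advance.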

AXI interface

This topic isn't directly related, but it's worth mentioning to avoid confusion, as this term often appears in the context of FIFOs.

AXI is a set of interfaces defined in the AMBA standard, which was introduced by ARM. As one might expect, FIFOs with an AXI interface are intended to work as part of a processor setup.

The interface of the "baseline" FIFO (as I chose to call it) is often referred to as "native" interface, to differentiate it from AXI.

There are two main AXI interfaces: The "regular" AXI (typically AXI3, AXI4 or AXI Lite), which is an address / data bus, and AXI-S (streamed AXI), which is intended for streams of data (possibly divided into packets).

When a FIFO is configured as an AXI3 / AXI4 or AXI Lite module, extra logic is added to it so it can be connected as an address / data peripheral to a processor through this interface. I won't elaborate on this further, because it's a completely different business.

But because the streaming interface is somewhat similar to the behavior of a FIFO, it's possible to convert the handshake signals of the AXI-S interface into a "native" one. Note that AXI-S often involves other signals that need tending to as well.

So given the AXI-S signals for writing to the FIFO as @axi_w_valid, @axi_w_ready and @axi_w_data, they can be connected to the ports of a FIFO with a "native" interface with

assign axi_w_ready = !full;
assign wr_en = axi_w_valid && axi_w_ready;
assign din = axi_w_data;

Likewise, the AXI-S signals for reading from the FIFO, @axi_r_valid, @axi_r_ready and @axi_r_data, can be connected to the ports of an FWFT FIFO with a "native" interface with

assign axi_r_valid = !empty; // Non-empty means valid with FWFT FIFOs
assign rd_en = axi_r_valid && axi_r_ready;
assign axi_r_data = dout;

Once again, note that for this to work, the FIFO must be an FWFT variant.

This wraps up the second page in this series on FIFOs. The next page shows how a single-clock FIFO is implemented in Verilog.

Copyright © 2021-2022. All rights reserved. (42e6e8c4)