01signal.com

Flow control: Methods and protocols for use with MGTs

This page is the eighth and last page in a series of pages introducing the Multi-Gigabit Transceiver (MGT).

Introduction

Flow control is a mechanism often used in different types of data links. This mechanism's purpose is to prevent a situation where the amount of transmitted data is more than what the receiving side can handle.

Protocols related to MGTs offer different levels of support for this feature. For example, logic that implements the PCIe protocol never accepts a packet for transmission if the buffers on the recipient's side are full. The application logic that uses the PCIe protocol doesn't need to implement anything in relation to this. Rather, the interface between the application logic and the PCIe protocol logic allows the protocol to temporarily refuse to accept new data (for example, with the help of AXI-S ports such as VALID and READY). By the same token, the application logic is allowed to temporarily refuse to accept data arriving from the other side. By doing so, the application logic protects itself from overflow.

It's important not to confuse flow control with the mechanisms used to prevent overflow in the MGT's own elastic buffer. Flow control has nothing to do with the skip symbols mentioned on the page about the MGT's own buffers. Rather, flow control protects the application logic's buffers. And as there are often several data channels that use the MGT as a shared resource, the flow control is enforced for each such channel separately and independently.

Generally speaking, protocols that define communication with a computer take care of the flow control mechanism. The application logic only needs to interface with the protocol logic with the help of simple handshake ports, just like with many other types of logic blocks. In addition to PCIe, SuperSpeed USB and SATA take care of the flow control by themselves.

Unfortunately, protocols for communication between FPGAs don't usually support flow control at this level. The only protocol that takes complete care of flow control is Xillyp2p. Other protocols for FPGAs only have a few features that may help when implementing flow control in the application logic.

This page goes through a few techniques for implementing flow control. A brief summary of how Aurora and Interlaken assist with flow control is given later. It's nevertheless important to remember that except for when using Xillyp2p, it's the application logic that is responsible for preventing an overflow.

Flow control techniques

The ultimate goal of flow control is to prevent the situation where the side receiving data gets more data than it can handle. Usually, the receiving side stores the arriving data in buffers (or a FIFO), so it boils down to ensuring that there is space left in those buffers for all data arriving.

Flow control is more difficult to implement for use with MGTs, mainly because the physical channel has a significant delay. Therefore, it takes time for a request to stop sending data to reach the transmitter. In addition, a certain amount of time passes from the moment the transmitter stops sending data until data stops arriving at the receiver.

Another difficulty with MGTs is that bit errors on the physical channel may cause the loss of a request to stop sending data.

In addition to these two difficulties, the data rate of MGTs is high, and there is usually an expectation to use this physical data channel efficiently.

Compared with simpler communication channels, for example a serial port (RS-232), the flow control for an MGT needs to be more sophisticated in order to guarantee that no overflow occurs. The following three techniques are explained below: XON / XOFF, time-limited XOFF ("pause") and credits.

When considering a protocol for use in a project, it's important to ask two questions: does the protocol take care of the flow control by itself, and if not, which of its features can help the application logic implement it?

In-band and out-of-band

Before discussing these three techniques separately, it's important to make a distinction between in-band flow control versus out-of-band flow control.

In any flow control mechanism, the receiving side needs to send requests or information to the transmitting side, in order to regulate the data flow. In many practical usage scenarios, there is a physical data channel in both directions. In other words, the receiving side also has an equivalent physical channel in the opposite direction for transmitting data.

If such a physical channel in the opposite direction exists, the question is whether this channel is used for sending flow control requests. If the channel is used this way, the mechanism is called in-band flow control. Otherwise, out-of-band flow control is applied.

Generally speaking, out-of-band flow control is a less elegant solution, in particular because separate physical wires are required for this purpose. However, if the MGT's channel is unidirectional, this is the only possibility.

This topic is discussed further below in relation to how Interlaken implements flow control.

And now, to the three flow control techniques.

XON / XOFF

XON / XOFF stands for transmit on / off. This is the simplest flow control mechanism, but it has a few significant shortcomings.

This method can be implemented in many ways, but the idea is always the same: The receiver sends some kind of message to the transmitter that means "stop transmitting now" when the receiver can't accept more data. This is what XOFF means. Later on, when the receiver can accept data, an XON is sent in order to request the resumption of the data flow. As these requests are simple, they are suitable for transmission both in-band and out-of-band.

All difficulties mentioned above regarding flow control with an MGT are demonstrated with the XON / XOFF method. The first difficulty is that it takes some time for the XOFF request to arrive at the transmitter: The transmitter continues to send data until this request arrives. In addition, the data that is already on the physical channel when the request arrives will continue to arrive at the receiver.

Therefore, the receiver must send an XOFF early enough so that the data that arrives afterwards can still be handled. But how early is early enough? This depends on the physical channel's round-trip time. In some scenarios, this parameter is known, or at least it's known that the round-trip time is short.

For example, the SATA protocol uses XON / XOFF. This makes sense, as this protocol is intended for communication with hard disks, which are physically close to the SATA controller. If the distance between the two link partners is unknown and possibly large, it can be difficult to define an optimal timing for sending an XOFF.
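The timing consideration above can be sketched with a short calculation. This is a simplified model with hypothetical numbers (buffer size, arrival rate and round-trip time are made up for illustration); a real design must use worst-case figures for its own channel.

```python
# Hypothetical figures for illustration only; not taken from any specific
# protocol or hardware.
BUFFER_SIZE = 4096        # receiver's buffer capacity, in data elements
ELEMENTS_PER_CYCLE = 1    # worst-case arrival rate at the receiver
ROUND_TRIP_CYCLES = 700   # time for the XOFF to reach the transmitter, plus
                          # the time for in-flight data to drain, in clock cycles

# Data that can still arrive after the XOFF is sent: everything that the
# transmitter sends until the XOFF arrives, plus what is already on the wire.
in_flight_max = ROUND_TRIP_CYCLES * ELEMENTS_PER_CYCLE

# The XOFF must be sent no later than when the buffer's fill level reaches
# this threshold, or else an overflow is possible.
xoff_threshold = BUFFER_SIZE - in_flight_max
print(xoff_threshold)  # 3396
```

Note that if the round-trip time is unknown, there is no safe value for this threshold, which is exactly the difficulty described above.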

Another factor is whether the transmitter can stop immediately when it receives an XOFF. For example, when the transmitted content consists of packets, the protocol may not allow halting the transmission in the middle of a packet. All scenarios must be taken into account when defining the condition for sending an XOFF or an XON.

Another issue is what happens if the XOFF message is lost due to a bit error or some other failure with the physical channel. If this happens, the transmitter may continue the data flow, causing an overflow. The protocol must therefore ensure that the transmission is halted when there is a possibility of a lost XOFF request. See below for how Interlaken tackles this problem.

To summarize this mechanism: XON / XOFF is based upon a simple concept, but using this method to ensure that an overflow never occurs is unfortunately difficult, and requires careful attention to unexpected scenarios. The credits method, discussed below, is the mirror image: Complicated to understand, yet achieves its goal easily.

Time-limited XOFF ("Pause")

The time-limited XOFF is a variant of the XON / XOFF method: A number is attached to the flow control request. This number indicates for how long the transmitter should pause the data transmission, from the moment it receives this request. More precisely, this is the number of clock cycles during which the transmitter shouldn't send data. This method resembles an Ethernet Pause frame.

The advantage of this method is that an XON is not necessary afterwards. This feature can be useful in some specific situations, but even then the advantage is quite small.

The only reason time-limited XOFF is mentioned here is that this method is part of the Aurora protocol.

Credits

Credits is a limit on the total number of data elements that are allowed for transmission since the communication channel was initialized. This number is sent from the receiver to the transmitter by virtue of some type of control channel, which is defined by the protocol. By doing this, the receiver controls the amount of data that is sent to it.

This is best explained with a simple example: Let's say that the receiver's buffer initially has the ability to receive 1000 data elements. Accordingly, it sends the transmitter a message saying that the credits is 1000 data elements. The transmitter may send 1000 data elements immediately, or maybe later on. However, as long as the credits isn't updated, the total number of data elements that the transmitter sends doesn't exceed 1000.

Some time goes by, and 500 data elements have arrived at the receiver. While this happened, the application logic consumed 100 data elements. In this situation, there are 400 data elements in the buffer, so there is room for an additional 600 data elements in the buffer.

At this point, the receiver sends another message, updating the credits to 1100. This reflects the fact that 500 data elements have already arrived, and 600 more are allowed, even if the application logic doesn't consume data from the buffer. Hence the buffer is filled if the transmitter has sent a total of 1100 data elements since the beginning. Another way to look at this: The receiver always increases the credits with the number of data elements that are consumed from the buffer.
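The example above can be replayed with a short behavioral model of the receiver's bookkeeping. This is a sketch for illustration only (the class and method names are made up, and real protocol logic would of course be implemented in hardware); it shows that the announced credits value is simply the consumed count plus the buffer size.

```python
class CreditReceiver:
    """Sketch of the receiver's side of a credit-based link (illustrative
    model, not any specific protocol)."""
    def __init__(self, buffer_size):
        self.buffer_size = buffer_size
        self.received = 0            # total data elements that have arrived
        self.consumed = 0            # total elements read out by the application logic
        self.credits = buffer_size   # initial announcement: the whole buffer

    def on_data(self, n):
        self.received += n
        # The credit mechanism guarantees this never fails:
        assert self.received - self.consumed <= self.buffer_size

    def on_consume(self, n):
        self.consumed += n
        # The credits always equal the consumed count plus the buffer size:
        self.credits = self.consumed + self.buffer_size

# Replaying the example from the text: a buffer of 1000 data elements.
rx = CreditReceiver(1000)
rx.on_data(500)      # 500 data elements have arrived...
rx.on_consume(100)   # ...and the application logic consumed 100 of them
print(rx.credits)    # 1100: the buffer is exactly full after 1100 elements in total
```

The invariant in on_consume() is the "another way to look at this" from the text: every consumed data element increases the credits by one.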

This mechanism ensures that an overflow is prevented, because the transmitter never sends more data than is allowed by the receiver. There are however two small delicate issues with this method.

The first issue is that the receiver and transmitter must agree upon a starting point, where the number of transmitted data elements is zero. This requires some kind of initialization procedure, where both sides reset their counters. All protocols that use credits have some kind of initial start procedure. This complicates the protocol, as both sides need to be able to change from one state to another simultaneously.

The second issue is that the data flow should be able to continue forever. The credits increases all the time. How is it possible to send the credits as a number with a limited number of bits? The answer to this question is that it's enough to send the lower part of the binary representation of the credits (i.e. only the LSbs). The transmitter uses the same number of bits to represent the number of data elements it has transmitted.

This is enough because the transmitter uses the credits only for the purpose of calculating the number of data elements that are allowed for transmission. This number is calculated as the credits minus the number of data elements it has already transmitted. This results in a number that can't be larger than the size of the buffer at the receiver. If this number is smaller than 2^n, all bits above the n LSbs are zero. Hence it's pointless to calculate them. It's enough to make the subtraction with only n bits. Therefore, only the lower n bits of the binary representation of the credits are required in the flow control requests.
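This modulo arithmetic can be demonstrated with a few lines of Python (the numbers below are arbitrary, chosen only to show that the true counters can be much larger than what n bits can represent):

```python
N_BITS = 8                 # n: the number of LSbs that are actually exchanged
MASK = (1 << N_BITS) - 1   # keeps only the n LSbs

def allowed_to_send(credits_lsbs, transmitted_lsbs):
    # Subtraction modulo 2**n. The result is correct as long as the true
    # difference (credits minus transmitted) is smaller than 2**n, which is
    # guaranteed when the receiver's buffer holds fewer than 2**n elements.
    return (credits_lsbs - transmitted_lsbs) & MASK

# The true counters may be arbitrarily large; only the n LSbs are exchanged.
credits = 100_000_130
transmitted = 100_000_000
print(allowed_to_send(credits & MASK, transmitted & MASK))  # 130
```

The masked subtraction yields 130 data elements, exactly the same as subtracting the full counters, even though the counters themselves don't fit in 8 bits.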

For example, PCIe's flow control is based upon transmitting the credits as a binary word consisting of 8 bits or 12 bits, depending on which type of buffer the flow control protects.

Using credits has a lot of advantages: an overflow is prevented reliably, regardless of the channel's round-trip time; a lost credit update can only slow down the data flow, never cause an overflow; and the physical channel's bandwidth can be utilized efficiently.

Flow control with Interlaken

The Interlaken protocol is briefly introduced on a different page.

For the purpose of discussing flow control, note that this protocol assigns a channel number (usually between 0 and 255) to each burst of data. In other words, the protocol is based upon the concept of multiple streams of application data that share the physical channel. Accordingly, the flow control mechanisms control each stream of application data independently.

This protocol offers two types of flow control mechanisms: In-band flow control and out-of-band flow control (OOBFC). Both are XON/XOFF mechanisms. The status is conveyed by virtue of a single bit for each channel. This bit is '1' when this channel is ready to receive data (XON), and '0' otherwise (XOFF).

The difference between these two mechanisms is how these bits are transmitted: The in-band flow control mechanism relies on the fact that a control word is transmitted before and after each burst. This control word consists of 64 bits, out of which 16 bits are allocated for the purpose of flow control. This allows for up to 16 XON / XOFF requests in each control word. As each channel has its own XON / XOFF bit, this isn't enough to support up to 256 channels. To solve this, the flow control information is split across several control words. This method is called the calendar. Bit 56 in the control word is called "Reset Calendar". When this bit is '1', the control word contains the XON / XOFF requests for channels 0 to 15. The control word that follows contains the requests for channels 16 to 31, and so on. When all channels have been covered (possibly with only one control word), the sequence is restarted with the help of "Reset Calendar".
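The calendar's bookkeeping can be sketched as follows. This is a simplified behavioral model (class and method names are made up, and the exact bit positions inside the 64-bit control word are abstracted away); it only shows how the Reset Calendar bit determines which channels the 16 flow control bits refer to.

```python
NUM_CHANNELS = 256
BITS_PER_WORD = 16   # XON / XOFF bits carried by each control word

class CalendarTracker:
    """Sketch: tracks which channels the 16 XON/XOFF bits of each received
    Interlaken control word refer to (illustrative model only)."""
    def __init__(self):
        self.base = 0                       # first channel covered by the next word
        self.xon = [False] * NUM_CHANNELS   # True = XON, False = XOFF

    def on_control_word(self, reset_calendar, fc_bits):
        if reset_calendar:
            self.base = 0                   # restart: this word covers channels 0-15
        for i, bit in enumerate(fc_bits):
            ch = self.base + i
            if ch < NUM_CHANNELS:
                self.xon[ch] = bit
        self.base += BITS_PER_WORD          # next word covers the next 16 channels

tracker = CalendarTracker()
tracker.on_control_word(True,  [True] * 16)   # channels 0-15: XON
tracker.on_control_word(False, [False] * 16)  # channels 16-31: XOFF
print(tracker.xon[3], tracker.xon[20])        # True False
```
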

When a bit error is detected by virtue of the burst's CRC24, all channels enter the XOFF state. The reason is that the CRC24 also covers the control word, so the flow control part is ignored if an error is detected. It's therefore not safe to transmit on any channel after such an event occurs. Normal operation resumes gradually based upon the flow control requests in the control words that follow.

The main drawback of in-band flow control is that delivery of XON / XOFF requests depends on the data bursts that are transmitted in the same direction. If these bursts are long, or if no bursts are transmitted at all for a period of time, the delivery of XON / XOFF may take longer. This can be partly justified, as it's inevitable that data transport and flow control messages that are sent on the same physical channel compete with each other for bandwidth. However, other protocols usually prioritize flow control messages in a way that ensures a consistent maximal delay.

The out-of-band flow control (OOBFC) alternative avoids the competition with data bursts. With this method, the XON / XOFF requests are transmitted through three additional physical wires (FC_CLK, FC_DATA and FC_SYNC). The XON / XOFF bits for all channels are transmitted in one long frame.

The FC_SYNC signal is high along with the first bit of this frame. The frequency of FC_CLK is between 0 and 100 MHz and DDR clocking is allowed. A CRC consisting of 4 bits (CRC-4) is inserted after 64 XON / XOFF requests or after the last one. If an error is detected by virtue of a CRC, all channels enter the XOFF state.

The choice between in-band flow control and out-of-band flow control depends on the project's requirements, of course.

Interlaken doesn't include any mechanism for multiplexing data from channels. Hence, it is the application logic's duty to arbitrate between the channels' requests to transmit data, and select which channel gets access to transmit a burst (or a whole packet). The application logic is therefore also responsible for pausing transmission for a channel when this is required by an XOFF. The logic that implements the Interlaken protocol is only responsible for sending and receiving XON / XOFF requests.

The protocol mentions credits as a possibility to implement flow control, however only as an open invitation to implement this method by virtue of dedicated channels. There are no details in the protocol on how to implement this.

Flow control with Aurora

The Aurora protocol is briefly introduced on a different page. This protocol suggests two mechanisms for facilitating flow control: NFC and UFC. These two are described separately next.

First, Native Flow Control (NFC): With this mechanism, the application logic on the receiving side has an interface for sending flow control requests to the transmitter. These requests have two separate parts: An XOFF bit and a time-limited XOFF ("pause") part (both concepts are explained above). The protocol's logic on the transmitter side is responsible for obeying these requests: If the XOFF bit is '0', the transmitter pauses data transmission for the number of clock cycles that corresponds to the 8-bit number included in the flow control request. If this number is zero, data transmission resumes immediately. The numbers in the requests don't accumulate. Rather, each NFC request updates the pause's countdown with a new value.

If the XOFF bit is '1', the transmitter pauses data transmission indefinitely. The transmitter resumes data transmission only in response to a flow control request with XOFF = '0'. Such a request is processed as mentioned above.
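The transmitter's behavior that these two paragraphs describe can be sketched with a small model. This is an illustrative Python sketch (the class name and methods are made up; the actual logic is part of the Aurora core, not something the user implements):

```python
class NfcTransmitter:
    """Sketch of how the transmitter side reacts to NFC requests
    (simplified behavioral model, for illustration only)."""
    def __init__(self):
        self.xoff = False   # '1' means: pause indefinitely
        self.pause = 0      # remaining clock cycles of a time-limited pause

    def on_nfc_request(self, xoff_bit, pause_cycles):
        self.xoff = xoff_bit
        self.pause = pause_cycles   # requests don't accumulate: overwrite the countdown

    def may_send(self):
        # Called once per clock cycle by the transmit logic.
        if self.xoff:
            return False            # paused indefinitely, until a request with XOFF = '0'
        if self.pause > 0:
            self.pause -= 1
            return False            # still counting down a time-limited pause
        return True

tx = NfcTransmitter()
tx.on_nfc_request(False, 2)               # XOFF = '0', pause for 2 clock cycles
print([tx.may_send() for _ in range(4)])  # [False, False, True, True]
tx.on_nfc_request(True, 0)                # XOFF = '1': stop indefinitely
print(tx.may_send())                      # False
```
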

Note that this flow control mechanism controls all data traffic through the physical data channel. NFC is therefore not suitable for individual control of multiple channels, if such are implemented.

The second mechanism is User Flow Control (UFC): In practice, this is a separate channel for transmitting messages of up to 256 bytes to the other side. UFC messages have a higher priority than the transmission of data, so they reach the other side with low latency.

As the protocol doesn't define the format of these messages, they can be used to implement flow control of any type. Such implementation is made completely in the application logic. UFC messages can be used for transmitting any other kind of status information as well.

Note that none of the messages used in these two flow control mechanisms is protected against bit errors on the physical link. A flow control request can hence arrive incorrectly or not arrive at all, possibly leading to an overflow at the receiver.

As both flow control mechanisms are based upon the data link in the opposite direction, these are available only in full duplex mode.

Summary

On this page, mainly two techniques for flow control were presented: XON / XOFF and credits. For protocols related to computers (e.g. PCIe, SuperSpeed USB and SATA), the flow control is implemented by the protocol logic. On the other hand, for protocols intended for communication between FPGAs, the application logic is responsible for all or the larger part of the implementation. The only exception is Xillyp2p, which takes care of all aspects of the data communication, including flow control, error detection and retransmission.

The flow control related features of two protocols for FPGAs were presented: Interlaken and Aurora. As shown, these protocols leave the larger burden of implementing flow control to the application logic, even though the protocol's own capabilities may do some of the work in certain usage scenarios.

This wraps up the last page in this series about MGTs.

Copyright © 2021-2026. All rights reserved. (62b1b7b8)