01signal.com

Remote Update with Partial Reconfiguration on Vivado

This is the last post in a series of four on Partial Reconfiguration, or Dynamic Function eXchange (DFX), with Xilinx' Vivado. It's mainly aimed at those who want to use Partial Reconfiguration in a Remote Update scenario. This post is written assuming you've already read the previous three.

Introduction

Having discussed Partial Reconfiguration in general, and then Vivado's common flow for this purpose, this post sets the ground for using this technique for Remote Update of the FPGA's logic.

The main issue with this usage scenario is that the partial bitstream needs to be compatible with the initial bitstream that was implemented possibly years before. Hence the original Parent Implementation must be available when implementing the reconfigurable logic, or it must be regenerated to produce exactly the same result — that is, placed and routed exactly the same.

Keeping the entire Vivado project intact, to the level that Vivado either avoids rerunning the Parent Implementation or obtains exactly the same result if it does rerun it, may be possible, but it's surely a shaky way to go.

This can be solved, but in order to understand the suggested solutions, it's necessary to first be familiar with DCPs and OOCs. A brief introduction to both topics follows.

The Design Checkpoint (DCP)

Recall that an FPGA design is implemented in Vivado by virtue of several Design runs, typically named synth_1, impl_1 and also additional runs that are categorized as Out-of-Context (OOC) module runs.

Each such run consists of executing a Tcl script, which generates a temporary "in-memory project", loads design files, sets properties and attributes, and calls Tcl functions that implement synthesis, placing, routing, bitstream generation, and other operations.

This in-memory project creates no files on the disk, and has nothing to do with the Vivado project shown in the GUI. It's an object in memory that allows performing many sequential operations with Tcl commands.

As the implementation progresses, Design Checkpoint (DCP) files are written to the disk (by virtue of the write_checkpoint Tcl command). The content of a DCP file is a snapshot of the in-memory project. In other words, it's a database reflecting the design files that have been loaded and the operations that have been made on the project.

For example, the synth_1 synthesis run may load all HDL, constraint and IP files, and then call synth_design to synthesize the HDL files and create one big netlist from all these sources (including IPs). After that, it calls write_checkpoint to create a DCP file, which is the product of the synthesis. This concludes the synth_1 run.

In fact, the DCP generated by synth_1 is called a netlist DCP even though it usually also contains constraints from XDC files.

The implementation run that follows, typically impl_1, creates a new in-memory project and reads this netlist DCP (among others) as the starting point for the following operations. This run writes several DCP files, each being a snapshot of the project after some processing stage (e.g. optimize design, place, physically optimize, route etc.).

All runs, including synth_1, can (and often do) load DCPs into their in-memory project.
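To illustrate the DCP as a snapshot, here is a minimal non-project sketch that synthesizes a design, writes a netlist DCP and later resumes from it. It is not taken from a script generated by Vivado: The source file names and the DCP's name are placeholders chosen for this example.

```tcl
# Minimal non-project sketch. File names are placeholders.
create_project -in_memory -part xc7k325tffg900-2

# Load the sources into the in-memory project
read_verilog theproject.v
read_xdc theproject.xdc

# Synthesize, then save a snapshot of the in-memory project: the netlist DCP
synth_design -top theproject -part xc7k325tffg900-2
write_checkpoint -force theproject_synth.dcp

# Later, possibly in another Vivado session, resume from the snapshot
close_project
open_checkpoint theproject_synth.dcp
```

After open_checkpoint, the in-memory project is back in the state it had when write_checkpoint was called, so the implementation commands can be run on it.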

Out Of Context (OOC) Module Runs

After an IP's configuration has been set with the relevant GUI tool, Vivado generates source files (primarily HDL and constraint files) and performs a synthesis on these. This results in a netlist DCP, which is then loaded in the main project's synthesis and implementation runs. This reduces the overall implementation time, by eliminating the need to regenerate the sources of the IPs and synthesize them over and over again.

The Vivado Run that takes the raw products (an IP configuration, some HDL files, or whatever there is) and turns that into a netlist DCP, is called an Out-Of-Context run in Vivado's vocabulary. And nowhere else, for that matter. It's actually a rather poor choice of name.

This netlist DCP is loaded by the synthesis and implementation runs, typically by virtue of the read_ip Tcl command, which usually boils down to loading the DCP that has been generated by the OOC run.

Vivado also allows selecting an HDL module in the main project and requesting that it be synthesized as an OOC (by right-clicking the source in the Project Manager's source tree and picking "Set as Out-of-Context for Synthesis...").

The drawback of OOCs is that the synthesizer can perform certain optimizations across module boundaries when it gets the entire design as a single project. So the individual synthesis by virtue of OOCs has a potential performance hit.
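In non-project terms, an OOC run is roughly a synthesis of a single module in out-of-context mode, followed by writing a netlist DCP. The sketch below assumes that the reconfigurable module's source is a single file named pr_block.v (a placeholder). The scripts that Vivado generates for OOC runs contain considerably more than this.

```tcl
# Sketch of an OOC synthesis of one module. The source file name is a placeholder.
create_project -in_memory -part xc7k325tffg900-2
read_verilog pr_block.v

# In out_of_context mode, no I/O buffers are inserted on the module's ports,
# since these ports connect to other logic and not to the FPGA's pins
synth_design -mode out_of_context -top pr_block -part xc7k325tffg900-2

# The product of the run: a netlist DCP for this module alone
write_checkpoint -force pr_block.dcp
```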

OOCs and DCPs in the Partial Reconfiguration flow

When a source file has been set as the top-level for a reconfigurable module, Vivado creates an OOC run for the synthesis of this module and the modules it instantiates. This run generates a netlist DCP for use in the relevant implementation. This is true for the Parent as well as Child Implementations: The reconfigurable module is always represented with a separate DCP.

This netlist DCP is applied differently in Parent and Child Implementations, however: The netlist DCP that is assigned to the Parent Implementation is loaded by synth_1 as well as impl_1, just as an IP's netlist DCP is loaded in a regular implementation. Partial Reconfiguration influences this flow mostly through the placement constraints imposed by floorplanning.

The Child Implementation, on the other hand, has no dedicated synthesis phase. Rather, it mixes the reconfigurable module's netlist DCP with the final, routed, DCP from the Parent Implementation. More precisely, the routed DCP in use is the final DCP that has had its reconfigurable logic part removed. A bit like removing the middle part of the vegetable for making stuffed zucchini.

And now is the time to break this down into Tcl commands.

The nuts and bolts of Parent-Child implementations

The place to look for how the implementation works under the hood is in the implementation's directory, in the Tcl file having the name of the project (and a .tcl suffix). In particular, the implementation script for the child implementation is interesting. It also so happens that Chapter 3 ("Vivado Software Flow") in the related user guide, UG909, walks through that script, even if not saying so directly.

As just mentioned, there isn't very much special about the Parent's implementation, except that it relies on a DCP for the netlist of its reconfigurable module, and that floorplanning constraints apply. It's more or less like hierarchical design.

But then the Parent Implementation writes two bitstreams rather than one, by virtue of Tcl commands like this:

write_bitstream -force -no_partial_bitfile theproject.bit 
write_bitstream -force -cell pr_block_ins pr_block_ins_lpf_partial.bit

and then it creates the DCP for use by the Child Implementation:

update_design -cell pr_block_ins -black_box
lock_design -level routing
write_checkpoint -force theproject_postroute_physopt_bb.dcp

Recall that while this part is running, there is an in-memory project, which started with joining netlist DCPs, and went through place and route and all other optimizations. These three Tcl lines execute after writing the bitfile, so the in-memory project is at the really final stage.

Which is the right time to punch a hole and do the stuffed zucchini thing: The update_design command turns the reconfigurable module into a black box. In other words, all its logic is removed, giving room for other logic to come in.

Then the place and route of the design is locked by virtue of the lock_design command. After which the project's snapshot is written into theproject_postroute_physopt_bb.dcp. "bb" stands for "Black Box", of course.

The relevant part in the Child Implementation script goes:

create_project -in_memory -part xc7k325tffg900-2
set_property design_mode GateLvl [current_fileset]
add_files -quiet .../impl_1/theproject_postroute_physopt_bb.dcp
add_files -quiet .../bpf_synth_1/pr_block.dcp
set_property SCOPED_TO_CELLS pr_block_ins [get_files .../bpf_synth_1/pr_block.dcp]
link_design -top theproject -part xc7k325tffg900-2 -reconfig_partitions pr_block_ins
opt_design 
write_checkpoint -force theproject_opt.dcp
[ ... ]

and from this point it goes on to place and route etc.

Note that the implementation script above consumed just two sources, both DCPs: theproject_postroute_physopt_bb.dcp, which is the static logic, placed, routed and locked, and pr_block.dcp, which is the reconfigurable module's netlist (i.e. after only synthesis). As the implementation continues, the fact that the former DCP was locked ensures that nothing of the static logic moves, but the reconfigurable module is placed and routed as usual based upon the netlist DCP.

As a side note, if there are XCI IPs belonging to the reconfigurable module, a line like the following is added for each, between the two add_files commands above:

read_ip -quiet .../theproject.srcs/sources_1/ip/blkmem/blkmem.xci

This isn't special to Partial Reconfiguration, though — it's done this way in any implementation.

Just before writing the two bitstreams, the Child Implementation verifies that the routed design it has obtained is compatible with the static logic (as placed and routed) of the Parent Implementation.

The Tcl command for this is something like:

pr_verify -full_check -initial /path/to/impl_1/theproject_postroute_physopt.dcp -additional /path/to/child_1_impl_1/theproject_routed.dcp -file child_1_impl_1_pr_verify.log

Note that this compares two DCP files, regardless of the in-memory project. The Parent Implementation's routed DCP is compared with the final DCP of the Child Implementation. The output of this comparison goes to a file named *_pr_verify.log.

This comparison guarantees that the partial bitstream is compatible in the sense that it can be loaded when the parent's bitstream already is. It goes through the static partition's logic resources as well as the routing that is used.

pr_verify fails if there's an incompatibility, which stops Vivado's script before the bitstreams are generated. It's important to keep this step in mind when writing custom implementation scripts.

There's no reason this verification should ever fail, but if it does, the bitstream generation fails with a lot of errors like "ERROR: [Constraints 18-891] HDPRVerify-08: design check point .../impl_1/theproject_postroute_physopt.dcp places instance ... at site SLICE_X118Y125, yet design check point .../impl_2/theproject_routed.dcp does not. Both check point must have the same static placement result".

These error messages will probably reach the limit of 100, and then get silenced.
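In a custom implementation script, the same protection can be obtained by making the bitstream generation conditional on the verification. The following is a sketch only: It assumes that pr_verify raises a Tcl error on a mismatch (as described above), that the Child Implementation's routed design is the one currently open in memory, and that the paths and the output file name are replaced with real ones.

```tcl
# Paths are placeholders. Adapt them to the actual run directories.
set parent_dcp "/path/to/impl_1/theproject_postroute_physopt.dcp"
set child_dcp  "/path/to/child_1_impl_1/theproject_routed.dcp"

if { [catch {
    pr_verify -full_check -initial $parent_dcp -additional $child_dcp \
        -file child_1_impl_1_pr_verify.log
} errmsg] } {
    # Stop here, so that no bitstream is written for an incompatible design
    error "pr_verify failed, partial bitstream not generated: $errmsg"
}

# Reached only if the static logic is the same in both DCPs
write_bitstream -force -cell pr_block_ins pr_block_ins_partial.bit
```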

Solution for the Remote Update scenario

Recall from above that the challenge is that the output of the original Parent Implementation must be available when implementing the reconfigurable logic, as a Child Implementation.

The brute-force solution is to make a copy of the entire Vivado project directory, along with any files it might depend on. When the need to generate a new partial bitstream arises, restore all files, force the Parent Implementation as up-to-date, and create a bitstream for the Child Implementation only. This flow is technically OK, but is likely to be annoying in the long run. If you choose this way, be sure to manually run pr_verify against the original Parent Implementation's DCP file, because this flow will not detect an unintentional change in the Parent Implementation.

There are two other alternatives, which are based upon the knowledge of how the Parent and Child Implementations interact, namely through two DCP files, as explained above. The obvious advantage of these is that you know what you're doing.

The principle behind both alternatives is to ensure that the partial bitstream relies upon the two DCP files that were generated along with the initial bitstream, which I shall refer to as the Golden DCPs.

The first alternative is to implement the partial bitstream with a non-project Tcl script. Essentially, it means running the script that Vivado creates for the Child Implementation, but modifying it to use the Golden DCPs. More precisely, modify the arguments of the add_files command that loads the black-box DCP and of the pr_verify command, so that they rely on the Golden DCPs.

The main drawback of this alternative is that running a non-project Tcl script doesn't integrate well with Vivado's GUI, so it becomes considerably more difficult to process messages, open the implemented design for review etc.
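A stripped-down sketch of such a non-project script is given below. It follows the Child Implementation script shown earlier, with the Parent Implementation's DCPs replaced by the Golden DCPs. The directory paths and the names of the output files are placeholders, and steps that Vivado's generated script may include (physical optimization, reports etc.) are left out.

```tcl
# Sketch of a non-project Child Implementation relying on the Golden DCPs.
# All paths and output file names are placeholders.
set goldendir "/path/to/golden"
set rm_dcp    "/path/to/pr_block.dcp" ;# Netlist DCP of the updated reconfigurable module
set part      xc7k325tffg900-2

create_project -in_memory -part $part
set_property design_mode GateLvl [current_fileset]

# Static logic: placed, routed and locked, with a black box for the module
add_files -quiet $goldendir/theproject_postroute_physopt_bb.dcp
# Reconfigurable logic: the netlist only
add_files -quiet $rm_dcp
set_property SCOPED_TO_CELLS pr_block_ins [get_files $rm_dcp]
link_design -top theproject -part $part -reconfig_partitions pr_block_ins

opt_design
place_design
route_design
write_checkpoint -force child_routed.dcp

# Verify against the Golden routed DCP, not whatever is in the project now
pr_verify -full_check \
    -initial $goldendir/theproject_postroute_physopt.dcp \
    -additional child_routed.dcp \
    -file child_pr_verify.log

write_bitstream -force -cell pr_block_ins pr_block_ins_partial.bit
```

Such a script can be run with "vivado -mode batch -source" followed by the script's file name. If any of the commands fails, the script stops at that point, so the bitstream isn't written unless all steps before it were successful.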

The Golden DCPs hack

Ideally, it would have been possible to automatically modify the script that is generated by Vivado for the Child Implementation run, so it would relate to the Golden DCPs. It doesn't seem like there's a robust way to do that.

However, Vivado allows defining Tcl scripts for execution before and after certain stages in the implementation. These scripts are executed from the implementation run's script, so they can't be used to manipulate the latter script.

Nevertheless, this opens the door to a somewhat ugly hack, which is the second alternative: To overwrite the DCPs of the Parent Implementation with the Golden DCPs. By doing so, the Child Implementation runs normally, but relies on the Golden DCPs, regardless of what the Parent Implementation happened to generate.

The advantage of this hack is that the regular development flow is maintained: Changes are made to the reconfigurable logic, Vivado reruns the OOC for synthesizing the reconfigurable module, and then the Child Implementation for generating the partial bitstream. Since no changes are made in the static logic, Vivado has no reason to launch its related runs.

The Tcl script that implements this Golden DCP copy hack is:

if { [catch {
    set parentimpldir "[ file normalize "../impl_1"]"
    set goldendir "[ file normalize "/path/to/golden"]"

    file copy -force "[file normalize "$goldendir/theproject_postroute_physopt_bb.dcp"]" "$parentimpldir/"
    file copy -force "[file normalize "$goldendir/theproject_postroute_physopt.dcp"]" "$parentimpldir/"
} errmsg ] } {
    send_msg_id golden-reconfig-1 error "Failed to copy golden parent reconfiguration file(s): $errmsg"
    return -code error
}

This script assumes that the Parent Implementation is kept in the "impl_1" directory adjacent to the Child Implementation directory (it most likely is) and that the two Golden DCPs are stored in a directory as assigned in line 3 of this script.

Right-click the child run (say, child_0_impl_1) in the Design Runs tab, pick "Change Run Settings..." and in the dialog that opens, set tcl.pre for Design Initialization (init_design) to the script. Or, in Tcl, assuming the script was saved as golden_pr.tcl:

add_files -fileset utils_1 -norecurse /path/to/golden_pr.tcl
set_property STEPS.INIT_DESIGN.TCL.PRE [ get_files /path/to/golden_pr.tcl -of_objects [get_filesets utils_1] ] [get_runs child_0_impl_1]

The important thing to keep in mind when using this script is that the Parent Implementation's information and outputs are rubbish. Vivado may open its Implemented Design GUI and show its reports, but all this may very well be completely unrelated. So this is an opening to some confusion.

A second possible annoyance is that Vivado may run the Parent Implementation every now and then in response to a change in global settings, or even the constraints file. This can be worked around by right-clicking the relevant row in the Design Runs tab, and choosing "Force Up-to-Date". This menu item appears only when the run is in the Out-of-Date state, i.e. completed, but Vivado deems a refresh is needed. The related Tcl command is e.g.

set_property needs_refresh false [get_runs synth_1]

What to save along with the Golden DCPs

Clearly, the Golden DCPs must be kept in a safe place to maintain the capability to generate compatible bitstream files in the future. Along with these, it's actually a good idea to compress the entire Vivado project into a .tar.gz / .zip file for a quick resumption. Alternatively, or in addition to that, a project archive can be generated with File > Project > Archive... or something like

archive_project /path/to/theproject.xpr.zip -force -include_local_ip_cache -include_config_settings

Note however that both the project file as well as others in a Vivado project may contain absolute paths to files, so deploying the project in another directory, or on another computer, may not work as expected. This is true for the archive as well.

The Vivado version in use is written in all possible report files and in the .xpr project file, but writing it down won't hurt. Possibly retain a working copy of that version of Vivado, even though there's no apparent reason why a software upgrade would matter. It's just that mixing the Golden DCPs from one Vivado version with the reconfigurable logic's netlist DCP from another doesn't seem to be on the official list of supported features. Even though it will probably work OK.
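Collecting the Golden DCPs and noting the Vivado version can be done with a few Tcl commands in Vivado's Tcl console, right after the Parent Implementation has completed. This is just a sketch: The two directory paths and the name of the version file are placeholders.

```tcl
# Run in Vivado after the Parent Implementation has completed.
# Both directory paths are placeholders.
set parentimpldir [file normalize "/path/to/theproject.runs/impl_1"]
set goldendir     [file normalize "/path/to/golden"]

file mkdir $goldendir

# Copy the two DCPs that the Child Implementation depends on
foreach dcp {theproject_postroute_physopt_bb.dcp theproject_postroute_physopt.dcp} {
    file copy -force [file join $parentimpldir $dcp] $goldendir
}

# Record the version of the running Vivado next to the DCPs
set fh [open [file join $goldendir vivado_version.txt] w]
puts $fh [version -short]
close $fh
```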

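The minimal set of files is: The two Golden DCPs, the initial bitstream (along with its clearing bitstream, if one is used), and the sources of the Parent Implementation.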

Note that a clearing bitstream, which is used on UltraScale devices (only), relates to the logic already in the FPGA. That's why the clearing bitstream that relates to the initial bitstream in use must be saved. In principle, both could be outputs of some other Child Implementation: The initial bitstream doesn't have to be the Parent's.

As for saving the sources of the Parent Implementation, it's important because it determines the connections between the static and reconfigurable logic. For example, if the sources are edited, and a port is added to the reconfigurable logic and is referred to in its instantiation, the set of partition pins changes. Hence the netlist of the reconfigurable logic will have external pins that don't appear in the original static design.

When such mismatch occurs, the child implementation fails with an error message reading something like "ERROR: [Netlist 29-77] Could not replace (cell 'pr_block_bb', library 'work_pr_block_ins_pr_block_ins_4', file 'NOFILE') with (cell 'pr_block', library 'work', file 'pr_block.edf') because of a port interface mismatch; in strict mode, no extra ports are allowed. 8 ports are missing on the original cell. 5 of the missing ports are: 'thingy[7]' 'thingy[6]' 'thingy[5]' 'thingy[1]' 'thingy[0]'".

What actually fails in this case is the link_design command (see above), which glues the static and reconfigurable logic together.

The simple and obvious way to avoid this problem is to change neither the static part of the design nor the port list of the reconfigurable logic.

Having said that, it's actually OK to make changes in the static part of the project: It's just the instantiation of the reconfigurable module that must remain the same. As long as the implementation goes through smoothly, including the verification at the end, there is no problem.
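When in doubt about whether the module's interface has changed, it may help to list the pins of the reconfigurable cell as they appear in the Golden black-box DCP, and compare them with the current port list. The sketch below assumes that the black-box cell retains its pins in that DCP and that the path is replaced with the real one. I haven't seen this method documented anywhere, so take it as a debugging aid only.

```tcl
# Sketch: print the pin names of the reconfigurable cell, as found in the
# Golden black-box DCP. The path is a placeholder.
open_checkpoint /path/to/golden/theproject_postroute_physopt_bb.dcp

set pins [get_pins -of_objects [get_cells pr_block_ins]]

# REF_PIN_NAME is the pin's name without the cell's hierarchy prefix
foreach name [lsort [get_property REF_PIN_NAME $pins]] {
    puts $name
}

close_design
```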

Summary

Even though there isn't a fully smooth solution for the Remote Update usage of Partial Reconfiguration, there are a few available strategies to achieve this goal nevertheless.

The important point is that in principle, the two Golden DCPs are all that is needed to build and verify a partial bitstream for an existing project.

And regardless of how hacky the strategies presented here might seem, keep in mind that the verification carried out by pr_verify is a comprehensive check for the compatibility between the partial bitstream and the static logic already in place. As long as the correct Golden DCP is used for this verification, and the test passes, there is nothing else to worry about. Except for the Child Implementation meeting timing, but that goes without saying.

Copyright © 2021-2022. All rights reserved. (42e6e8c4)