This article shows how to synchronize the arrival of data at a DAC driven by a Zynq UltraScale+ MPSoC FPGA. The FPGA transmits data samples to a Texas Instruments DAC3283 over an LVDS interface in DDR mode. LVDS is a widely used signaling standard, valued for its low power consumption, high noise immunity, and suitability for high-speed links. The synchronization challenge is interesting because DDR mode imposes stringent timing requirements, and closing timing on the DAC interface at high frequencies demands precision.
This article explores two options for synchronizing the interface: applying delays to data signals and adjusting the phase between data and clock.
For this article, a driver was written in VHDL and loaded into the FPGA’s programmable logic. This driver reads samples from an internal BRAM and encodes them so that the DAC can process the input samples and produce the desired analog output signal.
The interface between the FPGA and DAC is a parallel bus that has 8 data signals, 2 control signals and a forwarded clock. The clock is used by the DAC to sample the signals.
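As a rough illustration of what "encoding" means on a byte-wide DDR bus, the following Python sketch splits 16-bit I/Q sample pairs into the bytes presented on each DATACLK edge. The byte order (I MSB, I LSB, Q MSB, Q LSB), the assignment of bytes to rising and falling edges, and the frame signal being high during the first clock cycle of each pair are assumptions made for this sketch; check them against the DAC3283 datasheet before relying on them. The function name is hypothetical and is not part of the authors' VHDL driver.

```python
def interleave_ddr(i_samples, q_samples):
    """Return one (rising_byte, falling_byte, frame) tuple per DATACLK cycle.

    ASSUMPTIONS (not taken from the article): byte order is
    I[15:8], I[7:0], Q[15:8], Q[7:0]; the MSB byte goes out on the rising
    edge; frame is high only during the cycle that carries the I sample.
    """
    cycles = []
    for i, q in zip(i_samples, q_samples):
        i &= 0xFFFF  # keep 16 bits
        q &= 0xFFFF
        cycles.append((i >> 8, i & 0xFF, 1))  # I sample, frame asserted
        cycles.append((q >> 8, q & 0xFF, 0))  # Q sample, frame deasserted
    return cycles
```

For example, `interleave_ddr([0x1234], [0xABCD])` yields two clock cycles: the first carries 0x12 and 0x34 with frame high, the second carries 0xAB and 0xCD with frame low.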
The next diagram shows an adaptation from figures 34 and 36 from the datasheet, where you can see timing specifications for all data and frame traces. The signal tx_enable is not present in this diagram because its synchronization is not critical for signal integrity.
From the PCB designs of the Alinx AXU4EV and the FMC-01, we know that the data and clock traces are almost equal in length; the maximum difference is around 0.6 mm (24 mils). This matters for signal propagation: since the traces have almost the same length, you can expect the signals to arrive at almost the same time. If the traces had different lengths, you would have to take the difference in propagation delay into account.
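To get a feel for how small this mismatch is, the sketch below estimates the skew caused by a trace length difference. The effective dielectric constant of 4.0 is an assumption typical of FR-4 inner layers, not a value taken from the board documentation, and the function name is hypothetical.

```python
C_MM_PER_PS = 0.299792458  # speed of light in vacuum, in mm per picosecond

def trace_skew_ps(length_diff_mm, eps_eff=4.0):
    """Estimate skew in ps for a trace length difference.

    eps_eff is the effective dielectric constant seen by the signal.
    4.0 is an ASSUMED FR-4-like value; use your stack-up's real figure.
    """
    velocity_mm_per_ps = C_MM_PER_PS / eps_eff ** 0.5
    return length_diff_mm / velocity_mm_per_ps

print(f"{trace_skew_ps(0.6):.1f} ps")
```

Under that assumption, 0.6 mm corresponds to roughly 4 ps, which is small compared with the setup and hold windows of a typical DDR LVDS interface.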
In the RTL we have 2 clock signals, one for the internal FSM that generates data, and the other is the one we send to the DAC IC (DATACLK). A simple way to follow the timing specifications is to control the delay between the clock edges and the data signals. This can be achieved by either delaying the data signals relative to the clock driving the DAC driver module or delaying the clock relative to the data signals.
Our first approach used a combination of ODELAYE3 and IDELAYCTRL primitives.
The ODELAYE3 primitive delays output data. It is built around a delay line with 512 taps and a maximum delay of XYZ ps, and it allows signals to be delayed on an individual basis. The IDELAYCTRL primitive must be instantiated whenever ODELAYE3 is used. This module requires a reference clock input that lets internal circuitry calibrate the delay taps so that the ODELAYE3 delay values are independent of PVT (process, voltage, and temperature).
In our case, we instantiate one ODELAYE3 for each output bit and a single IDELAYCTRL.
With this configuration, you can tune each output bit separately.
The IDELAYCTRL primitive has the following input ports:
Its only output (RDY) is connected to the control FSM of each ODELAYE3 instance. This FSM is detailed in UG571, chapter 2, section ODELAYE3, subsection “DELAY_VALUE Attribute”.
Each ODELAYE3 instance is connected as the next image shows.
All the components mentioned in this subsection are connected to the same clock and reset. For the clock frequency, check the user guide that corresponds to your FPGA; in our case the clock must be between 300 and 800 MHz.
Let’s discuss the ODELAYE3 parameters and connections that don’t appear in the previous image. Since we are not cascading ODELAYs, all CASC_* inputs are tied to logic 0. Also, because we load the delay through the CNTVALUEIN inputs, we can tie the CE pin low (‘0’) and the INC pin high.
As for the parameters:
To find out whether data is arriving correctly, you need to use the DAC’s internal pattern checker.
The pattern checker has 8 configurable registers (CONFIG9 to CONFIG16), in which you can leave the default patterns or customize them to suit your needs. The result of the check is stored in register CONFIG8, where a logic ‘1’ indicates an error in that bit. Each result bit is the OR of the errors detected on that bit across all eight pattern words, i.e. CONFIG8[0] is set if bit 0 failed the comparison against CONFIG9[0], CONFIG10[0], ..., or CONFIG16[0].
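The following Python sketch models how such a result byte can be formed: each received word is compared with its expected pattern, and the mismatches are ORed together bit by bit. It is a behavioral model written for this explanation, not the DAC's actual logic, and the pattern values are arbitrary placeholders rather than the device defaults.

```python
def iotest_result(expected_words, received_words):
    """Model of the CONFIG8 result: OR of per-word bit mismatches.

    A '1' in bit n means lane n failed for at least one pattern word.
    """
    result = 0
    for expected, received in zip(expected_words, received_words):
        result |= (expected ^ received) & 0xFF
    return result

# Placeholder patterns standing in for CONFIG9..CONFIG16 (NOT the defaults)
patterns = [0x55, 0xAA, 0x0F, 0xF0, 0x33, 0xCC, 0x00, 0xFF]

received = list(patterns)
received[2] ^= 0x08  # corrupt bit 3 of the third word
print(hex(iotest_result(patterns, received)))
```

A single corrupted bit on lane 3 is enough to set bit 3 of the result, which is why one reading of the register tells you which lanes need a different delay.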
Now, to enable the pattern checker logic and get the result, you need to write DAC internal registers in the following way:
If an error is detected, modify the delay of the failing bit and repeat the process. Keep repeating until the DAC’s CONFIG8 register reads 0x00.
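The loop just described can be sketched in Python as follows. The callbacks `read_config8` and `set_delay` are hypothetical placeholders for whatever register access and delay-loading mechanism your system provides; `read_config8` is assumed to rerun the pattern check before returning the result. The `SimulatedLink` class is a toy model used only to make the example runnable, and stepping the delay upward in fixed increments is one simple search strategy, not necessarily the one the authors used.

```python
def calibrate_lanes(read_config8, set_delay, n_lanes=8, max_tap=511, step=8):
    """Raise the delay of each failing lane until CONFIG8 reads 0x00."""
    taps = [0] * n_lanes
    for bit in range(n_lanes):
        set_delay(bit, 0)
    while True:
        errors = read_config8()
        if errors == 0x00:
            return taps
        for bit in range(n_lanes):
            if errors & (1 << bit):
                if taps[bit] + step > max_tap:
                    raise RuntimeError(f"lane {bit}: no passing delay found")
                taps[bit] += step
                set_delay(bit, taps[bit])


class SimulatedLink:
    """Toy hardware model: lane n passes once its tap count reaches
    min_passing_tap[n]. Real lanes have a passing window, not a threshold."""

    def __init__(self, min_passing_tap):
        self.min_passing_tap = min_passing_tap
        self.taps = [0] * len(min_passing_tap)

    def set_delay(self, bit, tap):
        self.taps[bit] = tap

    def read_config8(self):
        return sum(1 << bit
                   for bit, (tap, need) in enumerate(zip(self.taps, self.min_passing_tap))
                   if tap < need)


link = SimulatedLink([0, 16, 0, 40, 0, 0, 8, 0])
print(calibrate_lanes(link.read_config8, link.set_delay))
```

On real hardware it is usually safer to sweep the whole tap range, record where each lane starts and stops passing, and then pick the middle of that window instead of the first passing value.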
This whole process can be cumbersome, so we propose an alternative approach: changing the phase of DATACLK instead of using an ODELAYE3 to tune each output’s delay before the OBUFTDS primitive. This way, you tune all lanes at the same time. Changing the phase moves the DAC’s sampling instant on both clock edges, so you can place it where the data is stable. In your first trials, if a phase of 90° gives you 0xFF in CONFIG8, we recommend generating several bitstreams with different phases and trying all of them. If you can connect an oscilloscope or, better yet, a logic analyzer, you will get a very good estimate of where your new phase value should be.
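Once several phases have been tried, a reasonable way to choose the final value is to take the middle of the widest range of phases that read 0x00. The sketch below does this for a dictionary of measurements; the readings in the example are invented for illustration and are not results from the authors' board. For simplicity it does not treat 0° and 360° as adjacent.

```python
def best_phase(results):
    """Return the middle phase of the longest run of passing phases.

    results maps phase in degrees -> CONFIG8 reading; 0x00 means pass.
    Returns None if no phase passed. Wrap-around at 360 deg is ignored.
    """
    best_run, current_run = [], []
    for phase in sorted(results):
        if results[phase] == 0x00:
            current_run.append(phase)
            if len(current_run) > len(best_run):
                best_run = list(current_run)
        else:
            current_run = []
    if not best_run:
        return None
    return best_run[len(best_run) // 2]


# Invented example readings, one per generated bitstream
sweep = {0: 0xFF, 45: 0xFF, 90: 0x00, 135: 0x00,
         180: 0x00, 225: 0x0F, 270: 0xFF, 315: 0xFF}
print(best_phase(sweep))
```

Picking the center of the passing window, and not the first phase that happens to work, leaves margin on both sides for temperature and voltage drift.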
The diagram below shows the system we built:
Constraints used:
The first line creates a virtual clock on the output pin. The following four lines set the maximum and minimum delay relative to the rising and falling edges (the lines with -clock_fall) for the data[*] signals, which are DDR. In every case, the lines that define the min delay set the hold time, and the lines that define the max delay set the setup time. Since frame is not a DDR signal, -clock_fall is not needed for it and only the rising edge is constrained.
Why is the hold value negative? Because Vivado treats time before the clock edge as positive, a negative hold value means that the signal must remain stable after the clock edge. You can see this in the next image, taken from the ‘Output delays’ section of the “Constraints Wizard” in the implementation menu; it appears in the waveform tab when ‘Data Rate and Edge’ is set to ‘Dual’.
Where:
Since we estimate that the traces have the same length, we assume that trce_dly_min and trce_dly_max values are 0.
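To make the relationship between datasheet timing and constraint values concrete, the sketch below computes the max and min output delays the way the wizard does (max = trace delay max + setup time, min = trace delay min - hold time) and formats the matching set_output_delay lines. The setup and hold values, the clock name `dac_clk`, and the helper names are placeholders assumed for this example; take the real timing numbers from the DAC3283 datasheet. These generated lines are an illustration and are not the authors' actual constraints.

```python
def output_delays(tsu_ns, thd_ns, trce_dly_max_ns=0.0, trce_dly_min_ns=0.0):
    """Return (max_delay, min_delay) in ns for set_output_delay."""
    max_delay = trce_dly_max_ns + tsu_ns  # setup requirement
    min_delay = trce_dly_min_ns - thd_ns  # hold requirement (often negative)
    return max_delay, min_delay


def ddr_constraints(clock, ports, max_delay, min_delay):
    """Format rising- and falling-edge constraints for a DDR output bus."""
    target = f"[get_ports {{{ports}}}]"
    return [
        f"set_output_delay -clock {clock} -max {max_delay:.3f} {target}",
        f"set_output_delay -clock {clock} -min {min_delay:.3f} {target}",
        f"set_output_delay -clock {clock} -clock_fall -max {max_delay:.3f} {target} -add_delay",
        f"set_output_delay -clock {clock} -clock_fall -min {min_delay:.3f} {target} -add_delay",
    ]


# PLACEHOLDER timing values, not datasheet figures
max_d, min_d = output_delays(tsu_ns=0.35, thd_ns=0.35)
for line in ddr_constraints("dac_clk", "data[*]", max_d, min_d):
    print(line)
```

With both trace delays set to 0, the max value equals the setup time and the min value equals the negated hold time, which is where the negative number in the constraints comes from.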
If you are using an MMCM, you can also fine-tune the phase while the FPGA is running instead of having to generate a whole new bitstream. To do so, select “Dynamic Phase Shift” in the clocking options, then check the option “Use fine PS” on the desired clock. Note that when this option is checked, the selected phase returns to 0 degrees. To fine-tune after you have found the approximate phase, follow the next steps on the “MMCM Settings” tab:
This fine tuning is useful for debugging, because it avoids regenerating the bitstream for every change.
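As a rough guide to how many phase-shift increments a given adjustment needs, the sketch below converts a phase in degrees of the output clock into a number of fine phase-shift steps, using the 1/56 of a VCO period step size documented for the MMCM dynamic phase shift interface. The VCO and output frequencies in the example are assumptions chosen for illustration, not the values used in this design.

```python
def fine_ps_step_ps(f_vco_mhz):
    """Size of one fine phase-shift step in ps (1/56 of the VCO period)."""
    vco_period_ps = 1e6 / f_vco_mhz
    return vco_period_ps / 56


def fine_ps_steps(phase_deg, f_vco_mhz, f_out_mhz):
    """Number of steps closest to phase_deg of the output clock."""
    out_period_ps = 1e6 / f_out_mhz
    shift_ps = phase_deg / 360.0 * out_period_ps
    return round(shift_ps / fine_ps_step_ps(f_vco_mhz))


# ASSUMED example clocks: 1200 MHz VCO, 300 MHz DATACLK
print(f"{fine_ps_step_ps(1200):.2f} ps per step")
print(fine_ps_steps(10, 1200, 300), "steps for 10 degrees")
```

With these example frequencies each step is a little under 15 ps, so a 10° adjustment of a 300 MHz clock takes about 6 steps.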
If you need more information about this mode, you can check this link: https://adaptivesupport.amd.com/s/question/0D52E00006hpsbSSAQ/how-to-use-dynamic-phase-shift?language=en_US
Or the Clocking Wizard Product Guide PG065 document: https://docs.amd.com/r/en-US/pg065-clk-wiz
This article introduced two approaches for synchronizing a DAC with an FPGA: tuning the delay of each bit individually, or shifting the phase of the forwarded clock.
https://www.ti.com/product/DAC3283
https://docs.amd.com/r/en-US/ug974-vivado-ultrascale-libraries/ODELAYE3
https://docs.amd.com/r/en-US/ug974-vivado-ultrascale-libraries/IDELAYCTRL
https://docs.amd.com/v/u/en-US/ug571-ultrascale-selectio (version 1.16 english)
Written by Nicolas Bertolo & Adrian Evaraldo
If you have any comments or questions, please feel free to contact us: info@emtech.com.ar