External Memory Interface Handbook Volume 2: Design Guidelines: For UniPHY-based Device Families

ID 683385
Date 3/06/2023
Public

9.4.1.4.1. Stratix III

This topic details the timing margins, such as the read data and write data timing paths, which the Timing Analyzer calculates for Stratix III designs. Timing paths internal to the FPGA are either guaranteed by design and tested on silicon, or analyzed by the Timing Analyzer using corresponding timing constraints.

For design guidelines about implementing and analyzing your external memory interface using the PHY in Stratix III and Stratix IV devices, refer to the design tutorials on the List of designs using Intel FPGA External Memory IP page of the Intel® FPGA Wiki.

Timing margins for chip-to-chip data transfers can be defined as:

Margin = bit period – transmitter uncertainties – receiver requirements

where:

  • Sum of all transmitter uncertainties = transmitter channel-to-channel skew (TCCS).

    The timing difference between the fastest and slowest output edges on data signals, including tCO variation, clock skew, and jitter. The clock is included in the TCCS measurement and serves as the time reference.

  • Sum of all receiver requirements = receiver sampling window (SW) requirement.

    The period of time during which the data must be valid to capture it correctly. The setup and hold times determine the ideal strobe position within the sampling window.

  • Receiver skew margin (RSKM) = margin or slack at the receiver capture register.

For TCCS and SW specifications, refer to the DC and Switching Characteristics chapter of the Stratix III Device Handbook.

The following figure relates this terminology to a timing budget diagram.

Figure 60. Sample Timing Budget Diagram


The timing budget regions marked “½ × TCCS” represent the latest data valid time and earliest data invalid times for the data transmitter. The region marked sampling window is the time required by the receiver during which data must stay stable. This sampling window comprises the following:

  • Internal register setup and hold requirements
  • Skew on the data and clock nets within the receiver device
  • Jitter and uncertainty on the internal capture clock
Note: The sampling window is not the capture margin or slack, but instead the requirement from the receiver. The margin available is denoted as RSKM.

The simple example illustrated in the preceding figure does not consider any board level uncertainties, assumes a center-aligned capture clock at the middle of the receiver sampling window region, and assumes an evenly distributed TCCS with respect to the transmitter clock pin. In this example, the left end of the bit period corresponds to time t = 0, and the right end of the bit period corresponds to time t = TUI (where TUI stands for time unit interval). Therefore, the center-aligned capture clock at the receiver is best placed at time t = TUI/2.

Therefore:

Total margin = 2 × RSKM = TUI – TCCS – SW
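
As a quick numeric sketch of the balanced budget, the following Python function evaluates the relation above. The function name and the sample values (TUI = 1250 ps, TCCS = 400 ps, SW = 450 ps) are illustrative assumptions, not device specifications.

```python
def balanced_rskm(tui_ps: float, tccs_ps: float, sw_ps: float) -> float:
    """Return RSKM in ps for a center-aligned capture clock.

    Total margin = 2 x RSKM = TUI - TCCS - SW, so RSKM is half of it.
    """
    total_margin = tui_ps - tccs_ps - sw_ps
    return total_margin / 2


# Hypothetical values for illustration only (not from any data sheet).
rskm = balanced_rskm(tui_ps=1250, tccs_ps=400, sw_ps=450)
print(f"RSKM = {rskm} ps on each side of the capture clock")
```

A negative result would mean that the transmitter uncertainties and the receiver requirement together exceed the bit period.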

Consider the case where the clock is not center-aligned within the bit period (clock phase shift = P), and the transmitter uncertainties are unbalanced (TCCSLEAD and TCCSLAG). TCCSLEAD is the skew between the clock signal and the latest data valid signal. TCCSLAG is the skew between the clock signal and the earliest data invalid signal. In addition, the board-level skew across the data and clock traces is specified as tEXT. For this condition, compute independent setup and hold margins at the receiver (RSKMSETUP and RSKMHOLD). In this example, the sampling window requirement is split into a setup-side requirement (SWSETUP) and a hold-side requirement (SWHOLD). The following figure illustrates the timing budget for this condition. A similar timing budget is used for the Stratix III FPGA read and write data timing paths.

Figure 61. Sample Timing Budget with Unbalanced (TCCS and SW) Timing Parameters


Therefore:

Setup margin = RSKMSETUP = P – TCCSLEAD – SWSETUP – tEXT

Hold margin = RSKMHOLD = (TUI – P) – TCCSLAG – SWHOLD – tEXT
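
The two margin equations above can be sketched in Python as follows. The function name, the `Margins` type, and the sample numbers are hypothetical and chosen only to show the arithmetic; real values come from the device and memory data sheets.

```python
from typing import NamedTuple


class Margins(NamedTuple):
    setup_ps: float
    hold_ps: float


def unbalanced_rskm(tui_ps: float, phase_ps: float,
                    tccs_lead_ps: float, tccs_lag_ps: float,
                    sw_setup_ps: float, sw_hold_ps: float,
                    t_ext_ps: float) -> Margins:
    """Evaluate RSKMSETUP and RSKMHOLD for a statically phase-shifted clock."""
    # Setup side: time from the start of the bit period to the clock edge (P).
    setup = phase_ps - tccs_lead_ps - sw_setup_ps - t_ext_ps
    # Hold side: time from the clock edge to the end of the bit period.
    hold = (tui_ps - phase_ps) - tccs_lag_ps - sw_hold_ps - t_ext_ps
    return Margins(setup, hold)


# Hypothetical values for illustration only (all in ps).
m = unbalanced_rskm(tui_ps=1250, phase_ps=600,
                    tccs_lead_ps=200, tccs_lag_ps=250,
                    sw_setup_ps=150, sw_hold_ps=200, t_ext_ps=20)
print(f"setup margin = {m.setup_ps} ps, hold margin = {m.hold_ps} ps")
```

Note that tEXT is subtracted on both sides, so the board skew reduces the total margin by 2 × tEXT.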

The timing budget illustrated in the first figure with balanced timing parameters applies for calibrated paths where the clock is dynamically center-aligned within the data valid window. The timing budget illustrated in the second figure with unbalanced timing parameters applies for circuits that employ a static phase shift using a DLL or PLL to place the clock within the data valid window.

Read Capture

Memory devices provide edge-aligned DQ and DQS outputs to the FPGA during read operations. Stratix III FPGAs center-align the DQS strobe using static DLL-based delays. Stratix III devices use a source synchronous circuit for data capture.

When applying this methodology to read data timing, the memory device is the transmitter and the FPGA device is the receiver.

The transmitter channel-to-channel skew on outputs from the memory device is available from the corresponding device data sheet. Let us examine the TCCS parameters for a DDR2 SDRAM component.

For DQS-based capture:

  • The time between DQS strobe and latest data valid is defined as tDQSQ
  • The time between earliest data invalid and next strobe is defined as tQHS
  • Based on earlier definitions, TCCSLEAD = tDQSQ and TCCSLAG = tQHS

The sampling window at the receiver, the FPGA, includes several timing parameters:

  • Capture register micro setup and micro hold time requirements
  • DQS clock uncertainties because of DLL phase shift error and phase jitter
  • Clock skew across the DQS bus feeding DQ capture registers
  • Data skew on DQ paths from pin to input register including package skew

For TCCS and SW specifications, refer to the DC and Switching Characteristics chapter of the Stratix III Device Handbook.

The following figure shows the timing budget for a read data timing path.

Figure 62. Timing Budget for Read Data Timing Path


The following table lists a read data timing analysis for a Stratix III –2 speed‑grade device interfacing with a 400-MHz DDR2 SDRAM component.

Table 83. Read Data Timing Analysis for Stratix III Device with a 400-MHz DDR2 SDRAM (1)

| Parameter | Specifications | Value (ps) | Description |
|---|---|---|---|
| Memory Specifications (1) | tHP | 1250 | Average half period as specified by the memory data sheet, tHP = 1/2 × tCK |
| | tDCD | 50 | Duty cycle distortion = 2% × tCK = 0.02 × 2500 ps |
| | tDQSQ | 200 | Skew between DQS and DQ from memory |
| | tQHS | 300 | Data hold skew factor as specified by memory |
| FPGA Specifications | tSW_SETUP | 181 | FPGA sampling window specifications for a given configuration (DLL mode, width, location, and so on) |
| | tSW_HOLD | 306 | FPGA sampling window specifications for a given configuration (DLL mode, width, location, and so on) |
| Board Specifications | tEXT | 20 | Maximum board trace variation allowed between any two signal traces (user-specified parameter) |
| Timing Calculations | tDVW | 660 | tHP – tDCD – tDQSQ – tQHS – 2 × tEXT |
| | tDQS_PHASE_DELAY | 500 | Ideal phase shift delay on DQS capture strobe = (DLL phase resolution × number of delay stages × tCK) / 360° = (36° × 2 stages × 2500 ps) / 360° = 500 ps |
| Results | Setup margin | 99 | RSKMSETUP = tDQS_PHASE_DELAY – tDQSQ – tSW_SETUP – tEXT |
| | Hold margin | 74 | RSKMHOLD = tHP – tDCD – tDQS_PHASE_DELAY – tQHS – tSW_HOLD – tEXT |

Notes to Table:

  1. This sample calculation uses memory timing parameters from a 72-bit wide 256-MB Micron MT9HTF3272AY-80E 400‑MHz DDR2 SDRAM DIMM.
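
The arithmetic in the read timing table can be reproduced with a short script. This is a sketch only: the variable and function names are made up for illustration, and the input values are the ones listed in the table. Evaluating the tDVW expression with those inputs gives 660 ps, which also equals the two margins plus the two sampling window requirements.

```python
# All times in picoseconds; inputs are the values listed in the read table.
T_CK = 2500         # clock period at 400 MHz
T_HP = T_CK // 2    # average half period
T_DCD = 50          # duty cycle distortion, 2% of tCK
T_DQSQ = 200        # TCCSLEAD for DQS-based capture
T_QHS = 300         # TCCSLAG for DQS-based capture
T_SW_SETUP = 181    # FPGA sampling window, setup side
T_SW_HOLD = 306     # FPGA sampling window, hold side
T_EXT = 20          # board trace skew


def dll_phase_delay(resolution_deg: float, stages: int, t_ck: float) -> float:
    """Ideal DQS delay: (phase resolution x delay stages x tCK) / 360 degrees."""
    return resolution_deg * stages * t_ck / 360


t_dqs_phase_delay = dll_phase_delay(36, 2, T_CK)
t_dvw = T_HP - T_DCD - T_DQSQ - T_QHS - 2 * T_EXT
setup_margin = t_dqs_phase_delay - T_DQSQ - T_SW_SETUP - T_EXT
hold_margin = (T_HP - T_DCD - t_dqs_phase_delay
               - T_QHS - T_SW_HOLD - T_EXT)

print(f"tDQS_PHASE_DELAY = {t_dqs_phase_delay} ps")
print(f"tDVW = {t_dvw} ps")
print(f"setup margin = {setup_margin} ps, hold margin = {hold_margin} ps")
```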

Write Capture

During write operations, the FPGA generates a DQS strobe and a center‑aligned DQ data bus using multiple PLL-driven clock outputs. The memory device receives these signals and captures them internally. The Stratix III family contains dedicated DDIO (double data rate I/O) blocks inside the IOEs.

For write operations, the FPGA device is the transmitter and the memory device is the receiver. The memory device’s data sheet specifies data setup and data hold time requirements based on the input slew rate on the DQ/DQS pins. These requirements make up the memory sampling window, and include all timing uncertainties internal to the memory.

Output skew across the DQ and DQS output pins on the FPGA makes up the TCCS specification. TCCS includes contributions from numerous internal FPGA circuits, including:

  • Location of the DQ and DQS output pins
  • Width of the DQ group
  • PLL clock uncertainties, including phase jitter between different output taps used to center-align DQS with respect to DQ
  • Clock skew across the DQ output pins, and between DQ and DQS output pins
  • Package skew on DQ and DQS output pins

Refer to the DC and Switching Characteristics chapter of the Stratix III Device Handbook for TCCS and SW specifications.

The following figure illustrates the timing budget for a write data timing path.

Figure 63. Timing Budget for Write Data Timing Path


The following table lists a write data timing analysis for a Stratix III –2 speed‑grade device interfacing with a DDR2 SDRAM component at 400 MHz. This timing analysis assumes the use of a differential DQS strobe with 2.0‑V/ns edge rates on DQS, and 1.0‑V/ns edge rate on DQ output pins. Consult your memory device’s data sheet for derated setup and hold requirements based on the DQ/DQS output edge rates from your FPGA.


Table 84. Write Data Timing Analysis for 400-MHz DDR2 SDRAM Stratix III Device (1)

| Parameter | Specifications | Value (ps) | Description |
|---|---|---|---|
| Memory Specifications (1) | tHP | 1250 | Average half period as specified by the memory data sheet |
| | tDSA | 250 | Memory setup requirement (derated for DQ/DQS edge rates and VREF reference voltage) |
| | tDHA | 250 | Memory hold requirement (derated for DQ/DQS edge rates and VREF reference voltage) |
| FPGA Specifications | TCCSLEAD | 229 | FPGA transmitter channel-to-channel skew for a given configuration (PLL setting, location, and width) |
| | TCCSLAG | 246 | FPGA transmitter channel-to-channel skew for a given configuration (PLL setting, location, and width) |
| Board Specifications | tEXT | 20 | Maximum board trace variation allowed between any two signal traces (user-specified parameter) |
| Timing Calculations | tOUTPUT_CLOCK_OFFSET | 625 | Output clock phase offset between DQ and DQS output clocks = 90°. tOUTPUT_CLOCK_OFFSET = (output clock phase DQ and DQS offset × tCK) / 360° = (90° × 2500 ps) / 360° = 625 ps |
| | TX_DVWLEAD | 396 | Transmitter data valid window = tOUTPUT_CLOCK_OFFSET – TCCSLEAD |
| | TX_DVWLAG | 379 | Transmitter data valid window = tHP – tOUTPUT_CLOCK_OFFSET – TCCSLAG |
| Results | Setup margin | 126 | TX_DVWLEAD – tEXT – tDSA |
| | Hold margin | 109 | TX_DVWLAG – tEXT – tDHA |

Notes to Table:

  1. This sample calculation uses memory timing parameters from a 72-bit wide 256-MB Micron MT9HTF3272AY-80E 400-MHz DDR2 SDRAM DIMM.
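
The write timing calculation can be checked the same way. The script below is a sketch with illustrative variable and function names; the input values are those listed in the write timing table.

```python
# All times in picoseconds; inputs are the values listed in the write table.
T_CK = 2500        # clock period at 400 MHz
T_HP = T_CK // 2   # average half period
T_DSA = 250        # derated memory setup requirement
T_DHA = 250        # derated memory hold requirement
TCCS_LEAD = 229    # FPGA transmitter skew, leading side
TCCS_LAG = 246     # FPGA transmitter skew, lagging side
T_EXT = 20         # board trace skew


def phase_to_time(phase_deg: float, t_ck: float) -> float:
    """Convert a clock phase offset in degrees to a time offset."""
    return phase_deg * t_ck / 360


t_output_clock_offset = phase_to_time(90, T_CK)   # DQ-to-DQS clock offset
tx_dvw_lead = t_output_clock_offset - TCCS_LEAD
tx_dvw_lag = T_HP - t_output_clock_offset - TCCS_LAG
setup_margin = tx_dvw_lead - T_EXT - T_DSA
hold_margin = tx_dvw_lag - T_EXT - T_DHA

print(f"tOUTPUT_CLOCK_OFFSET = {t_output_clock_offset} ps")
print(f"TX_DVWLEAD = {tx_dvw_lead} ps, TX_DVWLAG = {tx_dvw_lag} ps")
print(f"setup margin = {setup_margin} ps, hold margin = {hold_margin} ps")
```

To evaluate a different edge-rate derating, replace `T_DSA` and `T_DHA` with the derated values from your memory data sheet.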