



2065-29

#### Advanced Training Course on FPGA Design and VHDL for Hardware Simulation and Synthesis

26 October - 20 November, 2009

**Clock Domains - Multiple FPGA Design** 

Alexander Kluge

PH ESE FE Division CERN

385, rte Meyrin CH-1211 Geneva 23

Switzerland

# Clock domains – multiple FPGA design

#### Clock distribution: multiple FPGAs

Figures (not reproduced here) cover the following cases:

- Clock distribution: t<sub>co</sub> & t<sub>s</sub>, board 0 -> 1
- Clock distribution: t<sub>co</sub> & t<sub>s</sub>, board 1 -> 0
- Clock distribution: slow output, board 0 -> 1
- Clock distribution: fast output, board 0 -> 1
- Clock distribution: fast output, board 1 -> 0
- Clock distribution: slow output, board 1 -> 0

#### Constraints

- Fulfilling FPGA internal constraints is not sufficient.
- Perform system simulations
- Logic can be too fast
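
One way to read these points is as a board-level timing budget. For a transfer between two FPGAs, the setup check uses the slowest clock-to-output time (t<sub>co</sub>) and board delay, while the hold check uses the fastest ones, so an output that is too fast can fail even though each FPGA meets its internal constraints. The sketch below is only an illustration: the function name `check_transfer`, the skew convention and all numbers are assumptions, not values from the course.

```python
def check_transfer(t_clk, t_co_min, t_co_max, t_pd_min, t_pd_max, t_su, t_h, skew):
    """Check one synchronous chip-to-chip transfer (all times in ns).

    skew is the receiver clock arrival time minus the sender clock
    arrival time (assumed convention). Returns (setup_slack, hold_slack);
    a negative slack means the corresponding check fails.
    """
    # Setup: the slowest data must arrive t_su before the next receiving edge.
    setup_slack = (t_clk + skew) - (t_co_max + t_pd_max + t_su)
    # Hold: the fastest data must stay stable until t_h after the same edge.
    hold_slack = (t_co_min + t_pd_min) - (skew + t_h)
    return setup_slack, hold_slack


# Hypothetical numbers: 10 MHz clock (100 ns period), aligned clocks.
print(check_transfer(100, 1, 8, 1, 2, 3, 1, skew=0))  # (87, 1)
# Same link, but the receiver clock arrives 4 ns late: hold fails.
print(check_transfer(100, 1, 8, 1, 2, 3, 1, skew=4))  # (91, -3)
```

In the second call the setup slack improves while the hold slack becomes negative, which is the case where lowering the clock frequency does not help.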



- Data (20 bits) every 100 ns
- collision -> L0 (1 µs)
- collision -> L2y or L2n (100 µs)




#### Options:

- Data pipeline until L2 with FIFO based on shift registers @ 10 MHz: 20 bits \* 100 µs / 100 ns = 20 bits \* 1000 = 20 000 bits
- Data pipeline with FIFO based on dual port RAM @ 10 MHz: 20 bits \* 1000 = 20 000 bits



FPGAs have RAM cells in addition to logic blocks
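
The storage estimate can be reproduced with a few lines of arithmetic. The sketch below recomputes the 20 000 bits from the word width, trigger latency and data period given above, and then estimates how many RAM blocks that would occupy. The helper names and the 4096-bit block size are assumptions for illustration; real block RAMs come in fixed width and depth configurations, so the actual count depends on the device.

```python
def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def fifo_size(width_bits, latency_ns, period_ns):
    """Return (depth in words, total bits) needed to bridge a latency."""
    depth = ceil_div(latency_ns, period_ns)
    return depth, depth * width_bits


def ram_blocks(total_bits, block_bits=4096):
    """Blocks needed if bits could be packed freely (assumed block size)."""
    return ceil_div(total_bits, block_bits)


# 20-bit words every 100 ns, stored until the L2 decision after 100 us.
depth, bits = fifo_size(20, 100_000, 100)
print(depth, bits)        # 1000 20000
print(ram_blocks(bits))   # 5

# The same data stored only until the L0 decision after 1 us.
print(fifo_size(20, 1_000, 100))  # (10, 200)
```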




```
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity fifo_fastor is
  generic (
    fifo_depth     : integer;
    fifo_ptr_width : integer;
    fifo_width     : integer
  );
  port (
    reset_i  : in  std_logic;
    clk      : in  std_logic;
    write    : in  std_logic;
    read     : in  std_logic;
    data_in  : in  std_logic_vector(fifo_width-1 downto 0);
    data_out : out std_logic_vector(fifo_width-1 downto 0);
    delay    : in  unsigned(fifo_ptr_width-1 downto 0);
    enable   : in  std_logic
  );
end fifo_fastor;

architecture behavioral of fifo_fastor is
  type mem_array is array (integer range <>) of std_logic_vector(fifo_width-1 downto 0);
  signal mem : mem_array(0 to (fifo_depth-1)); -- synthesis syn_ramstyle = "BLOCK_RAM"
  attribute syn_ramstyle : string;
  attribute syn_ramstyle of mem : signal is "BLOCK_RAM";
  signal read_pointer  : unsigned(fifo_ptr_width-1 downto 0);
  signal write_pointer : unsigned(fifo_ptr_width-1 downto 0);
begin
  process (clk, reset_i)
  begin
    -- Memory write: clear the addressed word while write is low,
    -- otherwise store data_in when enabled.
    if (clk'event and clk = '1') then
      if (write = '0') then
        mem(to_integer(write_pointer)) <= (others => '0');
      elsif (enable = '1') then
        mem(to_integer(write_pointer)) <= data_in;
      end if;
    end if;
    -- Registered memory read.
    if (clk'event and clk = '1') then
      if (enable = '1') then
        data_out <= mem(to_integer(read_pointer));
      end if;
    end if;
    -- Write pointer: synchronous, active-low reset to zero.
    if (clk'event and clk = '1') then
      if (reset_i = '0') then
        write_pointer <= (others => '0');
      elsif (write = '1' and enable = '1') then
        write_pointer <= write_pointer + 1;
      end if;
    end if;
    -- Read pointer: reset loads the programmable offset "delay".
    if (clk'event and clk = '1') then
      if (reset_i = '0') then
        read_pointer <= delay;
      elsif (read = '1' and enable = '1') then
        read_pointer <= read_pointer + 1;
      end if;
    end if;
  end process;
end behavioral;
```
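
To see how the `delay` input of `fifo_fastor` translates into latency, the listing can be mirrored in a small cycle-based model. This is a sketch under stated assumptions, not part of the original design: the class name `FifoFastorModel` and the helper `measure_latency` are hypothetical, the memory is initialised to 0 (in a VHDL simulation it would start as 'U'), and the depth is assumed to equal 2 to the power of the pointer width.

```python
class FifoFastorModel:
    """Cycle-based model of the fifo_fastor process (illustrative only)."""

    def __init__(self, depth=16, ptr_width=4):
        assert depth == 1 << ptr_width  # assumption: pointers wrap at depth
        self.mask = depth - 1
        self.mem = [0] * depth
        self.read_pointer = 0
        self.write_pointer = 0
        self.data_out = 0

    def clock(self, reset_i, write, read, data_in, delay, enable):
        """Apply one rising clock edge; all updates use the pre-edge values."""
        rp, wp = self.read_pointer, self.write_pointer
        read_value = self.mem[rp]          # sampled before the write below
        if write == 0:
            self.mem[wp] = 0
        elif enable:
            self.mem[wp] = data_in
        if enable:
            self.data_out = read_value
        if reset_i == 0:                   # active-low synchronous reset
            self.write_pointer = 0
            self.read_pointer = delay & self.mask
        else:
            if write and enable:
                self.write_pointer = (wp + 1) & self.mask
            if read and enable:
                self.read_pointer = (rp + 1) & self.mask
        return self.data_out


def measure_latency(delay, depth=16, ptr_width=4):
    """Clock edges between sampling a word and seeing it on data_out."""
    fifo = FifoFastorModel(depth, ptr_width)
    fifo.clock(reset_i=0, write=1, read=1, data_in=0, delay=delay, enable=1)
    for n in range(3 * depth):
        # Word n + 1 is written at edge n; look for word 1 at the output.
        if fifo.clock(1, 1, 1, data_in=n + 1, delay=delay, enable=1) == 1:
            return n
    return None


print(measure_latency(6))   # 10
print(measure_latency(15))  # 1
```

In this model a word appears on the output `depth - delay` clock edges after it was sampled, so with a depth of 16 a `delay` of 6 gives 10 cycles, which is 1 µs at 10 MHz.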

```
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity fastor_extractor is
  generic (
    fifo_depth     : integer := 16;
    fifo_ptr_width : integer := 4;
    fifo_width     : integer := 20
  );
  port (
    reset_i         : in  std_logic;
    clk             : in  std_logic;
    fastor0         : in  std_logic_vector(9 downto 0);
    fastor1         : in  std_logic_vector(9 downto 0);
    l0              : in  std_logic := '0';
    l2y             : in  std_logic := '0';
    l2n             : in  std_logic := '0';
    delay_l0        : in  unsigned(3 downto 0) := "1111";
    fastor_delayed0 : out std_logic_vector(9 downto 0);
    fastor_delayed1 : out std_logic_vector(9 downto 0);
    enable          : in  std_logic
  );
end fastor_extractor;

architecture behavioral of fastor_extractor is
  component fifo_fastor is
    generic (
      fifo_depth     : integer;
      fifo_ptr_width : integer;
      fifo_width     : integer
    );
    port (
      reset_i  : in  std_logic;
      clk      : in  std_logic;
      write    : in  std_logic;
      read     : in  std_logic;
      data_in  : in  std_logic_vector(fifo_width-1 downto 0);
      data_out : out std_logic_vector(fifo_width-1 downto 0);
      delay    : in  unsigned(fifo_ptr_width-1 downto 0);
      enable   : in  std_logic
    );
  end component;

  signal fastor    : std_logic_vector(fifo_width-1 downto 0);
  signal fastor_l0 : std_logic_vector(fifo_width-1 downto 0);
  signal fastor_l2 : std_logic_vector(fifo_width-1 downto 0);
  signal l2yn      : std_logic;
begin
  -- Merge the two 10-bit inputs into one 20-bit word.
  fastor(19 downto 10) <= fastor1;
  fastor(9 downto 0)   <= fastor0;
  l2yn                 <= l2y or l2n;
  fastor_delayed1      <= fastor_l2(19 downto 10);
  fastor_delayed0      <= fastor_l2(9 downto 0);

  -- L0 stage: written and read every clock cycle; delay_l0 sets the read pointer offset.
  fifo_fastor_l0 : fifo_fastor
    generic map (fifo_depth, fifo_ptr_width, fifo_width)
    port map (
      reset_i  => reset_i,
      clk      => clk,
      write    => '1',
      read     => '1',
      data_in  => fastor,
      data_out => fastor_l0,
      delay    => delay_l0,
      enable   => enable
    );

  -- L2 stage: written on l0, read on l2y or l2n.
  fifo_fastor_l2 : fifo_fastor
    generic map (4, 2, 20)
    port map (
      reset_i  => reset_i,
      clk      => clk,
      write    => l0,
      read     => l2yn,
      data_in  => fastor_l0,
      data_out => fastor_l2,
      delay    => (others => '0'),
      enable   => enable
    );
end behavioral;
```



[Simulation waveforms, not reproduced here. Signals shown: clk, enable, fastor, l0, l2yn, fastor\_l0 with its write\_pointer and read\_pointer, and fastor\_l2; time axis from 365,000 ns to 370,000 ns.]









# System level simulation




- 60 ASICs: simplified behavioral
- 40 ASICs: full behavioral
- 5 FPGA: full behavioral
- 7 SRAMs: full behavioral
- 4 PCBs

#### What happens if we have speed problems?

- Often because of an inadequate logic architecture or coding style
  - evaluate the logic architecture
  - rewrite the HDL code to adapt the structure for better data throughput
  - insert pipeline stages; one more clock cycle of latency often does not matter
  - understand the specifications
  - look for systematics which can help to simplify the logic
  - adapt architecture and schematics/code
  - only then optimize placing & routing
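
The effect of inserting a pipeline stage can be estimated before touching place and route. The sketch below assumes a simple register-to-register model in which the clock period is limited by t<sub>co</sub> plus the logic delay plus the setup time of the slowest stage; the function name `f_max_mhz` and all delay values are hypothetical.

```python
def f_max_mhz(t_co, stage_delays, t_su):
    """Maximum clock frequency in MHz for a chain of pipeline stages.

    t_co, t_su and the combinational delay of each stage are in ns.
    The slowest register-to-register path sets the clock period.
    """
    worst_path = max(t_co + delay + t_su for delay in stage_delays)
    return 1000.0 / worst_path


# One 18 ns block of logic between two registers.
print(f_max_mhz(1, [18], 1))               # 50.0
# The same logic split into 8 ns and 10 ns by one extra register:
# one more clock cycle of latency, but a shorter critical path.
print(round(f_max_mhz(1, [8, 10], 1), 1))  # 83.3
```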

#### What happens if we have speed problems?

- Often because the component is too small and the routing is congested
  - timing constraints
  - routing constraints, placement constraints
  - use a bigger/faster component

#### Conclusion

- FPGA applications at CERN
  - data selection/trigger (muon track finder trigger)
  - data processing (pixel detector)
- Design cycle
- Defining Specifications
- Clock domains
- Data delay

#### Additional slides

Alexander.kluge@cern.ch

http://akluge.web.cern.ch/akluge