
You don't need to dig into the FPGA vendor's library of primitives in order to get a RAM. The tools are often smart enough to figure out what you want and will infer a RAM if you write your code the correct way.

The code below is a basic example of how to write VHDL code which will be interpreted by Vivado, Quartus, Diamond, etc. as a RAM.

-------------------------------------------------------------------------------
--
-- Copyright (c) 2019 CadHut
-- All rights reserved.
--
-- Redistribution and use in source and binary forms is permitted.
--
-------------------------------------------------------------------------------
-- Project Name  : CadHut Training
-- Author(s)     : Iain Waugh
-- File Name     : ram_dp_sync.vhd
--
-- Infer a basic synchronous dual-port RAM
--
-------------------------------------------------------------------------------

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity ram_dp_sync is
  generic (
    G_DATA_WIDTH : integer := 36;       -- Input / Output data width
    G_LOG2_DEPTH : integer := 9         -- log2( Memory Depth )
    );
  port(
    clk : in std_logic;

    i_addr_a : in std_logic_vector(G_LOG2_DEPTH - 1 downto 0);
    i_wr_en  : in std_logic;
    i_data_a : in std_logic_vector(G_DATA_WIDTH - 1 downto 0);

    i_addr_b : in  std_logic_vector(G_LOG2_DEPTH - 1 downto 0);
    o_data_b : out std_logic_vector(G_DATA_WIDTH - 1 downto 0)
    );
end ram_dp_sync;

architecture ram_dp_sync_rtl of ram_dp_sync is

  type t_ram is array (natural range <>) of std_logic_vector(G_DATA_WIDTH-1 downto 0);
  signal ram : t_ram(0 to 2**G_LOG2_DEPTH - 1);

begin  -- ram_dp_sync_rtl

  u_ram : process (clk)
  begin
    if(rising_edge(clk)) then
      o_data_b <= ram(to_integer(unsigned(i_addr_b)));

      if(i_wr_en = '1') then
        ram(to_integer(unsigned(i_addr_a))) <= i_data_a;
      end if;
    end if;
  end process u_ram;

end ram_dp_sync_rtl;

What's going on here? The RAM itself is a 2D array, which is G_DATA_WIDTH bits wide and 2^G_LOG2_DEPTH words deep. The memory is held in a signal called ram.

Whenever you want to write to it, you put the data value on i_data_a and the address on i_addr_a, then hold the i_wr_en write enable high for 1 clock cycle. The data is stored on the rising edge of clk.

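Reading is even easier; you put the address on the i_addr_b address port and the data appears at the o_data_b output after the next rising edge of clk. If you read an address in the same cycle that you write to it, the process above returns the old contents in simulation, because the assignment to the ram signal only takes effect once the process has suspended.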
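
The write and read sequence described above can be checked with a small testbench. This is only a sketch: the entity name tb_ram_dp_sync, the 8-bit by 16-word sizing, the 10 ns clock and the test value are all assumptions made for illustration, and it expects the basic ram_dp_sync listing above to be compiled into the work library.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical testbench (not part of the original article)
entity tb_ram_dp_sync is
end tb_ram_dp_sync;

architecture sim of tb_ram_dp_sync is

  -- Assumed sizes and clock period, chosen to keep the example small
  constant C_DATA_WIDTH : integer := 8;
  constant C_LOG2_DEPTH : integer := 4;
  constant C_CLK_PERIOD : time    := 10 ns;

  signal clk  : std_logic := '0';
  signal done : boolean   := false;

  signal i_addr_a : std_logic_vector(C_LOG2_DEPTH - 1 downto 0) := (others => '0');
  signal i_wr_en  : std_logic := '0';
  signal i_data_a : std_logic_vector(C_DATA_WIDTH - 1 downto 0) := (others => '0');
  signal i_addr_b : std_logic_vector(C_LOG2_DEPTH - 1 downto 0) := (others => '0');
  signal o_data_b : std_logic_vector(C_DATA_WIDTH - 1 downto 0);

begin

  -- Free-running clock that stops once the test has finished
  p_clk : process
  begin
    while not done loop
      clk <= '0';
      wait for C_CLK_PERIOD / 2;
      clk <= '1';
      wait for C_CLK_PERIOD / 2;
    end loop;
    wait;
  end process p_clk;

  dut : entity work.ram_dp_sync
    generic map (
      G_DATA_WIDTH => C_DATA_WIDTH,
      G_LOG2_DEPTH => C_LOG2_DEPTH)
    port map (
      clk      => clk,
      i_addr_a => i_addr_a,
      i_wr_en  => i_wr_en,
      i_data_a => i_data_a,
      i_addr_b => i_addr_b,
      o_data_b => o_data_b);

  p_stim : process
  begin
    -- Write 0xA5 to address 3: enable held high across one rising edge
    wait until falling_edge(clk);
    i_addr_a <= std_logic_vector(to_unsigned(3, C_LOG2_DEPTH));
    i_data_a <= x"A5";
    i_wr_en  <= '1';
    wait until falling_edge(clk);
    i_wr_en  <= '0';

    -- Read address 3: the data is expected after the next rising edge
    i_addr_b <= std_logic_vector(to_unsigned(3, C_LOG2_DEPTH));
    wait until falling_edge(clk);
    assert o_data_b = x"A5"
      report "Read-back mismatch at address 3" severity failure;

    report "Test finished" severity note;
    done <= true;
    wait;
  end process p_stim;

end sim;
```

Inputs are changed on the falling edge so that they are stable around each rising edge. If you run the same testbench against a version of the RAM with an extra output register, the read data arrives one cycle later and the check needs one more clock of delay.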

The basic RAM works, but it could be better. The code above does not let you decide whether you want to use distributed RAM (made from LUTs) or dedicated FPGA memory resources. It also gives you no control over whether the RAM's read data passes through an additional output register, so you will probably need to add extra flip-flops yourself if you want to run at a decent clock speed (> 200 MHz, say).

A more practical version of the code is shown below.

-------------------------------------------------------------------------------
--
-- Copyright (c) 2019 CadHut
-- All rights reserved.
--
-- Redistribution and use in source and binary forms is permitted.
--
-------------------------------------------------------------------------------
-- Project Name  : CadHut Training
-- Author(s)     : Iain Waugh
-- File Name     : ram_dp_sync.vhd
--
-- Infer a basic synchronous dual-port RAM with an optional registered output
-- and a choice of how the RAM is implemented
--
-------------------------------------------------------------------------------

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity ram_dp_sync is
  generic (
    G_DATA_WIDTH : integer := 36;       -- Input / Output data width
    G_LOG2_DEPTH : integer := 9;        -- log2( Memory Depth )

    G_REGISTER_OUT : boolean := true;   -- Add an extra register stage on o_data_b

    -- RAM styles:
    -- Xilinx: "block" or "distributed"
    -- Intel/Altera: "logic", "M512", "M4K", "M9K", "M20K", "M144K", "MLAB", or "M-RAM"
    -- Lattice: "registers", "distributed" or "block_ram"
    G_RAM_STYLE : string := "block"
    );
  port(
    clk : in std_logic;

    i_addr_a : in std_logic_vector(G_LOG2_DEPTH - 1 downto 0);
    i_wr_en  : in std_logic;
    i_data_a : in std_logic_vector(G_DATA_WIDTH - 1 downto 0);

    i_addr_b : in  std_logic_vector(G_LOG2_DEPTH - 1 downto 0);
    o_data_b : out std_logic_vector(G_DATA_WIDTH - 1 downto 0)
    );
end ram_dp_sync;

architecture ram_dp_sync_rtl of ram_dp_sync is

  type t_ram is array (natural range <>) of std_logic_vector(G_DATA_WIDTH-1 downto 0);
  signal ram : t_ram(0 to 2**G_LOG2_DEPTH - 1);

  -- Xilinx
  attribute RAM_STYLE        : string;
  attribute RAM_STYLE of ram : signal is G_RAM_STYLE;

  -- Intel/Altera, Lattice
  attribute SYN_RAMSTYLE        : string;
  attribute SYN_RAMSTYLE of ram : signal is G_RAM_STYLE;

  signal data_b : std_logic_vector(G_DATA_WIDTH - 1 downto 0);

begin  -- ram_dp_sync_rtl

  u_ram : process (clk)
  begin
    if(rising_edge(clk)) then
      data_b <= ram(to_integer(unsigned(i_addr_b)));

      if(i_wr_en = '1') then
        ram(to_integer(unsigned(i_addr_a))) <= i_data_a;
      end if;
    end if;
  end process u_ram;

  -- Either register the outputs or pass them straight through.
  -- Logic runs faster when registered, but there's a 1-cycle penalty.
  out_not_registered : if G_REGISTER_OUT = false generate
    o_data_b <= data_b;
  end generate out_not_registered;

  out_registered : if G_REGISTER_OUT = true generate
    u_reg_out : process (clk)
    begin
      if(rising_edge(clk)) then
        o_data_b <= data_b;
      end if;
    end process u_reg_out;
  end generate out_registered;

end ram_dp_sync_rtl;
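
The main enhancement here is that we apply a special attribute to the ram signal. The attribute's name differs between vendors because it is not standardised, but there's no harm in specifying both types. The value of G_RAM_STYLE still has to be one that your target tool recognises, as listed in the comments above the generic.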

If you want your design to run at a decent speed, it's almost certain that you will want to turn on the output registers by setting G_REGISTER_OUT to true. The cost is one extra clock cycle of read latency.
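
As an illustration of how the generics might be set, here is a sketch of a wrapper that instantiates the RAM above as a small 64 x 16-bit memory with registered outputs. The coeff_store entity, its port names and the chosen sizes are hypothetical. The "distributed" value is taken from the Xilinx and Lattice entries in the comments above; other tools need a value from their own list.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper around ram_dp_sync, for illustration only
entity coeff_store is
  port (
    clk       : in  std_logic;
    i_wr_addr : in  std_logic_vector(5 downto 0);
    i_wr_en   : in  std_logic;
    i_wr_data : in  std_logic_vector(15 downto 0);
    i_rd_addr : in  std_logic_vector(5 downto 0);
    o_rd_data : out std_logic_vector(15 downto 0)
    );
end coeff_store;

architecture rtl of coeff_store is
begin

  u_coeff_ram : entity work.ram_dp_sync
    generic map (
      G_DATA_WIDTH   => 16,             -- 16-bit words
      G_LOG2_DEPTH   => 6,              -- 2**6 = 64 words
      G_REGISTER_OUT => true,           -- Extra output register stage
      G_RAM_STYLE    => "distributed")  -- Ask for LUT-based RAM
    port map (
      clk      => clk,
      i_addr_a => i_wr_addr,
      i_wr_en  => i_wr_en,
      i_data_a => i_wr_data,
      i_addr_b => i_rd_addr,
      o_data_b => o_rd_data);

end rtl;
```

With G_REGISTER_OUT set to true, read data follows the address by two rising clock edges: one for the RAM read itself and one for the output register. The attribute is a request to the synthesis tool, so it is worth checking the resource report to see what was actually built.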


If you’ve designed for FPGA targets before, you’re probably familiar with using the GUI to point to your source code, set project parameters, synthesise and place & route your design.

What do you do if you want to tell someone exactly how you got from A to B? Do you remember every build option you selected for a file you made a week ago? What happens if someone asks you to repeat what you did, but on their machine? The “repeatability” problem sits between the keyboard and the chair: the human.

This presentation covers how you set up an FPGA project so that you can type “make” from the command-line and end up with a bitfile that you can upload to the FPGA.

AutomatingBuildFlow.pdf [0.18 MB]

Once you’ve automated your build process, you can point someone to the project’s makefile and know they’ll be getting exactly the same build parameters that you have. You can also start looking at more advanced ways of building your projects, like setting up a continuous integration system.


Imagine you’ve bought an FPGA development board. After a while, you’re bored with the demo functions and you want to start your own design from scratch. You don’t want your board to power up and have half-lit LEDs or annoying buzzers squealing, so you want to set up the output signals correctly. This article walks through one possible set of steps that you can take to get there.

NewDesignTopDown.pdf [0.19 MB]
