fpga | 易学教程

How to check time performances in a C++ program on Zedboard

Question: I have implemented a C++ program on a Zedboard. It compiles and runs perfectly, but now I would like to measure its performance in order to optimize some functions. I have checked some threads here (Testing the performance of a C++ app) and here (Timer function to provide time in nano seconds using C++), but I don't really understand how to apply them to my code ... To make things clear: I'm not good at C++. I have never really learned the language formally but only used it several times with ...
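The kind of timing the question asks about can be sketched with `std::chrono`. This is a minimal sketch, assuming the program runs under Linux on the Zedboard's ARM cores and is built with a C++11 (or later) toolchain. `work_under_test` and `average_ns` are hypothetical names introduced only for illustration; they stand in for the function to be profiled and for a small timing helper.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for the function being profiled.
std::uint64_t work_under_test(std::uint64_t n) {
    std::uint64_t acc = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        acc += i * i;
    }
    return acc;
}

// Calls `fn` `runs` times and returns the average wall-clock time
// per call in nanoseconds.
template <typename Fn>
double average_ns(Fn&& fn, int runs) {
    assert(runs > 0);
    // steady_clock is monotonic, so it is not affected by system time changes.
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < runs; ++i) {
        fn();
    }
    const auto stop = clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / runs;
}
```

A call such as `average_ns([] { volatile std::uint64_t r = work_under_test(100000); (void)r; }, 100)` gives a rough per-call figure. Storing the result in a `volatile` variable is one way to keep the optimizer from removing the work entirely. Averaging over many runs smooths out scheduler noise but does not replace a profiler.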

Error adding std_logic_vectors

Question: I want a simple module that adds two std_logic_vectors. However, the code below, which uses the + operator, does not synthesize.

    library IEEE;
    use IEEE.std_logic_1164.all;
    use IEEE.std_logic_arith.all;

    entity add_module is
      port(
        pr_in1 : in  std_logic_vector(31 downto 0);
        pr_in2 : in  std_logic_vector(31 downto 0);
        pr_out : out std_logic_vector(31 downto 0)
      );
    end add_module;

    architecture Behavior of add_module is
    begin
      pr_out <= pr_in1 + pr_in2;
    end architecture Behavior;

The error ...

Using a continuous assignment in a Verilog procedure?

Question: Is it possible and/or useful to ever use a continuous assignment in a Verilog procedure? For example, would there ever be any reason to put an assign inside an always block? For example, this code:

    always @(*) begin
      assign data_in = Data;
    end

Furthermore, would it be possible to generate sequential logic with this approach?

    always @(posedge clk) begin
      assign data_in = Data;
    end

Answer 1: It is called procedural continuous assignment. It is the use of an assign or force (and their corresponding ...

FPGA Serial Port Communication

    `timescale 1ns / 1ps
    //////////////////////////////////////////////////////////////////////////////////
    // Engineer: ian
    //
    // Create Date: 2019/11/30 14:51:50
    // Design Name: uart
    // Module Name: uart_tx
    // Project Name:
    // Target Devices: zynq 7020
    // Tool Versions: vivado 2018.3
    // Description: This is a uart transmit module
    //
    // Dependencies:
    //
    // Revision: v1.0
    //////////////////////////////////////////////////////////////////////////////////
    //**********************************************************************************
    //input port: sys_clk //system clock
    // sys_rst_n /

FPGA Study Diary 2

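1. Quartus Prime vs. Quartus II: Quartus Prime is the newer software, developed on the basis of the earlier Quartus II after Altera was acquired by Intel.
2. FD-SOI technology: FD-SOI, or fully depleted silicon-on-insulator, is a planar process technology that reduces silicon geometry while simplifying the manufacturing process.
3. Stratix V series: Stratix V FPGAs use a new memory architecture that lowers latency and efficiently delivers the best system performance in the FPGA industry. Stratix V FPGAs give network equipment manufacturers memory interface solutions that support fast, efficient transfer of video, voice, and data over the Internet.
4. Block Diagram/Schematic file: a schematic file.
5. EDIF file: a netlist file.
6. Qsys system file: used to design soft-core systems. Qsys is the successor to SOPC Builder and is typically used to build systems around Nios II soft processors.
7. State Machine file: a state machine file.
8. SystemVerilog file: used for system-level verification.
9. Tcl script file: a Tcl script file.
10. Probe file: used to observe a signal inside the FPGA; SignalTap is generally used for this.
11. VWF file: used to invoke QSim, the simulation tool bundled with Quartus.
12. Schematic Symbol file: used to edit schematic symbols, much like drawing a schematic library in circuit design software.
13 ...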

Neural Network Accelerator Design

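FPGA-Based Deep Learning Accelerator Optimization (Part 1). Topics: neural network acceleration; neural networks; FPGA-based neural network accelerator design; high speed; high energy efficiency.

Neural network acceleration: Compared with traditional machine learning algorithms, neural network algorithms have significant advantages in many respects, and different network models, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), play an increasingly important role in image, video, and speech applications. However, the compute and storage requirements of neural network models are very large, as shown in the table below: It is therefore very important to choose a suitable computing platform for neural-network-based applications. A typical CPU platform delivers roughly 10-100 GFLOP/s, with energy efficiency usually below 1 GOP/J, so a CPU can satisfy neither the high-performance demands of cloud applications nor the low-power demands of mobile devices. GPU platforms can reach a peak of 10 TOP/s, and frameworks such as Caffe and TensorFlow provide interfaces for GPU acceleration, which makes GPUs well suited to high-performance applications; their energy efficiency, however, is relatively low. FPGA platforms offer high energy efficiency relative to CPUs and GPUs and are also quite flexible; compared with an ASIC implementation, they cost less and have a shorter development cycle. FPGA-based neural network acceleration nevertheless faces several challenges:
(1) Compared with GPUs, FPGA platforms usually have limited storage, I/O bandwidth, and compute resources.
(2) Most current FPGA platforms run at 100-300 MHz, far below the clock frequencies of CPU or GPU platforms.
(3) Deploying neural network algorithms on an FPGA is much more difficult than on a CPU or GPU platform ...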

In VHDL … how to count leading zeros of vector?

Question: I'm working on a VHDL project and I'm having trouble calculating the length of a vector. I know there is a length attribute for vectors, but this is not the length I'm looking for. For example, I have E : std_logic_vector(7 downto 0); then E <= "00011010"; so len = E'length = 8, but that is not what I want. I want to calculate len after discarding the leftmost zeros, so len = 5. I know that I can use a for loop, checking for '0' bits from left to right and stopping when a '1' bit occurs.
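The left-to-right scan described in the question can be checked against a small software reference model before writing the VHDL. The sketch below is in C++, not VHDL, and is only a behavioural model; `significant_bits` and `leading_zeros` are hypothetical helper names, and passing the vector as an unsigned integer plus a width is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Returns the number of bits that remain after discarding leading zeros
// from the low `width` bits of `value`. Bits above `width` are ignored.
int significant_bits(std::uint32_t value, int width) {
    assert(width >= 1 && width <= 32);
    // Scan from the most significant bit down and stop at the first '1'.
    for (int i = width - 1; i >= 0; --i) {
        if ((value >> i) & 1u) {
            return i + 1;
        }
    }
    return 0;  // all bits are zero
}

// Number of leading zeros in the low `width` bits of `value`.
int leading_zeros(std::uint32_t value, int width) {
    return width - significant_bits(value, width);
}
```

For the example in the question, E = "00011010" is 0x1A, so `significant_bits(0x1A, 8)` returns 5 and `leading_zeros(0x1A, 8)` returns 3.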

FIFO implementation - VHDL

Question: I have come across one more difficulty while instantiating the FIFO code in my top module. I want to store a set of data, say "WELCOME TO THE WORLD OF FPGA", received from my serial port (receiving subsystem), and then retrieve it when a button on the FPGA board is pressed or when the FIFO is full. I have my FIFO code and serial communication code written. The idea is: data sent from keyboard -> receiving subsystem -> FIFO -> transmitting subsystem -> HyperTerminal. At present I am using a FIFO that is 8 bits wide and ...

modified baugh-wooley algorithm multiply verilog code does not multiply correctly

Question: The following Verilog source code and testbench work nicely across commercial simulators, iverilog, and a formal verification tool (yosys-smtbmc). Please hold any complaints about `ifdef FORMAL until later; I need it for yosys-smtbmc, which does not support the bind command yet. I am now debugging the generate code, since the multiplication (using the modified Baugh-Wooley algorithm) does not work yet. When o_valid is asserted, the multiplier should give o_p = i_a * i_b = 3*2 = 6 ...
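When debugging a multiplier like this, a bit-level software model of the partial-product array can help show which term goes wrong. The sketch below is a C++ reference model of the modified Baugh-Wooley scheme for signed operands, not the poster's Verilog. The 8-bit operand width `N` and the name `baugh_wooley_mul` are assumptions for illustration; the excerpt does not state the width of i_a and i_b.

```cpp
#include <cassert>
#include <cstdint>

// Assumed operand width; the original post does not state it.
constexpr int N = 8;

// Bit-level model of a modified Baugh-Wooley signed multiplier.
// Returns the 2N-bit two's-complement product of a and b.
std::int16_t baugh_wooley_mul(std::int8_t a, std::int8_t b) {
    const std::uint32_t ua = static_cast<std::uint8_t>(a);
    const std::uint32_t ub = static_cast<std::uint8_t>(b);
    std::uint32_t sum = 0;
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < N; ++j) {
            std::uint32_t pp = ((ua >> i) & 1u) & ((ub >> j) & 1u);
            // Partial products involving exactly one sign bit are inverted.
            const bool one_sign_bit = (i == N - 1) != (j == N - 1);
            if (one_sign_bit) {
                pp ^= 1u;
            }
            sum += pp << (i + j);
        }
    }
    // Correction constants: a '1' added at bit N and at bit 2N-1.
    sum += (1u << N) + (1u << (2 * N - 1));
    // Keep the low 2N bits and reinterpret them as a signed value
    // (modular conversion on common compilers, guaranteed from C++20).
    return static_cast<std::int16_t>(static_cast<std::uint16_t>(sum & 0xFFFFu));
}
```

With this model, `baugh_wooley_mul(3, 2)` yields 6, matching the expected o_p in the question. Comparing each row of partial products from the model against the generate-block signals in simulation is one way to locate a misplaced inversion or correction bit.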