How to deduce from synthesis report

Submitted by 风流意气都作罢 on 2019-12-04 09:48:24

Since you're using Xilinx, I presume you also have access to PlanAhead? Try "Analyze Timing / Floorplan Design (PlanAhead)" (under "Implement Design" -> "Place & Route").

PlanAhead should open and show your timing results in a pane at the bottom. Pick the critical path (the one with the least slack), right-click it and choose "Schematic", which brings up a graphical view of the primitives involved. You can then right-click the primitives and choose "Expand Cone" -> "To Flops" to see the surrounding components too.

This should help you get a much better idea of what signals are involved. Try tracing the input and output signals to your VHDL code, and focus on that path for optimization.

No good answer is possible from this information alone; we can only guess what source code produced this hardware.

But it is clear that you need to examine the source, make a hypothesis why it is slow, take action to correct the problem, and test the solution.

And repeat until fast enough.

My guess, given your hint that there is a case statement to decode the opcodes, is that one of the arms is something like:

when <some expression involving decode>  =>
   address <= <some address calculation>;

The problem is that the two expressions are often inter-related, so both the decode and the address calculation are evaluated in the same cycle and their delays add up on one combinational path. An example solution would be to precompute the address expression (i.e. in the previous cycle) into a register, and rewrite the case arm as:

when <some expression involving decode>  =>
   address <= register;

If you guessed right, the result will be slightly faster and you have another (similar) bottleneck to fix. Repeat until fast enough...
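To make the precompute idea concrete, here is a minimal sketch. It is only an illustration under assumptions: the entity name, the `base` and `offset` inputs, the 8-bit widths and the opcode value are all hypothetical, since the original source is not shown. It also assumes the address operands are already valid one cycle before the decode needs them; if they are not, the extra register changes the timing of the design and the surrounding logic has to be adjusted.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: all names, widths and the opcode value are
-- assumptions made for illustration, not taken from the original design.
entity addr_precompute is
  port (
    clk     : in  std_logic;
    opcode  : in  std_logic_vector(7 downto 0);
    base    : in  unsigned(7 downto 0);
    offset  : in  unsigned(7 downto 0);
    address : out unsigned(7 downto 0)
  );
end entity addr_precompute;

architecture rtl of addr_precompute is
  -- Holds the address computed in the previous cycle
  signal addr_reg : unsigned(7 downto 0) := (others => '0');
begin

  -- Stage 1: do the address arithmetic one cycle early
  precompute : process (clk)
  begin
    if rising_edge(clk) then
      addr_reg <= base + offset;
    end if;
  end process precompute;

  -- Stage 2: the case arm only selects the registered value, so the
  -- adder is no longer on the same path as the opcode decode
  decode : process (clk)
  begin
    if rising_edge(clk) then
      case opcode is
        when "11101000" =>
          address <= addr_reg;
        when others =>
          null;  -- address keeps its previous value
      end case;
    end if;
  end process decode;

end architecture rtl;
```

After a change like this, rerun the timing analysis: if the guess was right, the path through the adder and the decode mux should no longer be the one with the least slack.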

But without the source AND the timing analysis, don't expect a more specific answer.

EDIT: now that you have posted a fraction of the source code, the picture is a little clearer: you have two nested Case statements, each quite large. You clearly need some simplification...

I note that only 2 of the inner case arms assign to i_ram_addr, yet the timing analysis shows a huge and complex mux on i_ram_addr; clearly there are a lot of other case arms that contribute terms to i_ram_addr...

I would suggest that you might have to treat i_ram_addr separately from the main Case statement and write the simplest machine you can to generate i_ram_addr alone. For example I would note that the OPCODE case arm is equivalent to:

if OPCODE(7 downto 3) = "11101" then ...

and ask how simple you can get a decoder for i_ram_addr alone. You may find that a lot of other case arms do very similar things with i_ram_addr (the original 8051 designers would have jumped at the chance to simplify logic!). Synthesis tools can be quite clever at simplifying logic, but when things get too complex they can miss opportunities.

(At this stage I would comment out the i_ram_addr assignments in the main Case statement and leave the rest of the decoder alone.)
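As a rough sketch of what "the simplest machine" for i_ram_addr might look like, the process below drives i_ram_addr on its own, using the partial opcode match mentioned above. The entity name, the `reg_addr` source and the 8-bit width are hypothetical, and whether the original assignment is clocked is also an assumption; only `i_ram_addr` and the `OPCODE(7 downto 3) = "11101"` test come from the discussion.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: a stand-alone decoder that is the only driver
-- of i_ram_addr. reg_addr and the widths are assumptions.
entity ram_addr_decode is
  port (
    clk        : in  std_logic;
    opcode     : in  std_logic_vector(7 downto 0);
    reg_addr   : in  std_logic_vector(7 downto 0);  -- assumed address source
    i_ram_addr : out std_logic_vector(7 downto 0)
  );
end entity ram_addr_decode;

architecture rtl of ram_addr_decode is
begin

  addr_only : process (clk)
  begin
    if rising_edge(clk) then
      -- Match on the upper five opcode bits only, which covers all
      -- eight opcodes sharing this prefix with a single comparison
      if opcode(7 downto 3) = "11101" then
        i_ram_addr <= reg_addr;
      end if;
      -- Further opcode groups that write i_ram_addr would be added
      -- here as elsif branches, each as a similarly narrow match
    end if;
  end process addr_only;

end architecture rtl;
```

Keeping this decoder separate means every term that feeds i_ram_addr is visible in one place, which makes it easier to spot opcode groups that can share a branch.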
