For what kind of applications can I use the DSP core of the BeagleBoard? Can I use DSP acceleration for a background subtraction algorithm?

Submitted by 安稳与你 on 2019-12-05 10:26:34

You can use the DSP for all kinds of computations. It is a general-purpose CPU optimized for DSP applications. So yes, even floating-point code will work, although the performance will not be great.

The DSP really shines when you do integer computations over large arrays of data. Here the DSP can easily compute so fast that the time to transfer data to and from memory becomes the bottleneck.

To give you a figure of what is possible: I have an algorithm running that post-processes data from a camera (doing the Bayer deinterleaving). I have 8-bit input images and 24-bit output images. The performance I achieve on the BeagleBoard DSP running at ~350 MHz is 144 million pixels per second. That equals roughly half a gigabyte of processed data per second.
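The arithmetic behind that figure can be checked with a small sketch. It assumes that each pixel is read once as 1 byte (8 bit) and written once as 3 bytes (24 bit), and that the 144 million pixels per second counts output pixels. The function names are made up for illustration.

```c
#include <stdint.h>

/* Bytes moved per second by a per-pixel filter, assuming every pixel is
 * read once (bytes_in) and written once (bytes_out). */
uint64_t bytes_per_second(uint64_t pixels_per_second,
                          uint32_t bytes_in, uint32_t bytes_out)
{
    return pixels_per_second * (uint64_t)(bytes_in + bytes_out);
}

/* Average number of DSP clock cycles available per pixel. */
double cycles_per_pixel(double clock_hz, double pixels_per_second)
{
    return clock_hz / pixels_per_second;
}
```

With the numbers above, `bytes_per_second(144000000, 1, 3)` gives 576,000,000 bytes per second, and `cycles_per_pixel(350e6, 144e6)` comes out at roughly 2.4 cycles per pixel, which shows how little headroom is left for memory stalls.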

Getting the DSP up and running and compiling a hello world program is not simple, though. You have to integrate a DSP kernel driver (I use DSPLINK). You have to learn how to use the (huge) DSP/ARM interop libraries and the toolchain, just for a simple hello world. Plan for at least two weeks.

Once this works, the real work starts: learning how to write fast code for the DSP and how to manage the internal memories, DMA, interrupts and so on.

In the end it is well worth it, because you unlock an incredibly fast DSP that can easily outperform the Cortex-A8 if assigned the right job. On top of that you get access to the image co-processors, which let you off-load computations even further. And then there is a complete ARM9 CPU tightly coupled to the DSP that sits there idle and waits to be used as well.

Yes, you can, but if it is not part of this OpenCV port project, you will have to implement the algorithm yourself.

The DSP of the BeagleBoard should be powerful enough for moderate image sizes (320x240, maybe 640x480), but you will have to implement the algorithm in fixed-point arithmetic if you want optimal throughput.
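To make the integer-only approach concrete, here is a minimal sketch of background subtraction using a running-average background model. The model, the 8.8 fixed-point format, the learning rate of 1/16 and the function name are all assumptions chosen for illustration. This is not the OpenCV implementation and not necessarily the algorithm the question has in mind.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BG_RATE_DIV 16  /* learning rate alpha = 1/16 (assumed value) */

/* background[] stores the model in 8.8 fixed point (gray value * 256),
 * so slow changes accumulate without any floating-point math. */
void bg_subtract(const uint8_t *frame, uint16_t *background,
                 uint8_t *mask, size_t n, uint8_t threshold)
{
    assert(frame && background && mask);
    for (size_t i = 0; i < n; i++) {
        int32_t pix  = (int32_t)frame[i] * 256;   /* pixel in 8.8 */
        int32_t bg   = background[i];
        int32_t diff = pix - bg;
        int32_t mag  = diff < 0 ? -diff : diff;

        /* Foreground where the pixel differs enough from the model. */
        mask[i] = (mag > (int32_t)threshold * 256) ? 255 : 0;

        /* bg += alpha * (pix - bg), done with an integer division. */
        background[i] = (uint16_t)(bg + diff / BG_RATE_DIV);
    }
}
```

Each call processes one grayscale frame of `n` pixels. The loop body uses only integer adds, compares and a division by a power of two, which is the kind of work the C64x handles natively.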

EDIT: Why fixed point

The TI C6xxx DSPs come in two flavours: the lower-numbered parts (C64xx) do not have a hardware floating-point unit, while the higher-numbered parts (C67xx) have one. This is unlike desktop CPUs such as Intel's.

The BeagleBoard-xM embeds a C64xx, which has no floating-point unit. Hence, whenever you call a mathematical function that operates on floats, the floating-point computations are emulated in software, which is slow. Maximal throughput is obtained when you implement these operations in fixed-point arithmetic, because they then map to native operations on integer types.
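As an illustration of what "fixed point" means in practice, the sketch below multiplies two fractional numbers in Q15 format (a 16-bit integer interpreted as value / 32768). The type and function names are hypothetical, and the code is plain portable C rather than TI compiler intrinsics.

```c
#include <stdint.h>

typedef int16_t q15_t;  /* represents raw / 32768, range [-1, 1) */

/* Convert a double to Q15 with rounding and saturation.
 * Intended for setup code (constants, tables), not for inner loops. */
q15_t q15_from_double(double x)
{
    double scaled = x * 32768.0;
    if (scaled > 32767.0)  scaled = 32767.0;
    if (scaled < -32768.0) scaled = -32768.0;
    return (q15_t)(scaled >= 0 ? scaled + 0.5 : scaled - 0.5);
}

/* Q15 * Q15 -> Q15 using only integer operations.
 * Note: right-shifting a negative value is implementation-defined in C;
 * this sketch assumes an arithmetic shift, as GCC and Clang provide. */
q15_t q15_mul(q15_t a, q15_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;  /* Q30 product */
    p = (p + (1 << 14)) >> 15;            /* round and rescale to Q15 */
    if (p > 32767) p = 32767;             /* saturate the -1 * -1 case */
    return (q15_t)p;
}
```

For example, `q15_mul(q15_from_double(0.5), q15_from_double(0.25))` returns 4096, which is 0.125 in Q15. The whole multiply is one integer multiply, one add and one shift, with no emulated float operation.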

fsheikh

The programming model on a heterogeneous platform like the BeagleBoard is usually to offload the computational part of your application from the GPP (ARM) to the DSP. You will need a DSP kernel driver and a compiler for the C64x. For details, have a look at the DSP BIOS programming guide: http://omappedia.org/wiki/DSPBridge_Project

If you haven't considered it already, I would recommend giving NEON on the Cortex-A8 a try for your image processing algorithm to see what kind of performance boost you get. It is fairly straightforward to program in C, without the need for a DSP driver or compiler.

Not applicable to ARM devices, but for people landing here after searching for "DSP" and "OpenCV": for high performance in x86-based servers, a good choice is the TI C66x CPU series, which has both 32-bit fixed-point and floating-point instructions. OpenCV has been ported to C66x accelerator cards and runs without issues:
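As a rough idea of what the NEON route looks like, here is a sketch of the thresholded absolute difference at the heart of a simple background subtraction, written with intrinsics from `arm_neon.h`. The function name and the static-background approach are assumptions for illustration. The code falls back to a scalar loop when NEON is not available, so it also builds on other machines.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#endif

/* mask[i] = 255 if |frame[i] - background[i]| > threshold, else 0. */
void absdiff_threshold(const uint8_t *frame, const uint8_t *background,
                       uint8_t *mask, size_t n, uint8_t threshold)
{
    size_t i = 0;
#ifdef HAVE_NEON
    const uint8x16_t thr = vdupq_n_u8(threshold);
    for (; i + 16 <= n; i += 16) {            /* 16 pixels per iteration */
        uint8x16_t f = vld1q_u8(frame + i);
        uint8x16_t b = vld1q_u8(background + i);
        uint8x16_t d = vabdq_u8(f, b);        /* per-lane |f - b| */
        vst1q_u8(mask + i, vcgtq_u8(d, thr)); /* 0xFF where d > thr */
    }
#endif
    for (; i < n; i++) {                      /* scalar tail or fallback */
        uint8_t d = (uint8_t)(frame[i] > background[i]
                                  ? frame[i] - background[i]
                                  : background[i] - frame[i]);
        mask[i] = d > threshold ? 255 : 0;
    }
}
```

With GCC on a 32-bit ARM target such as the Cortex-A8, NEON code generation is enabled with `-mfpu=neon`. No kernel driver or separate toolchain is involved.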

http://processors.wiki.ti.com/index.php/C66x_opencv
