Nvidia Cuda opens door to fast DSP in PCIe PCs

Nvidia’s ‘Cuda’ graphics processors, with their high-bandwidth PCIe interface, have opened the door to fast but simple instrumentation-grade DSP, according to Spectrum Instrumentation.


Spectrum is a maker of digitisers – effectively low-noise multi-channel ADCs, some PCIe-based.

Cuda is an Nvidia parallel number-crunching technology that allows direct PCIe peripheral-to-GPU interaction without the host CPU getting involved.


“Currently digitisers have a bottleneck caused by having to use either the host PC’s central processor with 8 or 16 cores [for DSP], or an FPGA that is complex to program,” said Spectrum. “Spectrum has solved this problem with a software option that allows a Cuda-based GPU to be used directly between any Spectrum digitiser and the PC.”


The software is called SCAPP (Spectrum Cuda access for parallel processing).

“The big advantage is that data is passed directly from the digitiser to the GPU where high-speed parallel processing is possible using the GPU board’s up to 5,000 processing cores,” said the firm. “It becomes even more important when signals are being digitised at high-speeds such as 50Msample/s, 500Msample/s or even 5Gsample/s.”

Expected uses include parallel computation of filtering, averaging, baseline suppression, FFTs and FFT window functions.

“The structure of a CUDA graphics card fits very well as it is designed for parallel data processing, which is exactly the same as most signal processing jobs,” said Spectrum.

Connecting a large FPGA to the digitiser is traditionally seen as the high-performance DSP option for digitiser boards – and Spectrum offers such a solution – but it claims the direct Cuda GPU link is almost as fast, and far simpler to program. “VHDL isn’t a skill everybody has,” it said.

Spectrum claims continuous throughput above 3Gbyte/s on PCI Express between a digitiser and GPU: “That is enough to support continuous acquisition from a one channel 8bit digitiser sampling at 2.5Gsample/s or a two channel 14bit unit running at 500Msample/s.”
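The arithmetic behind that claim can be checked in a few lines. The Python sketch below is illustrative only: it assumes each sample is padded to a whole number of bytes (so a 14bit sample occupies two bytes) and takes 3Gbyte/s as the link limit; the function names are made up for this example.

```python
import math

def bytes_per_sample(bits):
    """Storage per sample, assuming padding to whole bytes (14 bit -> 2 bytes)."""
    return math.ceil(bits / 8)

def stream_rate(channels, bits, samples_per_s):
    """Continuous data rate in bytes per second for all channels together."""
    return channels * bytes_per_sample(bits) * samples_per_s

LINK_LIMIT = 3e9  # bytes/s, taken from the "above 3Gbyte/s" claim

configs = [
    ("1 channel, 8bit, 2.5Gsample/s", 1, 8, 2.5e9),
    ("2 channels, 14bit, 500Msample/s", 2, 14, 500e6),
]
for name, channels, bits, rate in configs:
    need = stream_rate(channels, bits, rate)
    print(f"{name}: {need / 1e9:.1f} Gbyte/s, fits: {need <= LINK_LIMIT}")
```

Under those assumptions the two configurations need 2.5Gbyte/s and 2.0Gbyte/s respectively, both within the quoted figure.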

Cuda cards with between 256 and 5,000 processing cores are available (up to 12Tflop), with memory counted in Gbytes.

“A small sized card with 1k cores and 3Tflop is already capable of doing continuous data conversion, multiplexing, windowing, FFT and averaging of two channels 500Msample/s with a FFT block size of 512k – and that can run for hours,” said Spectrum.
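The windowing, FFT and averaging steps in that chain can be shown at toy scale. What follows is a minimal CPU-side Python sketch, not Spectrum's SCAPP code: the Hann window, the 64-point block and the function names are all assumptions chosen for illustration, and a GPU version would run the per-sample and per-block work in parallel rather than in loops.

```python
import cmath
import math

def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def averaged_spectrum(samples, block):
    """Window each block, FFT it, and average the magnitudes over all blocks."""
    n_blocks = len(samples) // block
    if n_blocks == 0:
        raise ValueError("need at least one full block of samples")
    window = hann(block)
    acc = [0.0] * (block // 2)
    for b in range(n_blocks):
        seg = samples[b * block:(b + 1) * block]
        spec = fft([s * w for s, w in zip(seg, window)])
        for k in range(block // 2):
            acc[k] += abs(spec[k])
    return [a / n_blocks for a in acc]

BLOCK = 64      # toy block size; the quoted example uses 512k
TONE_BIN = 8    # test tone placed exactly on FFT bin 8
signal = [math.sin(2 * math.pi * TONE_BIN * i / BLOCK) for i in range(BLOCK * 4)]
spectrum = averaged_spectrum(signal, BLOCK)
peak = max(range(len(spectrum)), key=spectrum.__getitem__)
print("strongest bin:", peak)
```

Each block is independent of the others until the final average, which is why this kind of job maps well onto thousands of GPU cores.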

The SCAPP package is a driver extension for Spectrum cards, using RDMA (remote direct memory access) for direct data transfer to the GPU.

It includes examples for digitiser to Cuda GPU interaction, and a set of Cuda parallel processing examples with building blocks for basic functions like filtering, averaging, data de-multiplexing, data conversion or FFT.
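To give a feel for two of those building blocks, here is a small Python sketch of de-multiplexing and data conversion. It is not taken from the SCAPP package: the interleaved sample layout (channel 0, channel 1, channel 0, ...), the signed 14bit codes and the ±1V input range are all assumptions made for the example, as are the function names.

```python
def demultiplex(raw, channels):
    """Split an interleaved sample stream into one list per channel."""
    return [raw[c::channels] for c in range(channels)]

def to_volts(codes, bits, full_scale_v):
    """Convert signed ADC codes to volts for an assumed +/- full_scale_v range."""
    max_code = 2 ** (bits - 1)  # 8192 for signed 14bit codes
    return [code * full_scale_v / max_code for code in codes]

# Hypothetical raw buffer: two channels, samples interleaved
raw = [0, 8191, 4096, -8192, -4096, 0]
ch0, ch1 = demultiplex(raw, 2)
volts0 = to_volts(ch0, 14, 1.0)
volts1 = to_volts(ch1, 14, 1.0)
print(volts0)
print(volts1)
```

Both steps treat every sample independently, so on a GPU each one can be handled by its own thread.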

All the software is based on C/C++ and can be changed by someone with “normal programming skills”, said the firm.

It can be used with the M4i platform (16bit 250Msample/s, 14bit 500Msample/s or 8bit 5Gsample/s) as well as the M2p platform (16bit 20-80Msample/s multi-channel).

There is a video.

Spectrum digitisers are made in Germany.

