Single chip delivers 1PetaOps/sec

Stealthy three-year-old processor start-up Groq says it has developed a single-chip architecture that can deliver 1PetaOps/sec performance.

Groq calls its architecture the Tensor Streaming Processor (TSP). Two years ago, the company said it had recruited eight of the ten people who developed Google’s Tensor Processing Unit (TPU).

The company has raised $62.3 million in funding.

One PetaOp/s is equivalent to one quadrillion operations per second, or 1e15 ops/s. Groq says its architecture is also capable of up to 250 trillion floating-point operations per second (250 TFLOPS).
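To put the two headline figures on the same scale, the unit arithmetic can be sketched in a few lines of Python. Only the 1 PetaOp/s and 250 TFLOPS peak figures come from the announcement; the helper name and the example workload size are hypothetical, and real throughput would sit below these peaks.

```python
# SI prefixes used in the announcement.
PETA = 10**15
TERA = 10**12

# Peak figures quoted by Groq.
peak_ops_per_s = 1 * PETA      # 1 PetaOp/s = 1e15 ops/s
peak_flops = 250 * TERA        # 250 TFLOPS = 2.5e14 floating-point ops/s

# The quoted operations figure is four times the floating-point figure.
ratio = peak_ops_per_s / peak_flops


def seconds_at_peak(total_ops, rate_per_s):
    """Lower bound on run time for a workload at a given peak rate.

    Hypothetical helper for illustration; it ignores memory and I/O limits.
    """
    return total_ops / rate_per_s


# Hypothetical workload of 2e15 operations, chosen only to show the units.
example_seconds = seconds_at_peak(2 * PETA, peak_ops_per_s)
```

At the quoted peak, the hypothetical workload above would need at least two seconds of compute.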

“Top GPU companies have been telling customers that they’d hoped to be able to deliver one PetaOp/s performance within the next few years; Groq is announcing it today,” says Groq CEO Jonathan Ross. “The Groq architecture is many multiples faster than anything else available for inference, in terms of both low latency and inferences per second. We had first silicon back, first-day power-on, programs running in the first week, sampled to partners and customers in under six weeks, with A0 silicon going into production.”

Groq claims that its TSP architecture, designed with a software-first mindset, achieves both compute flexibility and massive parallelism without the synchronization overhead of traditional GPU and CPU architectures.

Groq’s architecture can support both traditional and new machine learning models, and is currently in operation on customer sites in both x86 and non-x86 systems.

The architecture is designed specifically for the performance requirements of computer vision, machine learning and other AI-related workloads.

Execution planning happens in software, freeing up silicon real estate otherwise dedicated to dynamic instruction execution.
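The general idea of planning execution in software can be illustrated with a toy static scheduler. This is a minimal sketch of compile-time scheduling in general, not Groq's compiler: the operation names, latencies and the assumption of unlimited functional units are all hypothetical.

```python
def static_schedule(ops):
    """Assign an issue cycle to every operation ahead of time.

    ops: list of (name, latency_in_cycles, dependency_names), given in
    dependency order. Assumes unlimited functional units, so an operation
    issues as soon as all of its inputs are ready.
    """
    issue_cycle = {}
    finish_cycle = {}
    for name, latency, deps in ops:
        # An operation can issue once its slowest dependency has finished.
        issue = max((finish_cycle[d] for d in deps), default=0)
        issue_cycle[name] = issue
        finish_cycle[name] = issue + latency
    return issue_cycle


# Hypothetical four-operation program with made-up latencies.
program = [
    ("load_a", 2, []),
    ("load_b", 2, []),
    ("mul", 3, ["load_a", "load_b"]),
    ("add", 1, ["mul"]),
]

schedule = static_schedule(program)
# schedule == {"load_a": 0, "load_b": 0, "mul": 2, "add": 5}
```

Because every issue cycle is fixed before the program runs, the hardware in such a scheme has no need to discover dependencies at run time, which is the kind of silicon saving the article describes.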

The tight control this architecture affords yields deterministic processing, which is especially valuable for applications where safety and accuracy are paramount.

Compared with complex traditional architectures based on CPUs, GPUs and FPGAs, Groq’s chip also streamlines qualification and deployment, the company says, enabling customers to simply and quickly implement scalable systems with high performance per watt.
