Tachyum Demonstrates Machine Check and Recovery on Prodigy FPGA
13 March 2024 - 3:32AM
Business Wire
Tachyum® today announced that it has added Machine Check and
Recovery (MCR) capabilities, integrated with the Linux Error
Detection and Correction (EDAC) subsystem, to the Prodigy® Universal
Processor, and has demonstrated them running successfully on its
FPGA emulation system.
MCR with the Linux EDAC driver is essential for data center
applications: together they provide the critical information needed
to predict and mitigate failures in the field. By detecting and
seamlessly correcting errors caused by external events in the CPU’s
internal memory blocks and attached DDR modules, Prodigy can run
prolonged workflows without interruption, maintaining and improving
the uptime of systems deployed at scale. When Static Random-Access
Memory (SRAM) damage is beyond repair, error detection allows the
affected computations to be abandoned rather than return incorrect
results.
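As an illustration of the kind of information EDAC provides, the
following minimal Python sketch reads the correctable and
uncorrectable error counters that the Linux EDAC subsystem generally
exposes per memory controller under /sys/devices/system/edac/mc.
Whether the Prodigy driver exposes exactly this layout is an
assumption, and the function names and the ce_threshold value are
hypothetical, chosen only for this example.

```python
from pathlib import Path

# Standard Linux EDAC sysfs location; assumed (not confirmed by the
# announcement) to apply to the Prodigy driver as well.
EDAC_MC_ROOT = Path("/sys/devices/system/edac/mc")


def read_edac_counts(root=EDAC_MC_ROOT):
    """Return {controller: {"ce": int|None, "ue": int|None}}."""
    root = Path(root)
    counts = {}
    if not root.is_dir():
        return counts  # EDAC not loaded, or not a Linux host
    for mc in sorted(root.glob("mc[0-9]*")):
        entry = {}
        for key, filename in (("ce", "ce_count"), ("ue", "ue_count")):
            try:
                entry[key] = int((mc / filename).read_text().strip())
            except (OSError, ValueError):
                entry[key] = None  # counter missing or unreadable
        counts[mc.name] = entry
    return counts


def needs_attention(counts, ce_threshold=100):
    """List controllers with any uncorrectable error, or with at
    least ce_threshold correctable errors (hypothetical policy)."""
    flagged = []
    for name, c in counts.items():
        if (c["ue"] or 0) > 0 or (c["ce"] or 0) >= ce_threshold:
            flagged.append(name)
    return flagged
```

Calling needs_attention(read_edac_counts()) on a host with EDAC
loaded returns the controllers that such a policy would flag for
preventative maintenance; on a host without EDAC it returns an empty
list.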
Error injection is an essential part of testing. Prodigy
contains an error-injection module that can inject both correctable
and uncorrectable errors into relevant CPU blocks, either as a
limited number of errors or as a continuous stream at programmable
intervals, to verify that the Prodigy architecture meets and exceeds
data center requirements. Prodigy provides Double Error Correction
and Triple Error Detection (DECTED), a key feature for improving
uptime, which is complemented by EDAC to enable preventative
maintenance.
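The behavior described above can be modeled in a few lines of Python.
This is a sketch under stated assumptions, not Tachyum's interface:
all names are hypothetical, the interval is an abstract tick count,
and the classification simply follows the general definition of
DECTED (up to two flipped bits per protected word are corrected,
three are detected, and more than three carry no guarantee).

```python
import itertools
from collections import Counter


def classify_dected(flipped_bits):
    """Outcome of a DECTED code for a given number of bit flips."""
    if flipped_bits < 0:
        raise ValueError("flipped_bits must be non-negative")
    if flipped_bits == 0:
        return "clean"
    if flipped_bits <= 2:
        return "corrected"               # correctable error
    if flipped_bits == 3:
        return "detected_uncorrectable"  # reported, not repaired
    return "not_guaranteed"              # beyond the code's guarantees


def injection_schedule(interval, count=None, start=0):
    """Yield injection ticks spaced `interval` apart.

    count=None models a continuous stream; an integer models a
    limited number of injections.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    ticks = itertools.count(start, interval)
    return ticks if count is None else itertools.islice(ticks, count)


def run_campaign(flip_counts, interval, count=None):
    """Tally outcomes for a sequence of injected flip counts."""
    tally = Counter()
    schedule = injection_schedule(interval, count)
    for _tick, flips in zip(schedule, flip_counts):
        tally[classify_dected(flips)] += 1
    return tally
```

For example, run_campaign([1, 2, 3, 1], interval=5, count=3) applies
only the first three injections and tallies two corrected errors and
one detected but uncorrectable error.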
“Today’s demanding data center applications require a level of
reliability and availability previously unseen in order to complete
complex functions while mitigating errors,” said Dr. Radoslav
Danilak, founder and CEO of Tachyum. “Organizations choosing to
deploy Prodigy-enabled datacenters will be able to rest comfortably
knowing that we have fortified their system by fully integrating
and testing the MCR system with the Linux EDAC driver as part of
our FPGA emulator, which will ensure optimal performance when the
processor is commercially available in the near future.”
Because Prodigy is a Universal Processor offering industry-leading
performance for all workloads, Prodigy-powered data center servers
can seamlessly and dynamically switch between computational domains
(such as AI/ML, HPC, and cloud) with a single homogeneous
architecture. By eliminating the need for expensive dedicated AI
hardware and dramatically increasing server utilization, Prodigy
reduces CAPEX and OPEX significantly while delivering unprecedented
data center performance, power, and economics. Prodigy integrates
192 high-performance custom-designed 64-bit compute cores to
deliver up to 4.5x the performance of the highest-performing x86
processors for cloud workloads, up to 3x that of the
highest-performing GPU for HPC, and 6x for AI applications.
A video showing Prodigy correcting memory errors introduced by its
error-injection modules, and how the EDAC subsystem reacts to those
events, is available for viewing at
https://youtu.be/N0f-E-pnP-M.
Follow Tachyum
https://twitter.com/tachyum
https://www.linkedin.com/company/tachyum
https://www.facebook.com/Tachyum/
About Tachyum
Tachyum is transforming the economics of AI, HPC, and public and
private cloud workloads with Prodigy, the world’s first Universal
Processor. Prodigy unifies the functionality of a CPU, a GPU, and a
TPU in a single processor to deliver industry-leading performance,
cost and power efficiency for both specialty and general-purpose
computing. As global data center emissions continue to contribute
to a changing climate, with data centers projected to consume 10
percent of the world’s electricity by 2030, the ultra-low power
Prodigy is positioned to help satisfy the world’s appetite for
computing at a lower environmental cost. Tachyum recently received
a major purchase order from a US company to build a large-scale
system that can deliver more than 50 exaflops performance, which
will exponentially exceed the computational capabilities of the
fastest inference or generative AI supercomputers available
anywhere in the world today. When complete in 2025, the
Prodigy-powered system will deliver a 25x multiplier vs. the
world’s fastest conventional supercomputer – built just this year –
and will achieve AI capabilities 25,000x larger than models for
ChatGPT4. Tachyum has offices in the United States and Slovakia.
For more information, visit https://www.tachyum.com/.
View source version on businesswire.com: https://www.businesswire.com/news/home/20240312050373/en/
Mark Smith
JPR Communications
818-398-1424
marks@jprcom.com