The world's largest chip aims for a 4,000x improvement with optical interconnects
- Categories: News
- Time of issue: 2024-06-28 15:33
Cerebras, a developer of wafer-scale processors, is working on optical interconnects to improve its system performance by 4,000 times, and is calling for industry collaboration and standardization.
The current Cerebras WSE-3 processor is built on a 300mm wafer with 4 trillion transistors and a power consumption of 20kW. The California-based company has had to develop its own wafer-scale packaging for I/O, power delivery and cooling, and is currently working on optical interconnects.
The company's chief system architect spoke at Leti Innovation Day in Grenoble, France, this week, exploring how chiplets and 3D heterogeneous packaging technology can address scalability challenges. "It's not a chiplet, but it's still a candidate for 3D integration," said JP Fricker, Cerebras co-founder and chief system architect. "This technology will be transformative."
However, a key limitation for performance, scalability, and power consumption is off-chip I/O.
"I/O is a limitation of large computing that prevents you from getting into very large systems. These technologies exist today, but we need to invent technologies to put them together. We are developing these technologies and our goal is to build supercomputers that are 4,000 times faster than today and connect 1,000 wafers together."
"Currently, I/O is located on both edges of the chip, but it works better if I/O is distributed across the chip. Shortening the channel length reduces the size of the SERDES, saving space and power."
"We want to have a lot of optical engines," he said. "Right now they're external, but eventually we'll put these lasers in a chip." These will be used for multiple communication channels with reasonable data rates of 100 to 200Gbit/s, rather than thick pipes, he said.
"We have our own wafer-level engine and take third-party wafer-level programmable optical interconnects and put them together, using the entire surface of the wafer to connect to the wafer," he said. "It requires a heterogeneous wafer-to-wafer package."
Companies such as Celestial AI and Lightmatter have been developing these optical interconnect technologies, especially for hyperscalers and AI chip companies.
"But we need to invent or repurpose technology. The current interconnect pitch is too thick and we can't get fabs willing to integrate the technology because it's too niche, so we need to create a different process. Hybrid bonding enables finer pitch and higher assembly yields below 12 microns, but it is only available in specific fabs, and there are limited process pairs in fabs, such as 5nm to 5nm wafers, but different foundries cannot be used, and this is also true after two years."
There are also challenges in the process steps.
"For hybrid bonding, the fab stops at one of the last copper layers, which is not easy to detect, but that makes shipping to another fab difficult."
"We want to develop a new technology to standardize the surface treatment of wafers through a common top layer, and use this layer as a standard interface for wafer stacking, so that different wafers can be manufactured in different ways, but the last set of interfaces is common for bonding between different factories." It also means that bonding can be done by a third party, not just a high-volume factory, "he said.
The marks left by test probes on the copper layer are also a problem for the planarization needed for bonding; these marks must be removed, or a non-contact test system used.
But he says it has significant advantages.
"We can transmit power through optical wafers because the elements are more sparse, there are many through-silicon holes (TSVS) and very short channels, and the elements are located in a single layer by using multiple wavelengths." This makes it possible to transmit power from the top and remove cooling from the bottom in the same system."
"In our case, the network on the compute wafer is based on a configurable structure that is set up before the workload is run on the wafer. When you do this with circuit switching in the optical domain, you can evolve electrical switching into the optical domain, but you don't need to do it very often.
Crossing Nvidia's moat
How wide is Nvidia's moat? That's the $3 trillion question on investors' minds today. At least part of the answer may come later this year in the form of an IPO. Cerebras Systems, an AI startup that is trying to challenge Nvidia on the AI chip battlefield, is set for an initial public offering by the end of 2024.
Lior Susan, founder and managing partner of Eclipse Ventures, first invested in Cerebras in 2015, when the company had five presentation slides and theoretical plans for a new computer architecture. Eight years later, the startup offers specialized, very large chips with lots of memory for generative AI workloads like model training and inference. These are up against Nvidia chips, including the B100 and H100.
The most "annoying" thing about competing with Nvidia is CUDA but according to Susan,
CUDA is a software layer built by Nvidia to help developers work with and guide their graphics processing units. The platform has millions of lines of code that save developers time and money, and at this point it is the default code for much of the AI ecosystem.
Cerebras has its own software that works with the startup's chips. But even well-designed alternatives are years behind CUDA. As the knowledge and habits of developers build, this lead will be hard to break.
"I personally completely underestimated the CUDA part of selling chips," Susan said. "You're after the hardware. You stayed because of the software. He added: "As technologists, we always like to cry and say we don't like something, but then we keep using it. Because it doesn't get any better than this.
Umesh Padval, a semiconductor industry veteran and managing director at Thomvest, calls CUDA a fortress. It was slow to take off after its launch in 2007, but it has snowballed in recent years, with about 5 million developers writing CUDA code, contributing usable code, weeding out bugs, and supporting each other.
Over time, Nvidia has piled more tools and assets on top of CUDA: things like libraries and training data that startups can tap into so they don't start from scratch every time they decide to direct the power of the AI revolution at a new use case. Modulus, for example, is a library that helps AI understand physics.
"He now has millions of software developers who know the language and have been using it for a long time," Padval said of Nvidia CEO Jen-Hsun Huang. "It will work out at some point, but it's a big moat."
Crossing this CUDA moat is key. An internal Amazon document obtained by Business Insider is one example. Amazon's Neuron software is designed to help developers build AI tools using AWS's AI chips, but the current setup "prevents migration from NVIDIA CUDA," the document said. This is one of the main factors preventing some AWS customers from using Amazon's AI chips.
Anyone building an alternative to CUDA can't completely ignore it. Startups that build their technology from scratch can try to avoid it altogether, but most AI developers will have to do the hard work of changing their software if they want to change their hardware.
AMD is Nvidia's most direct competitor, with its own platform called ROCm. It comes with a tool called Hipify that converts CUDA software code into something more portable.
"You write a program in CUDA. To run it on AMD Gpus, you can switch using the Hipify tool, "said Thomas Sohmers, co-founder and CEO of AI chip startup Positron. "Frankly, I haven't seen anyone using Hipify.
In September, a group of AI luminaries and Nvidia rivals including Qualcomm, Google and Intel formed the UXL Foundation to build a rival software platform that is not tied to any particular chip. The software will not be available until the end of 2024. Other attempts to breach the CUDA fortress have failed.
However, time and inertia are a powerful combination, as KG Ganapathi is happy to explain.
His startup, Vimaan, is developing technology for the "dark warehouse" of the future that doesn't require a human operator. It uses computer vision and machine learning to understand and catalog the shape, size, number, and location of each item. Vimaan has received funding from the Amazon Industrial Innovation Fund and is part of Nvidia's Inception program.
The Vimaan team has built the entire system in CUDA, and Ganapathi has no interest in changing it now, even though there are clear reasons to do so.
"Do I take this opportunity to switch the entire infrastructure that we built on the Nvidia platform?" 'he said. "Probably not."
Still, Thomvest's Padval believes Nvidia customers want to mitigate risk by diversifying their sources of GPUs and supporting software. That means competitors will still attract capital, as well as product testing and purchases.
"Customers love leaders. But they also feel they want a second source so they have a choice, "he said.
Since CUDA is probably the most important element in Nvidia's moat, long-term projections for the company's market share can indicate just how big that influence is.
Eclipse Ventures' Susan said the market is so large that even taking a small portion away from Nvidia is worth it.
"I said hey, my biggest competitor is worth $3.5 trillion, so you know, if I get 10 percent, I'm a happy person," he said.