Etched is making waves in the artificial intelligence hardware space with its revolutionary new AI accelerator chip. The Silicon Valley startup, founded in 2022 by Harvard dropouts Gavin Uberti and Chris Zhu, has developed a custom application-specific integrated circuit (ASIC) called Sohu that is purpose-built to run transformer models – the architecture behind today’s most advanced AI systems.
Etched transformer ASICs for LLMs
Etched claims its Sohu chip can process AI workloads up to 20 times faster than Nvidia’s top-of-the-line GPUs while using significantly less power. With $120 million in fresh funding and partnerships with major cloud providers, Etched is positioning itself as a formidable challenger to Nvidia’s dominance in AI chips.
Primary Venture Partners and Positive Sum Ventures led the funding round, which included participation from high-profile investors like Peter Thiel, GitHub CEO Thomas Dohmke, and former Coinbase CTO Balaji Srinivasan. As transformer models continue to drive breakthroughs in generative AI, Etched’s specialized hardware could reshape the landscape of AI computing.
Etched’s approach sidesteps a key complexity of GPUs and TPUs: the need to handle arbitrary CUDA and PyTorch code, which demands sophisticated compilers. While other AI chip developers like AMD, Intel, and AWS have invested billions into software development with limited success, Etched is narrowing its focus. By exclusively running transformers, Etched can streamline software development for these models.
Most AI companies use transformer-specific inference libraries such as TensorRT-LLM, vLLM, or HuggingFace’s TGI. Although somewhat inflexible, these frameworks suffice for most needs because transformer models across different applications—text, image, or video—are fundamentally similar. This allows users to adjust model hyperparameters without altering the core model code. However, the most prominent AI labs often require custom solutions, employing engineers to optimize GPU kernels meticulously.
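The pattern described above can be sketched in a few lines. This is a hypothetical toy illustration, not the actual API of vLLM, TensorRT-LLM, or TGI: the engine code stays fixed, and each application only supplies a different sampling-configuration object (the parameter names `temperature`, `top_p`, and `max_tokens` mirror common conventions in these libraries).

```python
from dataclasses import dataclass

# Toy sketch of the interface pattern shared by transformer inference
# libraries: a fixed "engine" plus a per-request hyperparameter object.
# All names here are illustrative, not a real library's API.

@dataclass
class SamplingConfig:
    temperature: float = 1.0   # softmax temperature; lower = more deterministic
    top_p: float = 1.0         # nucleus-sampling cutoff
    max_tokens: int = 128      # cap on generated tokens

def serve(prompt: str, cfg: SamplingConfig) -> str:
    # Stand-in for engine.generate(prompt, cfg); the same transformer
    # code path runs regardless of the application domain.
    return f"generate({prompt!r}, T={cfg.temperature}, p={cfg.top_p}, n<={cfg.max_tokens})"

# Two different applications, same engine code, different knobs:
chat_cfg = SamplingConfig(temperature=0.7, top_p=0.9, max_tokens=512)
code_cfg = SamplingConfig(temperature=0.2, top_p=1.0, max_tokens=2048)

print(serve("Hello!", chat_cfg))
print(serve("def fib(n):", code_cfg))
```

The point of the sketch is that neither application touches the model code itself; only the configuration object changes, which is why these frameworks, despite being somewhat inflexible, cover most production needs.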
Etched aims to eliminate the need for reverse engineering by making its entire software stack open source, from drivers to kernels. This openness allows engineers to implement custom transformer layers as needed, enhancing flexibility and innovation.
Etched’s approach to AI hardware is comparable to the advancements seen with Groq’s LPU Inference Engine. Groq’s LPU, a dedicated Language Processing Unit, has set new benchmarks in processing efficiency for large language models, surpassing traditional GPUs in specific tasks. According to ArtificialAnalysis.ai, Groq’s LPU achieved a throughput of 241 tokens per second with Meta AI’s Llama 2-70b model, demonstrating that it can serve large language models more efficiently than competing solutions.
This level of performance spotlights the potential for specialized AI hardware to revolutionize the field by offering faster and more efficient processing capabilities tailored to specific AI workloads. Etched claims its ASIC achieves up to 500,000 tokens per second, dwarfing Groq’s performance.
ASICs changed the game for Bitcoin; will they do the same for AI?
The introduction of ASICs for Bitcoin mining marked a revolutionary shift, fundamentally altering the network’s dynamics. When ASICs were first introduced in 2013, they represented a quantum leap in mining efficiency compared to the CPUs and GPUs that had previously dominated the field. This transition profoundly impacted Bitcoin’s ecosystem, dramatically increasing the network’s overall hash rate and, consequently, its security.
ASICs, being purpose-built for Bitcoin mining, offered unprecedented computational power and energy efficiency, quickly rendering CPU and GPU mining obsolete for Bitcoin. This shift led to a rapid centralization of mining power, as only those with access to ASIC hardware could profitably mine Bitcoin. The ASIC era ushered in industrial-scale mining operations, transforming Bitcoin mining from a hobby accessible to individual enthusiasts into a highly competitive, capital-intensive industry.
Etched history and development
Etched’s vision began in 2022 when AI technologies like ChatGPT were not yet prevalent, and image and video generation models primarily relied on U-Nets and CNNs. Since then, transformers have become the dominant architecture across various AI domains, validating Etched’s strategic focus.
The company is rapidly advancing toward one of the quickest chip launches in history. It has attracted top talent from major AI chip projects, partnered with TSMC for its advanced 4nm process, and secured essential resources such as HBM and server supply to support initial production. Early customers have already committed tens of millions of dollars to Etched’s hardware.
This rapid progress could dramatically accelerate AI capabilities. For instance, AI models could become 20 times faster and cheaper overnight. Current limitations could be drastically reduced, such as the slow response times of models like Gemini or the high costs and long processing times of coding agents. Real-time applications, from video generation to AI-driven conversations, could become feasible, addressing the current bottlenecks faced even by leading AI firms like OpenAI during peak usage periods.
Etched’s advancements promise to make real-time video, calls, agents, and search a reality, fundamentally transforming AI capabilities and their integration into everyday applications.