Teng Yan
October 20, 2024

Manufacturing Intelligence At Scale

Key takeaways from Jensen Huang's rare candid interview this week.

GM!

The Weekender is a casual Sunday series in which I break down the most interesting AI podcasts and videos I consume each week and distill their key takeaways.

They include a mix of insights, synthesis, and my own thoughts, organized to be as useful as possible.

We love Kungfu Jensen!

This weekend, we dive into Jensen Huang’s rare, candid interview on the BG2 podcast, a glimpse into how the leader of the world’s second-largest company is thinking about AI’s future.

I also caught the post-show recap with Bill Gurley and Sunny and pulled in some additional insights from their conversation.

Let’s kick things off with my favourite quote from Jensen:

“Intelligence is the single most valuable commodity the world has ever known, and now we are going to manufacture it at scale.”

Powerful. And I couldn’t agree more.

Computing

  • AI is advancing faster than ever because computing itself is being reinvented, driving costs down by orders of magnitude: a “Super Moore’s Law.”
  • The data centre is now the true unit of computing, not individual chips. Success lies in large-scale deployments and tightly integrated systems.
  • GPUs improve every year in performance, cost, and energy efficiency, and before long virtually all software applications will be deeply machine-learned.
  • Why GPU demand will only grow:
    • Current data centres ($1T worth) need to be modernised for the future
    • Software will undergo a significant transformation with the rise of AI agents & AI factories, an emerging space likely worth trillions.

Training

  • Training is inference at scale: if you train well, you’ll likely inference well.
  • Decentralised training is critical. Asynchronous distributed training, built on NVIDIA’s early work on parallelism, will unlock scaling across millions of GPUs (toy sketch below).
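
To make “async distributed training” concrete, here’s a minimal sketch of asynchronous SGD: several workers pull a possibly stale copy of shared weights, compute gradients on their own data shard, and push updates without waiting on each other. The task, numbers, and threading setup are my own toy assumptions, not anything from the interview.

```python
import threading
import numpy as np

# Toy async SGD on a linear-regression task (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(1024, 8))
true_w = rng.normal(size=8)
y = X @ true_w + 0.01 * rng.normal(size=1024)

w = np.zeros(8)          # shared parameters (the "parameter server")
lock = threading.Lock()  # guards reads/writes of the shared weights
LR = 0.05

def worker(shard: np.ndarray, labels: np.ndarray, steps: int) -> None:
    for _ in range(steps):
        with lock:
            local_w = w.copy()  # pull a (possibly stale) snapshot
        # Gradient of mean squared error on this worker's shard.
        grad = (2 / len(shard)) * shard.T @ (shard @ local_w - labels)
        with lock:
            w[:] -= LR * grad   # push the update without a global barrier

shards, label_shards = np.array_split(X, 4), np.array_split(y, 4)
threads = [threading.Thread(target=worker, args=(s, l, 200))
           for s, l in zip(shards, label_shards)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("max weight error:", np.abs(w - true_w).max())
```

The point of going async is that no worker ever waits on a global synchronisation step; slightly stale gradients are the price you pay for scale.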

Inference

  • The emphasis in scaling AI has moved from pre-training to post-training and inference.
  • Optimising time-to-first-token takes huge amounts of both bandwidth and FLOPs (compute power); Blackwell + NVLink is designed for exactly this (see the back-of-envelope sketch after this list).
  • Much of intelligence can’t be determined beforehand; it happens in real time. Sometimes we want an answer in a week, sometimes immediately. Expect segmentation of intelligence: imagine an intelligence layer that routes each query to the right model (also sketched after this list).
  • Inference is already 40% of NVIDIA’s revenue, and Jensen expects demand to go up “a billion times”. He argued older chips can be used to meet that inference demand, though the hosts were not fully convinced.
  • Companies like Groq and Cerebras are currently leading in inference performance, showing that CUDA’s dominance might not extend fully into this space.
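
On the bandwidth + FLOPs point, here’s a rough back-of-envelope for time-to-first-token (TTFT). All the numbers below (model size, prompt length, hardware throughput) are my own illustrative assumptions, not Blackwell specs: prefill over a long prompt is compute-bound, while each decoded token needs at least one pass over the weights, which is bandwidth-bound.

```python
# Illustrative TTFT estimate; every number here is an assumption.
PARAMS = 70e9            # 70B-parameter model
PROMPT_TOKENS = 8_000    # long prompt
FLOPS = 1e15             # ~1 PFLOP/s of usable compute
BANDWIDTH = 3e12         # ~3 TB/s of memory bandwidth
BYTES_PER_PARAM = 2      # fp16/bf16 weights

prefill_flops = 2 * PARAMS * PROMPT_TOKENS            # ~2N FLOPs per token
compute_s = prefill_flops / FLOPS                     # compute-bound prefill
weight_pass_s = PARAMS * BYTES_PER_PARAM / BANDWIDTH  # one pass over weights

print(f"prefill compute:  {compute_s:.2f} s")      # ~1.12 s -> dominates TTFT
print(f"weight load pass: {weight_pass_s:.3f} s")  # ~0.047 s -> per-token decode floor
print(f"rough TTFT floor: {max(compute_s, weight_pass_s):.2f} s")
```

Pushing TTFT down needs more FLOPs for prefill; pushing per-token latency down needs more bandwidth. That is roughly why both matter at once.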
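
And here’s a minimal sketch of what that “intelligence layer” could look like: pick the cheapest model that meets the caller’s deadline, escalating to a heavier one for harder queries. The model names, costs, and the difficulty heuristic are all made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    cost: float       # relative cost per query (made-up units)
    latency_s: float  # typical time to answer

# Hypothetical model tiers; none of these are real products.
MODELS = [
    Model("small-fast", cost=1, latency_s=0.3),
    Model("mid-tier", cost=10, latency_s=2.0),
    Model("deep-reasoner", cost=100, latency_s=60.0),
]

def route(query: str, deadline_s: float) -> Model:
    """Send each query to the right model for its difficulty and deadline."""
    affordable = [m for m in MODELS if m.latency_s <= deadline_s] or MODELS[:1]
    # Crude difficulty proxy; a real router would use a learned classifier.
    hard = len(query.split()) > 50 or "prove" in query.lower()
    key = lambda m: m.cost
    return max(affordable, key=key) if hard else min(affordable, key=key)

print(route("What is 2 + 2?", deadline_s=1.0).name)                       # small-fast
print(route("Prove this conjecture rigorously.", deadline_s=120.0).name)  # deep-reasoner
```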

AI in general

  • A model is not AI; it is just one component of AI.
  • If you understand the taxonomy of the AI stack, you’ll see opportunities everywhere. Learn to distinguish between a feature, a product, and a company.
  • The future of AI will involve both open and closed models. Open models are essential for advancing research across various fields, while closed models allow companies to monetize.
  • Will we see AI agents with memory and action capabilities (e.g., booking hotels autonomously) within the next two years? Technically, it’s already possible. The real challenge lies in scaling these systems to handle edge cases and minimize hallucinations.
  • AI is set to transform every job, unlocking a new wave of human productivity—continuing two centuries of technological progress and automation.
  • Synthetic data generation comes from different AIs, each with its own knowledge base, doing Q&A back and forth. NVIDIA’s Nemotron model is best-in-class as a reward model (rough sketch below).
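
Here’s what that generate-and-filter loop might look like in its simplest form: one model asks, another answers, and a reward model keeps only the high-scoring pairs. The three “models” below are stand-in stubs I wrote for illustration; a real pipeline would call actual LLMs and a trained reward model such as Nemotron.

```python
import random

random.seed(0)
TOPICS = ["GPU memory hierarchy", "KV caches", "tensor parallelism"]

def question_model(topic: str) -> str:
    return f"Explain {topic} to a new ML engineer."

def answer_model(question: str) -> str:
    return f"Draft answer to: {question}"

def reward_model(question: str, answer: str) -> float:
    # A real reward model scores helpfulness/correctness; we simulate a score.
    return random.random()

dataset = []
for topic in TOPICS * 10:                  # 30 generation attempts
    q = question_model(topic)
    a = answer_model(q)
    if reward_model(q, a) > 0.8:           # keep only top-scoring pairs
        dataset.append({"prompt": q, "response": a})

print(f"kept {len(dataset)} of {len(TOPICS) * 10} generated pairs")
```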

NVIDIA

  • NVIDIA’s edge lies in its full-stack approach, from data ingestion to post-training optimization. This enables deep integrations with cloud providers and partners.
  • CUDA plays a central role in NVIDIA’s strategy. However, questions remain about CUDA’s long-term relevance, especially for inference workloads where optimization might shift to a higher layer, e.g. PyTorch.
  • NVIDIA’s advantage grows if AI expands beyond transformer models and the architecture landscape becomes more heterogeneous. If not, ASICs could dominate inference workloads.
  • NVIDIA’s competitive advantage gets bigger as the network/system gets larger.
  • Machine learning means optimising the entire data pipeline, including using smarter AI to curate data. You have to accelerate every single step of the pipeline to keep the ML flywheel moving quickly and materially improving (see the quick arithmetic after this list).
  • NVIDIA aims to 3X its revenue with only a 25% increase in headcount (50,000 employees) by leveraging 100M+ specialist AI agents in the future: that works out to roughly 2,000 agents per human employee.
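
The “accelerate every step” point is really just Amdahl’s law. A quick illustration with made-up numbers: speed up nine of ten equal pipeline stages by 100x, and the one stage you skipped caps the whole flywheel.

```python
# Amdahl's law with illustrative numbers: one slow stage caps the pipeline.
stages = [1.0] * 10                                       # 10 stages, 1 unit each
accelerated = [t / 100 for t in stages[:9]] + stages[9:]  # 9 stages get 100x faster

speedup = sum(stages) / sum(accelerated)
print(f"overall speedup: {speedup:.1f}x")                 # ~9.2x, nowhere near 100x
```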

And a final Jensen quote to end with: “Be a market maker, not a share taker.”

That’s it! Enjoy the rest of the weekend, folks.

Cheers,

Teng Yan
