NVIDIA’s founder/CEO Jensen Huang gave his keynote address at SIGGRAPH 2023, the most important conference in computer graphics. NVIDIA’s flagship product is the GPU. Jensen says, “Graphics and AI are inseparable.” Yet, we will continue to see reports in major news outlets that Meta made a losing bet on the metaverse. At the same time, we’re seeing major AI developments coming from Meta. It’s as if the public hasn’t yet made the connection between AI and computer graphics. Zuckerberg’s vision for the metaverse is all tied into the evolution of computer graphics. The large language model that Jensen constantly references in his keynote is not GPT but Llama, the model developed by Meta.
I always enjoy Jensen’s talks. Even though this wasn’t one of his best keynotes, the substance matters: the processing capacity of NVIDIA’s GPUs is powering the breakthroughs in artificial intelligence, and Jensen outlines a vision for how NVIDIA’s momentum will continue throughout this decade.
Representing a hardware company, Jensen awkwardly delivered the sales pitch: buy new graphics cards, and lower data-center costs by investing in NVIDIA products. He is a better public speaker when he talks about how GPUs enable large language models to function as a new computing platform, and how the cost of incorporating large language models into software applications will drop significantly as compute costs fall.
Big Idea
“The single most valuable thing we do as humanity is to generate intelligent information.” Jensen Huang, NVIDIA keynote, SIGGRAPH 2023.
In discussions as to what separates humans from other species, the conversations tend to swerve toward vague notions of consciousness. Without going down that rabbit hole at this moment (and also ignoring the vast libraries that owls and cats are organizing in fantasy fiction), let’s assume that actionable intelligence based on shared information is core to our lives.
Essential Question
How might AI improve the way we create and share information in our work, in our daily lives, and in our creativity?
The Ubiquity of Large Language Models
Within a few years, every human-computer interaction will be enhanced through a large language model. The discipline of User Experience (UX) design will be an important factor in that shift. As UX design seamlessly integrates AI enhancements, we will begin to perceive AI not as a separate entity but as an intrinsic part of our digital experiences.
Jensen states that “the canonical use case of the future is a large language model on the frontend of just about everything, every single application, every single database, whenever you interact with a computer, you’ll likely be engaging a large language model. That large language model will figure out what you’re trying to, what are your intentions considering the context, and present the information to you in the best possible way.”
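The pattern Jensen describes can be sketched in a few lines. In the toy version below, the intent labels, handlers, and keyword matching are all hypothetical stand-ins; a real system would replace `classify_intent` with an actual large language model call.

```python
# Sketch of the LLM-as-frontend pattern: the model infers the user's
# intent, and the application routes the request to a backend handler.
# All names here are illustrative, not from any real product.

def classify_intent(user_text: str) -> str:
    """Stand-in for an LLM call that maps free text to an intent label.
    A real system would prompt a model; here we match keywords."""
    text = user_text.lower()
    if "order" in text or "buy" in text:
        return "place_order"
    if "where" in text or "status" in text:
        return "track_order"
    return "general_question"

# Each intent routes to a different backend service.
HANDLERS = {
    "place_order": lambda q: "Routing to checkout...",
    "track_order": lambda q: "Looking up your order...",
    "general_question": lambda q: "Passing to the help center...",
}

def frontend(user_text: str) -> str:
    intent = classify_intent(user_text)
    return HANDLERS[intent](user_text)
```

The point of the pattern is that the user never names the backend system; the model infers intent from context and the application does the plumbing.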
Anyone who doubts the future need for programmers doesn’t realize the amount of work that goes into connecting distributed systems to create user-friendly applications. The tasks of software developers are changing, but that has always been true. Now, developers have more tools, thanks to AI, for doing the job.
Software developers are busy integrating large language models into applications by using vendor-provided APIs, such as those provided by OpenAI and Microsoft. The LangChain project is a popular framework, though developers may also prefer to wire the components together without the aid of a framework.
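A minimal sketch of the vendor-API approach, using the OpenAI Python client. The model name `"gpt-4o-mini"` and the prompts are assumptions for illustration; the request-building step is plain Python and is the part worth noticing, since that structure is similar across vendors.

```python
# Hedged sketch of calling a vendor LLM API without a framework.
# Assumes: pip install openai, and OPENAI_API_KEY in the environment.

def build_messages(system_prompt: str, user_query: str) -> list[dict]:
    """Assemble the chat payload in the shape vendor chat APIs expect."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_query},
    ]

def ask(user_query: str) -> str:
    # Import here so the helper above works without the package installed.
    from openai import OpenAI
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name; substitute your own
        messages=build_messages("You are a concise assistant.", user_query),
    )
    return response.choices[0].message.content
```

Frameworks like LangChain wrap this same request/response cycle in abstractions for chaining calls, retrieval, and memory; for a single call, the raw client is often simpler.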
In his keynote, Jensen describes how the output of a large language model (along with augmented data from a vector database) could be used as a guide for a generative model. He specifically mentioned SDXL from stability.ai.
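The retrieval step of that pipeline can be sketched without any real infrastructure. Below, a bag-of-words `Counter` stands in for a learned embedding and a Python list stands in for the vector database; the example documents are invented. The shape of the flow — embed the query, retrieve the nearest document, fold it into the prompt handed to a generative model such as SDXL — is the part that matters.

```python
# Toy sketch of retrieval-augmented prompting: nearest-neighbor lookup
# in a "vector store", then prompt assembly for a generative model.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Bag-of-words stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Stand-in for documents stored in a vector database (invented examples).
DOCS = [
    "the lunar rover has six titanium wheels",
    "the espresso machine uses a brass boiler",
]

def retrieve(query: str) -> str:
    """Return the document most similar to the query."""
    q = embed(query)
    return max(DOCS, key=lambda d: cosine(q, embed(d)))

def augmented_prompt(query: str) -> str:
    """Fold the retrieved context into a prompt for an image model."""
    return f"Context: {retrieve(query)}\nGenerate an image of: {query}"
```

In production the `Counter` becomes a dense embedding, `DOCS` becomes a vector database, and the augmented prompt conditions a diffusion model rather than being printed.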
With large language models, it’s easy to focus only on text. Remember that NVIDIA is about graphics and Jensen’s keynote was at the premier conference for scientific research into computer graphics. We live in a visual world. Even for those who are blind, like my late stepfather, the world consists of physical objects. A fundamental aspect of 3D computer graphics is that 3D objects carry physical properties.
Virtual Worlds are 3D
Realism in virtual worlds is achieved through 3D graphics exhibiting physics-based properties. Objects should not only look real but also behave in ways consistent with the real world. For instance, when light shines on a surface, the object should reflect, refract, or absorb the light based on the material’s properties. By following the principles of physics, simulations and animations become more convincing, whether it’s in video games, movies, or VR. NVIDIA’s GPUs have enabled the computationally intensive procedures of emulating real-world physics in virtual worlds.
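One small, concrete example of those physics-based light calculations is Schlick’s approximation of Fresnel reflectance, a standard formula in real-time rendering for how much light a surface reflects versus refracts as the viewing angle changes. (This is an illustrative ingredient of physics-based shading generally, not something NVIDIA’s keynote walked through.)

```python
# Schlick's approximation of Fresnel reflectance:
#   R(theta) = R0 + (1 - R0) * (1 - cos(theta))^5
# where R0 is the reflectance at normal incidence, computed from the
# indices of refraction of the two media (e.g., air and water).

def schlick_r0(n1: float, n2: float) -> float:
    """Reflectance at normal incidence for media with indices n1, n2."""
    return ((n1 - n2) / (n1 + n2)) ** 2

def fresnel_schlick(cos_theta: float, r0: float) -> float:
    """Fraction of incoming light reflected at incidence angle theta."""
    return r0 + (1.0 - r0) * (1.0 - cos_theta) ** 5
```

For air to water (indices about 1.0 and 1.33), R0 is roughly 0.02: looking straight down, water reflects about 2% of the light, but as the angle grazes the surface (cos θ → 0) the reflectance climbs toward 1, which is why distant water looks like a mirror. GPUs evaluate formulas like this per pixel, millions of times per frame.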
Jensen’s keynote transitions us from thinking of large language models as merely textual devices to also serving as frontends to a 3D database. NVIDIA’s Omniverse is being positioned as a tool for connecting the different parts of the 3D pipeline.
I’m wrapping this post up with a brief mention of Universal Scene Description (USD). Later, I’ll provide a deep dive into the important role of USD in building the metaverse. In his keynote, Jensen stated, “OpenUSD is a very big deal...OpenUSD is a framework, a universal interchange, for creating 3D worlds…for describing, compositing, simulating, and collaborating on 3D projects across tools.”
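To give a flavor of what that interchange format looks like, here is a minimal, hand-written USD layer in the human-readable `.usda` syntax; the prim names and values are invented for illustration.

```
#usda 1.0
(
    defaultPrim = "World"
)

def Xform "World"
{
    def Sphere "Ball"
    {
        double radius = 2.0
        color3f[] primvars:displayColor = [(0.8, 0.1, 0.1)]
    }
}
```

Because every tool in the pipeline reads and writes this same scene description, a modeling package, a physics simulator, and a renderer can all collaborate on one scene without lossy export steps.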
NVIDIA’s valuation reached $1 trillion USD earlier this year. NVIDIA’s stock price has increased nearly 600% over the last 5 years.