Andrew Ng

Co-Founder of Coursera; Stanford CS adjunct faculty. Former head of Baidu AI Group/Google Brain. #ai #machinelearning #deeplearning #MOOCs

Andrew Ng
Oct 15, 4:55 PM
Learn to build your own voice-activated AI assistant that can execute tasks like gathering recent AI news from the web, scripting out a podcast, and using tools to turn all that into a multi-speaker podcast. See our new short course: “Building Live Voice Agents with Google’s ADK (Agent Development Kit),” taught by Google’s @lavinigam and @sitalakshmi_s.

ADK provides modular components that make it easy to build and debug agents. It also includes a built-in web interface for tracing agentic reasoning. This course illustrates these concepts by building a live voice agent that can chain actions to complete a complex task like creating a podcast. This requires maintaining context, implementing guardrails, reasoning, and handling audio streaming, all while keeping latency low.

You’ll learn to:
- Build voice agents that listen, reason, and respond
- Guide your agent to follow a specific workflow to accomplish a task
- Coordinate specialized agents to build an agentic podcast workflow that researches topics and produces multi-speaker audio
- Deploy an agent into production

Even if you’re not yet building voice systems, understanding how realtime agents stream data and maintain reliability will help you design modern agentic applications. Please join here: https://t.co/OcBbPHsQIH
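To give a feel for ADK's modular components, here is a minimal sketch of an agent with a single tool, assuming the google-adk package. The model name and the fetch_ai_news tool are illustrative placeholders, not the course's actual code.

```python
# Minimal ADK-style agent sketch. Assumes the google-adk package;
# fetch_ai_news is a hypothetical placeholder tool.
from google.adk.agents import Agent

def fetch_ai_news(topic: str) -> dict:
    """Hypothetical tool: return recent AI news headlines for a topic."""
    # A real implementation would call a news or web-search API here.
    return {"status": "success", "headlines": [f"Placeholder headline about {topic}"]}

root_agent = Agent(
    name="podcast_researcher",
    model="gemini-2.0-flash",  # assumed model name; use whatever the course specifies
    instruction="Gather recent AI news and draft a podcast script.",
    tools=[fetch_ai_news],
)
```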
Andrew Ng
Oct 7, 5:29 PM
Announcing my new course: Agentic AI! Building AI agents is one of the most in-demand skills in the job market. This course, available now at https://t.co/zGHUh1loPO, teaches you how. You'll learn to implement four key agentic design patterns:
- Reflection, in which an agent examines its own output and figures out how to improve it
- Tool use, in which an LLM-driven application decides which functions to call to carry out web search, access calendars, send email, write code, etc.
- Planning, where you'll use an LLM to decide how to break down a task into sub-tasks for execution
- Multi-agent collaboration, in which you build multiple specialized agents — much like how a company might hire multiple employees — to perform a complex task

You'll also learn to take a complex application and systematically decompose it into a sequence of tasks to implement using these design patterns.

But here's what I think is the most important part of this course: Having worked with many teams on AI agents, I've found that the single biggest predictor of whether someone executes well is their ability to drive a disciplined process for evals and error analysis. In this course, you'll learn how to do this, so you can efficiently home in on which components to improve in a complex agentic workflow. Instead of guessing what to work on, you'll let evals data guide you. This will put you significantly ahead of the game compared to the vast majority of teams building agents.

Together, we'll build a deep research agent that searches, synthesizes, and reports, using all of these agentic design patterns and best practices. This self-paced course is taught in a vendor-neutral way, using raw Python — without hiding details in a framework. You'll see how each step works, and learn the core concepts that you can then implement using any popular agentic AI framework, or using no framework.

The only prerequisite is familiarity with Python, though knowing a bit about LLMs helps. Come join me, and let's build some agentic AI systems! Sign up to get started: https://t.co/FX35dloqw4
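Since the course works in raw Python, here is what the first of those patterns, reflection, can look like stripped to its core. This is my own minimal sketch, not course material; call_llm is a placeholder for whatever chat-completion API you use.

```python
# Minimal reflection loop: generate, critique, revise. call_llm is a
# placeholder; wire it to your LLM provider of choice.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("replace with a call to your LLM provider")

def reflect_and_improve(task: str, rounds: int = 2) -> str:
    draft = call_llm(f"Complete this task:\n{task}")
    for _ in range(rounds):
        critique = call_llm(
            f"Critique this answer to the task '{task}'. List concrete flaws:\n{draft}"
        )
        draft = call_llm(
            f"Task: {task}\nDraft:\n{draft}\nCritique:\n{critique}\n"
            "Rewrite the draft to fix the flaws."
        )
    return draft
```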
Andrew Ng
Sep 30, 6:19 PM
Announcing a significant upgrade to Agentic Document Extraction! LandingAI's new DPT (Document Pre-trained Transformer) accurately extracts even from complex docs, for example from large, complex tables, which is important for many finance and healthcare applications. And a new SDK lets you use it with only 3 simple lines of code. Please see the video for technical details. I hope this unlocks a lot of value from the "dark data" currently stuck in PDF files, and that you'll build something cool with this!
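If I recall the SDK correctly, usage looks roughly like the sketch below; treat the package, function, and attribute names as assumptions and check LandingAI's docs for the real interface.

```python
# Hedged sketch of the SDK's 3-line usage; all names are assumptions.
from agentic_doc.parse import parse

results = parse("financial_report.pdf")  # hypothetical input file
print(results[0].markdown)               # extracted content, including tables
```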
Andrew Ng
Sep 25, 8:33 PM
Last week, China barred its major tech companies from buying Nvidia chips. This move received only modest attention in the media, but has implications beyond what’s widely appreciated. Specifically, it signals that China has progressed sufficiently in semiconductors to break away from dependence on advanced chips designed in the U.S., the vast majority of which are manufactured in Taiwan. It also highlights the U.S. vulnerability to possible disruptions in Taiwan at a moment when China is becoming less vulnerable.

After the U.S. started restricting AI chip sales to China, China dramatically ramped up its semiconductor research and investment to move toward self-sufficiency. These efforts are starting to bear fruit, and China’s willingness to cut off Nvidia is a strong sign of its faith in its domestic capabilities. For example, the new DeepSeek-R1-Safe model was trained on 1000 Huawei Ascend chips. While individual Ascend chips are significantly less powerful than individual Nvidia or AMD chips, Huawei’s system-level design approach to orchestrating how a much larger number of chips work together seems to be paying off. For example, Huawei’s CloudMatrix 384 system of 384 chips aims to compete with Nvidia’s GB200, which uses 72 higher-capability chips.

Today, U.S. access to advanced semiconductors is heavily dependent on Taiwan’s TSMC, which manufactures the vast majority of the most advanced chips. Unfortunately, U.S. efforts to ramp up domestic semiconductor manufacturing have been slow. I am encouraged that one fab at the TSMC Arizona facility is now operating, but issues of workforce training, culture, licensing and permitting, and the supply chain are still being addressed, and there is still a long road ahead for the U.S. facility to be a viable substitute for manufacturing in Taiwan.

If China gains independence from Taiwan manufacturing significantly faster than the U.S., this would leave the U.S. much more vulnerable to possible disruptions in Taiwan, whether through natural disasters or man-made events. If manufacturing in Taiwan is disrupted for any reason and Chinese companies end up accounting for a large fraction of global semiconductor manufacturing capabilities, that would also help China gain tremendous geopolitical influence.

Despite occasional moments of heightened tensions and large-scale military exercises, Taiwan has been mostly peaceful since the 1960s. This peace has helped the people of Taiwan to prosper and allowed AI to make tremendous advances, built on top of chips made by TSMC. I hope we will find a path to maintaining peace for many decades more.

But hope is not a plan. In addition to working to ensure peace, practical work lies ahead to multi-source, build more chip fabs in more nations, and enhance the resilience of the semiconductor supply chain. Dependence on any single manufacturer invites shortages, price spikes, and stalled innovation the moment something goes sideways. [Original text: https://t.co/5bdEpQcaob ]
Andrew Ng
Sep 24, 5:15 PM
When data agents fail, they often fail silently, giving confident-sounding answers that are wrong, and it can be hard to figure out what caused the failure. "Building and Evaluating Data Agents" is a new short course created with @Snowflake and taught by @datta_cs and @_jreini that teaches you to build data agents with comprehensive evaluation built in.

Skills you'll gain:
- Build reliable LLM data agents using the Goal-Plan-Action framework and runtime evaluations that catch failures mid-execution
- Use OpenTelemetry tracing and evaluation infrastructure to diagnose exactly where agents fail and systematically improve performance
- Orchestrate multi-step workflows across web search, SQL, and document retrieval in LangGraph-based agents

The result: visibility into every step of your agent's reasoning, so if something breaks, you have a systematic approach to fix it. Sign up to get started: https://t.co/jGQQcU6X46
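The core idea of runtime evaluation, catching a bad step before the agent barrels on, can be sketched in a few lines of plain Python. This is my own illustration of the pattern, not the course's framework; call_llm and the prompts are placeholders.

```python
# Sketch of a Goal-Plan-Action-style loop with a runtime check after each
# action. call_llm is a placeholder for your LLM provider.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("replace with a call to your LLM provider")

def run_with_runtime_evals(goal: str, plan: list[str]) -> list[str]:
    results = []
    for step in plan:
        output = call_llm(f"Goal: {goal}\nExecute this step: {step}")
        verdict = call_llm(
            f"Goal: {goal}\nStep: {step}\nOutput: {output}\n"
            "Does the output correctly advance the goal? Answer PASS or FAIL, with a reason."
        )
        if verdict.strip().upper().startswith("FAIL"):
            # Retry once with the failure reason in context, rather than
            # silently carrying a bad intermediate result forward.
            output = call_llm(f"Redo step '{step}'. The last attempt failed: {verdict}")
        results.append(output)
    return results
```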
Andrew Ng
Sep 21, 9:10 PM
My heart goes out to all the families and individuals anxious over their futures following the abrupt and chaotic announcement of H-1B visa changes. America should be working to attract more skilled talent, not create uncertainty that turns them away. To all legal immigrants and H-1B holders: I support and appreciate you.
Andrew Ng
Sep 18, 4:13 PM
Automated software testing is growing in importance in the era of AI-assisted coding. Agentic coding systems accelerate development but are also unreliable. Agentic testing — where you ask AI to write tests and check your code against them — is helping. Automatically testing infrastructure software components that you intend to build on top of is especially helpful and results in more stable infrastructure and less downstream debugging.

Software testing methodologies such as Test Driven Development (TDD), a test-intensive approach that involves first writing rigorous tests for correctness and only then making progress by writing code that passes those tests, are an important way to find bugs. But it can be a lot of work to write tests. (I personally never adopted TDD for that reason.) Because AI is quite good at writing tests, agentic testing enjoys growing attention.

First, coding agents do misbehave! My teams use them a lot, and we have seen:
- Numerous bugs introduced by coding agents, including subtle infrastructure bugs that take humans weeks to find.
- A security loophole that was introduced into our production system when a coding agent made password resets easier to simplify development.
- Reward hacking, where a coding agent modified test code to make it easier to pass the tests.
- An agent running "rm *.py" in the working directory, leading to deletion of all of a project's code (which, fortunately, was backed up on github).

In the last example, when pressed, the agent apologized and agreed “that was an incredibly stupid mistake.” This made us feel better, but the damage had already been done!

I love coding agents despite such mistakes and see them making us dramatically more productive. To make them more reliable, I’ve found that prioritizing where to test helps. I rarely write (or direct an agent to write) extensive tests for front-end code. If there's a bug, hopefully it will be easy to see and also cause little lasting damage. For example, I find generated code’s front-end bugs, say in the display of information on a web page, relatively easy to find. When the front end of a web site looks wrong, you’ll see it immediately, and you can tell the agent and have it iterate to fix it. (A more advanced technique: Use MCP to let the agent integrate with software like Playwright to automatically take screenshots, so it can autonomously see if something is wrong and debug.)

In contrast, back-end bugs are harder to find. I’ve seen subtle infrastructure bugs — for example, one that led to a corrupted database record only in certain corner cases — that took a long time to find. Putting in place rigorous tests for your infrastructure code might help spot these problems earlier and save you many hours of challenging debugging.

Bugs in software components that you intend to build on top of lead to downstream bugs that can be hard to find. Further, bugs in a component that’s deep in a software stack — and that you build multiple abstraction layers on top of — might surface only weeks or months later, long after you’ve forgotten what you were doing while building this specific component, and be really hard to identify and fix. This is why testing components deep in your software stack is especially important.

Meta’s mantra “Move fast with stable infrastructure” (which replaced “move fast and break things”) still applies today. Agentic testing can help you make sure you have good infrastructure for you and others to build on!
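As one concrete illustration of the kind of rigorous infrastructure test worth prioritizing, here is a sketch of a corner-case round-trip test you might direct an agent to write. The mystore module and its functions are hypothetical stand-ins for a storage component deep in your stack.

```python
# Illustrative pytest for infrastructure code: check that records survive a
# save/load round trip across corner cases that often corrupt data.
# mystore and its functions are hypothetical placeholders.
import pytest
from mystore import save_record, load_record

@pytest.mark.parametrize("payload", [
    "",                       # empty value
    "a" * 10_000,             # oversized value
    "naïve UTF-8 ✓",          # non-ASCII text
    'quote " and \\ escape',  # characters that often break serialization
])
def test_record_round_trip(payload):
    record_id = save_record({"body": payload})
    assert load_record(record_id)["body"] == payload  # must come back unchanged
```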
At AI Fund and https://t.co/zpIxRSuky4’s recent Buildathon, we held a panel discussion with experts in agentic coding (Michele Catasta, President at Replit; Chao Peng, Principal Research Scientist at Trae; and Paxton Maeder-York, Venture Partnerships at Anthropic; moderated by AI Fund’s Eli Chen), where the speakers shared best practices. Testing was one of the topics discussed. That panel was one of my highlights of Buildathon, and you can watch the video on YouTube. [Original text: https://t.co/B1sQ5oDnCU ]
Andrew Ng
Sep 17, 4:37 PM
New short course: Build AI Apps with MCP Servers: Working with Box Files, built with @Box and taught by @BenAtBox, their CTO.

Many AI applications require custom code for basic file operations. The Model Context Protocol (MCP) standardizes this by letting you offload file tasks to dedicated servers that provide tools an LLM can use directly. In this course, you'll process documents stored in a Box folder using the Box MCP server. Rather than writing custom integration code to connect to the Box API and download files, you'll design your application to use the tools provided via MCP.

Skills you'll gain:
- Build an LLM-powered document processing app, using the Box MCP server to access files
- Design a multi-agent system using Google's Agent Development Kit (ADK), consisting of specialized agents for file operations
- Coordinate the multi-agent workflow through an orchestrator that uses the Agent2Agent (A2A) protocol to connect to the agents

You'll start with a local file-processing app, refactor it to work with Box's MCP server, then evolve it into a multi-agent system. Sign up here: https://t.co/FitKgvGnpb
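To show the general shape of consuming tools from an MCP server, here is a minimal sketch using the official mcp Python SDK. The launch command for the Box server is an assumption for illustration; the course's actual setup may differ.

```python
# Sketch: connect to an MCP server over stdio and list the tools it exposes.
# The server launch command below is a hypothetical placeholder.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    params = StdioServerParameters(command="npx", args=["box-mcp-server"])  # assumed command
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            # An LLM app would pass these tool schemas to the model, which
            # then decides which file operations to invoke.
            print([tool.name for tool in tools.tools])

asyncio.run(main())
```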
Andrew Ng
Sep 4, 3:54 PM
There is significant unmet demand for developers who understand AI. At the same time, because most universities have not yet adapted their curricula to the new reality of programming jobs being much more productive with AI tools, there is also an uptick in unemployment of recent CS graduates.

When I interview AI engineers — people skilled at building AI applications — I look for people who can:
- Use AI assistance to rapidly engineer software systems
- Use AI building blocks like prompting, RAG, evals, agentic workflows, and machine learning to build applications
- Prototype and iterate rapidly

Someone with these skills can get a massively greater amount done than someone who writes code the way we did in 2022, before the advent of Generative AI. I talk to large businesses every week that would love to hire hundreds or more people with these skills, as well as startups that have great ideas but not enough engineers to build them. As more businesses adopt AI, I expect this talent shortage only to grow! At the same time, recent CS graduates face an increased unemployment rate, though the underemployment rate — of graduates doing work that doesn’t require a degree — is still lower than for most other majors. This is why we hear simultaneously anecdotes of unemployed CS graduates and also of rising salaries for in-demand AI engineers.

When programming evolved from punch cards to keyboard and terminal, employers continued to hire punch-card programmers for a while. But eventually, all developers had to switch to the new way of coding. AI engineering is similarly creating a huge wave of change.

There is a stereotype of “AI Native” fresh college graduates who outperform experienced developers. There is some truth to this. Multiple times, I have hired, for full-stack software engineering, a new grad who really knows AI over an experienced developer who still works 2022-style. But the best developers I know aren’t recent graduates (no offense to the fresh grads!). They are experienced developers who have been on top of changes in AI. The most productive programmers today deeply understand computers, how to architect software, and how to make complex tradeoffs — and are additionally familiar with cutting-edge AI tools.

Sure, some skills from 2022 are becoming obsolete. For example, a lot of coding syntax that we had to memorize back then is no longer important, since we no longer need to code by hand as much. But even if, say, 30% of CS knowledge is obsolete, the remaining 70% — complemented with modern AI knowledge — is what makes really productive developers. (Even after punch cards became obsolete, a fundamental understanding of programming was very helpful for typing code into a keyboard.)

Without understanding how computers work, you can’t just “vibe code” your way to greatness. Fundamentals are still important, and for those who additionally understand AI, job opportunities are numerous! [Original text: https://t.co/nqzPC6eUpR ]
Andrew Ng
Aug 28, 5:25 PM
Parallel agents are emerging as an important new direction for scaling up AI. AI capabilities have scaled with more training data, training-time compute, and test-time compute. Having multiple agents run in parallel is growing as a technique to further scale and improve performance.

We know from work at Baidu by my former team, and later OpenAI, that AI models’ performance scales predictably with the amount of data and training computation. Performance rises further with test-time compute, such as in agentic workflows and in reasoning models that think, reflect, and iterate on an answer. But these methods take longer to produce output. Agents working in parallel offer another path to improve results, without making users wait.

Reasoning models generate tokens sequentially and can take a long time to run. Similarly, most agentic workflows are initially implemented in a sequential way. But as LLM prices per token continue to fall — thus making these techniques practical — and product teams want to deliver results to users faster, more and more agentic workflows are being parallelized. Some examples:
- Many research agents now fetch multiple web pages and examine their texts in parallel to try to synthesize deeply thoughtful research reports more quickly.
- Some agentic coding frameworks allow users to orchestrate many agents working simultaneously on different parts of a code base. Our short course on Claude Code shows how to do this using git worktrees.
- A rapidly growing design pattern for agentic workflows is to have a compute-heavy agent work for minutes or longer to accomplish a task, while another agent monitors the first and gives brief updates to the user to keep them informed. From here, it’s a short hop to parallel agents that work in the background while the UI agent keeps users informed and perhaps also routes asynchronous user feedback to the other agents.

It is difficult for a human manager to take a complex task (like building a complex software application) and break it down into smaller tasks for human engineers to work on in parallel; scaling to huge numbers of engineers is especially challenging. Similarly, it is also challenging to decompose tasks for parallel agents to carry out. But the falling cost of LLM inference makes it worthwhile to use a lot more tokens, and using them in parallel allows this to be done without significantly increasing the user’s waiting time.

I am also encouraged by the growing body of research on parallel agents. For example, I enjoyed reading “CodeMonkeys: Scaling Test-Time Compute for Software Engineering” by Ryan Ehrlich and others, which shows how parallel code generation helps you to explore the solution space. The mixture-of-agents architecture by Junlin Wang is a surprisingly simple way to organize parallel agents: Have multiple LLMs come up with different answers, then have an aggregator LLM combine them into the final output.

There remains a lot of research as well as engineering to explore how best to leverage parallel agents, and I believe the number of agents that can work productively in parallel — like the number of humans who can work productively in parallel — will be very high. [Original text, with links: https://t.co/ElcJZyzcfw ]
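The mixture-of-agents idea is simple enough to sketch in a few lines. Here is a minimal, assumption-laden version: call_model stands in for any async chat-completion call, and the prompts are illustrative.

```python
# Minimal mixture-of-agents sketch: query several proposer models in
# parallel, then have an aggregator model synthesize a final answer.
# call_model is a placeholder for your async LLM client.
import asyncio

async def call_model(model: str, prompt: str) -> str:
    raise NotImplementedError("replace with an async call to your LLM provider")

async def mixture_of_agents(prompt: str, proposers: list[str], aggregator: str) -> str:
    # The parallelism: all proposer calls are in flight at once.
    drafts = await asyncio.gather(*(call_model(m, prompt) for m in proposers))
    merged = "\n\n".join(f"Candidate {i + 1}:\n{d}" for i, d in enumerate(drafts))
    return await call_model(
        aggregator,
        f"Question: {prompt}\n\n{merged}\n\nSynthesize the best single answer.",
    )
```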
Andrew Ng
Aug 27, 3:51 PM
Build better RAG by letting a team of agents extract and connect your reference materials into a knowledge graph. Our new short course, “Agentic Knowledge Graph Construction,” taught by @Neo4j Innovation Lead @akollegger, shows you how.

Knowledge graphs are an important way to store information accurately, but they are a lot of work to build manually. In this course, you’ll learn how to build a team of agents that turn data (in this case, product reviews and invoices from suppliers) into structured graphs of entities and relationships for RAG. Learn how agents can automatically handle the time-consuming work of building graphs — extracting entities and relationships (e.g., Product "contains" Assembly, Part "supplied_by" Supplier, Customer review "mentions" Product), deduplicating them, fact-checking them, and committing them to a graph database — so your retrieval system can find the right information to generate accurate output. For example, you can use agents to help trace customer complaints directly to specific suppliers, manufacturing processes, and product hierarchies, thus turning fragmented information into queryable business intelligence.

Skills you’ll gain:
- Build, store, and access knowledge graphs using the Neo4j graph database
- Build multi-agent systems using Google’s Agent Development Kit (ADK)
- Set up a loop of agentic workflows to propose and refine a graph schema through fact-checking
- Connect agent-generated graphs of unstructured and structured data into a unified knowledge graph

This course also digs into why knowledge graphs give more accurate information retrieval than vector search alone, especially for high-stakes applications where precision matters more than fuzzy similarity matching. Sign up here: https://t.co/2txZfYqGZ9
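The "commit to a graph database" step can be sketched concisely with the official Neo4j Python driver. The triples, connection settings, and schema below are made-up placeholders; in the course, agents produce and validate this kind of data automatically.

```python
# Sketch: write agent-extracted (entity, relationship, entity) triples to
# Neo4j. Connection details and the triples themselves are placeholders.
from neo4j import GraphDatabase

triples = [
    ("Widget-A", "contains", "Assembly-7"),
    ("Part-42", "supplied_by", "Acme Co"),
]

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    for head, rel, tail in triples:
        # MERGE makes the write idempotent, so re-running an agent's output
        # doesn't duplicate nodes. Cypher can't parameterize relationship
        # types, so the type is stored as a property here.
        session.run(
            "MERGE (h:Entity {name: $h}) "
            "MERGE (t:Entity {name: $t}) "
            "MERGE (h)-[:REL {type: $r}]->(t)",
            h=head, t=tail, r=rel,
        )
driver.close()
```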
Andrew Ng
Aug 21, 6:20 PM
On Saturday at the Buildathon hosted by AI Fund and https://t.co/zpIxRSuky4, over 100 developers competed to build software products quickly using AI-assisted coding. I was inspired to see developers build functional products in just 1-2 hours. The best practices for rapid engineering are changing quickly along with the tools, and I loved the hallway conversations sharing tips with other developers on using AI to code!

The competitors raced to fulfill product specs like this one (you can see the full list in our github repo; link in reply):

Project: Codebase Time Machine
Description: Navigate any codebase through time, understanding evolution of features and architectural decisions.
Requirements:
- Clone repo and analyze full git history
- Build semantic understanding of code changes over time
- Answer questions like “Why was this pattern introduced?” or “Show me how auth evolved”
- Visualize code ownership and complexity trends
- Link commits to business features/decisions

Teams had 6½ hours to build 5 products. And many of them managed to do exactly that! They created fully functional applications with good UIs and sometimes embellishments.

What excites me most isn’t just what can now be built in a few hours. Rather, it is that, if AI assistance lets us build basic but fully functional products this quickly, then imagine what can now be done in a week, or a month, or six months. If the teams that participated in the Buildathon had this velocity of execution and iterated over multiple cycles of getting customer feedback and using that to improve the product, imagine how quickly it is now possible to build great products. Owning proprietary software has long been a moat for businesses, because it has been hard to write complex software. Now, as AI assistance enables rapid engineering, this moat is weakening.

While many members of the winning teams had computer science backgrounds — which does provide an edge — not all did. Team members who took home prizes included a high school senior, a product manager, and a healthcare entrepreneur who initially posted on Discord that he was “over his skis” as someone who “isn't a coder.” I was thrilled that multiple participants told me they exceeded their own expectations and discovered they can now build faster than they realized. If you haven’t yet pushed yourself to build quickly using agentic coding tools, you, too, might be surprised at what you can do!

At AI Fund and https://t.co/zpIxRSuky4, we pride ourselves on building and iterating quickly. At the Buildathon, I saw many teams execute quickly using a wide range of tools including Claude Code, GPT-5, Replit, Cursor, Windsurf, Trae, and many others. I offer my hearty congratulations to all the winners!
- 1st Place: Milind Pathak, Mukul Pathak, and Sapna Sangmitra (Team Vibe-as-a-Service), a team of three family members. They also received an award for Best Design.
- 2nd Place: David Schuster, Massimiliano Viola, and Manvik Pasula (Team Two Coders and a Finance Guy).
- Solo Participant Award: Ivelina Dimova, who had just flown to San Francisco from Portugal, and who worked on the 5 projects not sequentially, but in parallel!
- Graph Thinking Award: Divya Mahajan, Terresa Pan, and Achin Gupta (Team A-sync).
- Honorable mentions went to finalists Alec Hewitt, Juan Martinez, Mark Watson, and Sophia Tang (Team Secret Agents) and Yuanyuan Pan, Jack Lin, and Xi Huang (Team Can Kids).

To everyone who participated, thank you!
Through events like these, I hope we can all learn from each other, encourage each other, invent new best practices, and spread the word about where agentic coding is taking software engineering. [Original text: https://t.co/wJbQMrnZdL ]
Andrew Ng
Aug 20, 1:55 PM
AI Dev 25 is coming to NYC on November 14! 1,200+ developers will dive into technical topics such as:
- Agentic AI: Multi-agent orchestration, tool use, complex reasoning chains
- Coding with AI: Agentic coding assistants, automated testing, debugging strategies
- Context engineering: Advanced RAG, structured context, memory systems
- Multimodal AI: Vision-language models, audio processing, cross-modal architectures
- Fintech applications: Fraud detection, credit modeling, regulatory compliance

Our Pi Day AI Dev event sold out quickly, so we booked a bigger venue this time. Tickets available here: https://t.co/baLDrB1EPd
Andrew Ng
Aug 18, 4:09 PM
Just as many businesses are transforming to become more capable by using AI, universities are too. I recently visited the UK to receive an honorary doctorate from the University of Exeter’s (@UniofExeter) Faculty of Environment, Science and Economy. The name of this faculty stood out to me as a particularly forward-looking way to organize an academic division. Having Computer Science sit alongside Environmental Science and the Business School creates natural opportunities for collaboration across these fields.

Leveraging AI leads a university to do things differently. Speaking with Vice Chancellor Lisa Roberts, Deputy Vice Chancellor Timothy Quine, and CS Department Head Andrew Howes, I was struck by the university leadership’s pragmatic and enthusiastic embrace of AI. This is not a group whose primary worry is whether students will cheat using AI. This is a group that is thinking about how to create a student body that is empowered through AI, whether by teaching more students to code, helping them use AI tools effectively, or showing them what’s newly possible in their disciplines.

Exeter is a wonderful place to create synergies between AI, environmental science, and business. It hosts 5 of the world’s top 21 most influential climate scientists according to Reuters, and its scholars are major contributors to reports by the UN’s IPCC (Intergovernmental Panel on Climate Change) as well as pioneers in numerous areas of climate research, including geoengineering, which I wrote about previously. Its Centre for Environmental Intelligence, a partnership with the Met Office (the UK’s national weather service), applies AI to massive climate datasets. More work like this is needed to understand climate change and strategies for mitigation and adaptation. Add to this its Business School — named Business School of the Year by Times Higher Education — and you have the ingredients for building applications and pursuing interdisciplinary studies that span technological, environmental, and economic realities.

Having been born in the UK and spent most of my career in Silicon Valley, I find it exciting to see Exeter’s leadership embrace AI with an enthusiasm I more often associate with California. The UK has always punched above its weight in research, and seeing that tradition continue in the AI era is encouraging.

Just as every company is becoming an AI company, every university must become an AI university — not just teaching AI, but using it to advance every field of study. This doesn’t mean abandoning disciplinary expertise. It means maintaining technical excellence while ensuring AI enhances every field. Like almost all other universities and businesses worldwide, Exeter’s AI transformation is just beginning. But the enthusiastic embrace of AI by its leadership will give it momentum. As someone who is proud to be an honorary graduate of the university, I look forward to seeing what comes next! [Original text: https://t.co/Y1PyN17Qzs ]
Andrew Ng
Aug 13, 3:43 PM
Buildathon: The Rapid Engineering Competition livestreams this Saturday, August 16. Top developers will compete to build 5+ products in a single day using AI coding assistants – projects that traditionally took weeks. Watch live as they advance through semifinals and finals, and see how fast software can now be built! Register at https://t.co/3vAkmZDU4V
Andrew Ng
Aug 7, 5:30 PM
Recently, Meta made headlines with unprecedented, massive compensation packages for AI model builders, exceeding $100M (sometimes spread over multiple years). With the company planning to spend $66B-72B this year on capital expenses such as data centers, a meaningful fraction of which will be devoted to AI, it’s not irrational, from a purely financial point of view, to spend a few extra billion dollars on salaries to make sure this hardware is used well.

A typical software-application startup that’s not involved in training foundation models might spend 70-80% of its dollars on salaries, 5-10% on rent, and 10-25% on other operating expenses (cloud hosting, software licenses, marketing, legal/accounting, etc.). But scaling up models is so capital-intensive that salaries are a small fraction of the overall expense. This makes it feasible for businesses in this area to pay their relatively few employees exceptionally well. If you’re spending tens of billions of dollars on GPU hardware, why not spend just a tenth of that on salaries? Even before Meta’s recent offers, salaries of AI model trainers have been high, with many being paid $5-10M/year, although Meta has raised these numbers to new heights.

Meta carries out many activities, including running Facebook, Instagram, WhatsApp, and Oculus. But the Llama/AI-training part of its operations is particularly capital-intensive. Many of Meta’s properties rely on user-generated content (UGC) to attract attention, which is then monetized through advertising. AI is a huge threat and opportunity to such businesses: If AI-generated content (AIGC) substitutes for UGC to capture people's attention to sell ads against, this will transform the social-media landscape. This is why Meta — like TikTok, YouTube, and other social-media properties — is paying close attention to AIGC, and why making significant investments in AI is rational. Further, when Meta hires a key employee, not only does it gain the future work output of that person, but it also potentially gets insight into a competitor’s technology, which also makes its willingness to pay high salaries a rational business move (so long as it does not adversely affect the company’s culture).

The pattern of capital-intensive businesses compensating employees extraordinarily well is not new. For example, Netflix expects to spend a huge $18B this year on content. This makes the salary expense of paying its 14,000 employees a small fraction of the total expense, which allows the company to routinely pay above-market salaries. Its ability to spend this way also shapes a distinctive culture that includes elements of “we’re a sports team, not a family” (which seems to work for Netflix but isn’t right for everyone). In contrast, a labor-intensive manufacturing business like Foxconn, which employs over 1 million people globally, has to be much more price-sensitive in what it pays people.

Even a decade ago, when I led a team that worked to scale up AI, I built spreadsheets that modeled how much of my budget to allocate toward salaries and how much to allocate toward GPUs (using a custom model for how much productive output N employees and M GPUs would lead to, so I could optimize N and M subject to my budget constraint). Since then, the business of scaling up AI has skewed the spending significantly toward GPUs.

I’m happy for the individuals who are getting large pay packages. And regardless of any individual's pay, I’m grateful for the contributions of everyone working in AI.
Everyone in AI deserves a good salary, and while the gaps in compensation are growing, I believe this reflects the broader phenomenon that developers who work in AI, at this moment in history, have an opportunity to make a huge impact and do world-changing work. [Original text: https://t.co/5wQe7foww8 ]
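A toy version of that budget-allocation spreadsheet fits in a few lines of Python. The Cobb-Douglas output model and every number below are made up purely for illustration.

```python
# Toy salaries-vs-GPUs allocation model: assume output scales like
# N^a * M^(1-a) for N employees and M GPUs, then search allocations
# under a fixed budget. All figures here are illustrative, not real.
def best_allocation(budget: float, salary: float, gpu_cost: float, a: float = 0.5):
    best = (0, 0, 0.0)  # (employees, gpus, modeled output)
    for n in range(1, int(budget // salary) + 1):
        m = int((budget - n * salary) // gpu_cost)
        if m < 1:
            break  # no money left for even one GPU
        output = (n ** a) * (m ** (1 - a))
        if output > best[2]:
            best = (n, m, output)
    return best

# e.g., $10M budget, $500k per engineer, $40k per GPU (made-up figures):
print(best_allocation(budget=10_000_000, salary=500_000, gpu_cost=40_000))
```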
Andrew Ng
Aug 6, 2:16 PM
I'm thrilled to announce the definitive course on Claude Code, created with @AnthropicAI and taught by Elie Schoppik @eschoppik. If you want to use highly agentic coding - where AI works autonomously for many minutes or longer, not just completing code snippets - this is it.

Claude Code has been a game-changer for many developers (including me!), but there's real depth to using it well. This comprehensive course covers everything from fundamentals to advanced patterns. After this short course, you'll be able to:
- Orchestrate multiple Claude subagents to work on different parts of your codebase simultaneously
- Tag Claude in GitHub issues and have it autonomously create, review, and merge pull requests
- Transform messy Jupyter notebooks into clean, production-ready dashboards
- Use MCP tools like Playwright so Claude can see what's wrong with your UI and fix it autonomously

Whether you're new to Claude Code or already using it, you'll discover powerful capabilities that can fundamentally change how you build software. I'm very excited about what agentic coding lets everyone now do. Please take this course! https://t.co/HGM8ArDalK
Andrew Ng
Aug 5, 9:04 PM
I'm thrilled @OpenAI has released two open weight models. Thank you to all my friends at OpenAI for this gift! I'm also encouraged that from my quick tests gpt-oss-120b looks strong (though we should still wait for rigorous 3rd party evals).
Andrew Ng
Jul 31, 3:26 PM
There is now a path for China to surpass the U.S. in AI. Even though the U.S. is still ahead, China has tremendous momentum with its vibrant open-weights model ecosystem and aggressive moves in semiconductor design and manufacturing. In the startup world, we know momentum matters: Even if a company is small today, a high rate of growth compounded for a few years quickly becomes an unstoppable force. This is why a small, scrappy team with high growth can threaten even behemoths. While both the U.S. and China are behemoths, China’s hypercompetitive business landscape and rapid diffusion of knowledge give it tremendous momentum. The White House’s AI Action Plan released last week, which explicitly champions open source (among other things), is a very positive step for the U.S., but by itself it won’t be sufficient to sustain the U.S. lead.

Now, AI isn’t a single, monolithic technology, and different countries are ahead in different areas. For example, even before Generative AI, the U.S. had long been ahead in scaled cloud AI implementations, while China has long been ahead in surveillance technology. These translate to different advantages in economic growth as well as both soft and hard power. Even though nontechnical pundits talk about “the race to AGI” as if AGI were a discrete technology to be invented, the reality is that AI technology will progress continuously, and there is no single finish line. If a company or nation declares that it has achieved AGI, I expect that declaration to be less a technology milestone than a marketing milestone.

A slight speed advantage in the Olympic 100m dash translates to a dramatic difference between winning a gold medal versus a silver medal. An advantage in AI prowess translates into a proportionate advantage in economic growth and national power; while the impact won’t be a binary one of either winning or losing everything, these advantages nonetheless matter.

Looking at Artificial Analysis and LMArena leaderboards, the top proprietary models were developed in the U.S., but the top open models come from China. Google’s Gemini 2.5 Pro, OpenAI’s o4, Anthropic’s Claude 4 Opus, and Grok 4 are all strong models. But open alternatives from China such as DeepSeek R1-0528, Kimi K2 (designed for agentic reasoning), Qwen3 variations (including Qwen3-Coder, which is strong at coding) and Zhipu’s GLM 4.5 (whose post-training software was released as open source) are close behind, and many are ahead of Meta’s Llama 4 and Google’s Gemma 3 — the U.S.’ best open-weights offerings.

Because many U.S. companies have taken a secretive approach to developing foundation models — a reasonable business strategy — the leading companies spend huge numbers of dollars to recruit key team members from each other who might know the “secret sauce” that enabled a competitor to develop certain capabilities. So knowledge does circulate, but at high cost and slowly. In contrast, in China’s open AI ecosystem, many advanced foundation model companies undercut each other on pricing, make bold PR announcements, and poach each others’ employees and customers. This Darwinian life-or-death struggle will lead to the demise of many of the existing players, but the intense competition breeds strong companies.

In semiconductors, too, China is making progress. Huawei’s CloudMatrix 384 aims to compete with Nvidia’s GB200 high-performance computing system.
While China has struggled to develop GPUs with a similar capability as Nvidia’s top-of-the-line B200, Huawei is trying to build a competitive system by combining a larger number (384 instead of 72) of lower-capability chips. China’s automotive sector once struggled to compete with U.S. and European internal combustion engine vehicles, but leapfrogged ahead by betting on electric vehicles. It remains to be seen how effective Huawei’s alternative architectures prove to be, but the U.S. export restrictions have given Huawei and other Chinese businesses a strong incentive to invest heavily in developing their own technology. Further, if China were to develop its domestic semiconductor manufacturing capabilities while the U.S. remained reliant on TSMC in Taiwan, then the U.S.’ AI roadmap would be much more vulnerable to a disruption of the Taiwan supply chain (perhaps due to a blockade or, worse, a hot war).

With the rise of electricity, the internet, and other general-purpose technologies, there was room for many nations to benefit, and the benefit to one nation didn’t come at the expense of another. I know of businesses that, many months back, planned for a future in which China dominates open models (indeed, we are there at this moment, although the future depends on our actions). Given the transformative impact of AI, I hope all nations — especially democracies with a strong respect for human rights and the rule of law — will clear roadblocks from AI progress and invest in open science and technology to increase the odds that this technology will support democracy and benefit the greatest possible number of people. [Full text: https://t.co/jn0KNi3gmA ]
Andrew Ng
Jul 23, 2:50 PM
Announcing our new event - Buildathon: The Rapid Engineering Competition. See the video for details, and please apply to participate!
Andrew Ng
Jul 21, 2:51 PM
The invention of modern writing instruments like the typewriter made writing easier, but they also led to the rise of writer’s block, where deciding what to write became the bottleneck. Similarly, the invention of agentic coding assistants has led to a new builder’s block, where the holdup is deciding what to build. I call this the Product Management Bottleneck.

Product management is the art and science of deciding what to build. Because highly agentic coding accelerates the writing of software to a given product specification, deciding what to build is the new bottleneck, especially in early-stage projects. As the teams I work with take advantage of agentic coders, I increasingly value product managers (PMs) who have very high user empathy and can make product decisions quickly, so the speed of product decision-making matches the speed of coding. PMs with high user empathy can make decisions by gut and get them right a lot of the time. As new information comes in, they can keep refining their mental models of what users like or do not like — and thereby refine their gut — and keep making fast decisions of increasing quality.

Many tactics are available to get user feedback and other forms of data that shape our beliefs about users. They include conversations with a handful of users, focus groups, surveys, and A/B tests on scaled products. But to drive progress at GenAI speed, I find that synthesizing all these sources of data in a PM's gut helps us move faster.

Let me illustrate with an example. Recently, my team debated which of 4 features users would prefer. I had my instincts, but none of us were sure, so we surveyed about 1,000 users. The results contradicted my initial beliefs — I was wrong! So what was the right thing to do at this point?
- Option 1: Go by the survey and build what users told us clearly they prefer.
- Option 2: Examine the survey data in detail to see how it changes my beliefs about what users want. That is, refine my mental model of users. Then use my revised mental model to decide what to do.

Even though some would consider Option 1 the “data-driven” way to make decisions, I consider this an inferior approach for most projects. Surveys may be flawed. Further, taking time to run a survey before making a decision results in slow decision-making. In contrast, using Option 2, the survey results give much more generalizable information that can help me shape not just this decision, but many others as well. And it lets me process this one piece of data alongside all the user conversations, surveys, market reports, and observations of user behavior when they’re engaging with our product to form a much fuller view on how to serve users. Ultimately, that mental model drives my product decisions.

Of course, this technique does not always scale. For example, with programmatic online advertising, in which AI might try to optimize the number of clicks on ads shown, an automated system conducts far more experiments in parallel and gathers far more data on what users do and do not click on than could be filtered through a PM's mental model of users. When a system needs to make a huge number of decisions, such as what ads to show (or products to recommend) on a huge number of pages, PM review and human intuition do not scale.
But in products where a team is making a small number of critical decisions such as what key features to prioritize, I find that data — used to help build a good mental model of the user, which is then applied to make decisions very quickly — is still the best way to drive rapid progress and relieve the Product Management Bottleneck. [Original text: https://t.co/1tulDs3k7U ]
Andrew Ng
Jul 16, 3:15 PM
Announcing a new Coursera course: Retrieval Augmented Generation (RAG)

You'll learn to build high-performance, production-ready RAG systems in this hands-on, in-depth course created by https://t.co/zpIxRSuky4 and taught by @ZainHasan6, experienced AI and ML engineer, researcher, and educator.

RAG is a critical component today of many LLM-based applications in customer support, internal company Q&A systems, even many of the leading chatbots that use web search to answer your questions. This course teaches you in-depth how to make RAG work well.

LLMs can produce generic or outdated responses, especially when asked specialized questions not covered in their training data. RAG is the most widely used technique for addressing this. It brings in data from new data sources, such as internal documents or recent news, to give the LLM relevant context from private, recent, or specialized information. This lets it generate more grounded and accurate responses.

In this course, you’ll learn to design and implement every part of a RAG system, from retrievers to vector databases to generation to evals. You’ll learn about the fundamental principles behind RAG and how to optimize it at both the component and whole-system levels.

As AI evolves, RAG is evolving too. New models can handle longer context windows, reason more effectively, and can be parts of complex agentic workflows. One exciting growth area is Agentic RAG, in which an AI agent at runtime (rather than it being hardcoded at development time) autonomously decides what data to retrieve, and when/how to go deeper. Even with this evolution, access to high-quality data at runtime is essential, which is why RAG is a key part of so many applications.

You'll learn via hands-on experiences to:
- Build a RAG system with retrieval and prompt augmentation
- Compare retrieval methods like BM25, semantic search, and Reciprocal Rank Fusion
- Chunk, index, and retrieve documents using a Weaviate vector database and a news dataset
- Develop a chatbot, using open-source LLMs hosted by Together AI, for a fictional store that answers product and FAQ questions
- Use evals to drive improvements in reliability, and incorporate multi-modal data

RAG is an important foundational technique. Become good at it through this course! Please sign up here: https://t.co/81DSVlDEOW
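Of the retrieval methods mentioned, Reciprocal Rank Fusion is simple enough to show in full. The standard formula scores each document as the sum over rankers of 1/(k + rank); this is a generic sketch of mine, not the course's code.

```python
# Reciprocal Rank Fusion: combine ranked lists (e.g., from BM25 and
# semantic search) into one ranking. score(d) = sum_i 1 / (k + rank_i(d)).
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Example: fuse a BM25 ranking with a semantic-search ranking.
bm25 = ["doc_a", "doc_b", "doc_c"]
semantic = ["doc_b", "doc_c", "doc_a"]
print(reciprocal_rank_fusion([bm25, semantic]))  # doc_b ranks first here
```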