The Computer Architecture of AI (in 2024)
Over the last year, as someone with a hardware background, I have heard a lot of complaints about Nvidia’s dominance of the machine learning market, along with questions about whether I could build chips to improve the situation. While I would expect it to cost less than $7 trillion, hardware-accelerating this wave of AI will be a very tough problem, much tougher than the last wave focused on CNNs, and there is a good reason that Nvidia has become the leader in this field with few competitors. Whereas CNN inference was primarily a math problem, large language model inference has become a computer architecture problem: figuring out how to coordinate memory and I/O with compute to get the best performance out of the system.