Column This time last year the latest trend in computing became impossible to ignore: huge slabs of silicon with hundreds of billions of transistors – the inevitable consequence of another set of workarounds that kept Moore’s Law from oblivion.
But slumping PC sales suggest we don’t need these monster computers – and not just because of a sales shadow cast by COVID.
In the first half of 2022, corporate computing looked pretty much the same as it had for the last decade: basic office apps, team communication apps, and, for the creative class, a few rich media tools. Sure, gamers would always find a way to put those transistors to work, but the vast majority of hardware was already overpowered and underworked. Why waste transistors on solved problems?
Then the world changed. A year ago, OpenAI launched DALL-E, the first of the widely available generative AI tools – a “diffuser” that converts noise, a text prompt, and a massive database of weightings into images. It seemed almost like magic. Not long after, Midjourney offered much the same – though tuned to a decidedly ’70s Prog Rock album cover aesthetic. It seemed as though demand for cloud computing would skyrocket as these tools found their way into products from Microsoft, Canva, Adobe and others.
Then the world changed again. In August, Stability AI introduced an open source database of diffuser weightings. At its start, Stable Diffusion demanded a state-of-the-art GPU, but the open source community soon found it could optimize the diffuser to run on, well, pretty much anything. It wouldn’t necessarily be fast, but it would work – and it would scale up with your hardware.
Instead of demanding massive cloud resources, these newer AI tools run locally. And if you purchased a monster computer they’d run at least as speedily as anything on offer from OpenAI or Midjourney – without a subscription.
The ever-excitable open source community driving Stable Diffusion created an impressive series of new diffuser weightings, each targeting a specific aesthetic. Stable Diffusion isn’t merely as fast as anything offered by a commercial AI firm – it’s both more useful and more extensible.
And then – yes, you guessed it – the world changed again. At the start of December, OpenAI’s ChatGPT completely rewrote our expectations for artificial intelligence, becoming the fastest web app to reach 100 million users. It is a large language model (LLM) powered by a “generative pre-trained transformer” – how many of us have forgotten that’s what GPT stands for? – that trained its weightings on the vast troves of text available on the internet.
That training effort is estimated to have cost millions (possibly tens of millions) in Azure cloud computing resources. That cost of entry had been expected to be enough to keep competitors at bay – except perhaps for Google and Meta.
Until, yet again, the world changed. In March, Meta released LLaMA – a much more compact and efficient language model, with a comparatively tiny database of weightings, yet with response quality approaching OpenAI’s GPT-4.
With a model of only thirty billion parameters, LLaMA can comfortably sit in a PC with 32GB of RAM. Something very like ChatGPT – which runs on the Azure Cloud because of its massive database of weightings – can be run pretty much anywhere.
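For the curious, here is a back-of-the-envelope check on that 32GB figure, offered as a rough sketch rather than anything Meta has published. The column does not say how the weightings are stored, so the bit widths below are assumptions, and `weights_gib` is a hypothetical helper that counts only the weightings themselves, ignoring the extra working memory a model needs while it runs.

```python
def weights_gib(params: float, bits_per_weight: int) -> float:
    """Approximate memory needed to hold the weightings alone, in GiB."""
    total_bytes = params * bits_per_weight / 8
    return total_bytes / 2**30


PARAMS = 30e9  # the "thirty billion parameters" quoted above

# Assumed storage precisions; the column does not specify one.
for bits in (16, 8, 4):
    print(f"{bits}-bit weightings: {weights_gib(PARAMS, bits):.1f} GiB")
```

Under these assumptions, 16-bit weightings come to roughly 56 GiB and would not fit in 32GB of RAM, 8-bit weightings come to roughly 28 GiB and would be a squeeze, and 4-bit weightings come to roughly 14 GiB. So "comfortably" only holds if the weightings are stored at reduced precision.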
Meta’s researchers offered their weightings to their academic peers, free to download. As LLaMA could run on their lab computers, researchers at Stanford immediately improved LLaMA through their new training technique called Alpaca-Lora, which cut the cost of training an existing set of weightings from hundreds of thousands of dollars down to a few hundred dollars. They shared their code, too.
Just as DALL-E lost out to Stable Diffusion for usability and extensibility, ChatGPT looks to be losing another race, as researchers produce a range of models – such as Alpaca, Vicuña, Koala, and a menagerie of others – that train and re-train quickly and inexpensively.
They’re improving far more rapidly than anyone expected. In part that’s because they’re training on many ChatGPT “conversations” that have been shared across sites like Reddit, and they can run well on most PCs. If you have a monster computer they run very well indeed.
The machines for which we couldn’t dream up a use just a year ago have found their purpose: they’re becoming the workhorses of all our generative AI tasks. They help us code, plan, write, draw, model, and much else besides.
And we won’t be beholden to subscriptions to make these new tools work. It looks as though open source has already outpaced commercial development of both diffusers and transformers.
Open source AI has also reminded us of why the PC proliferated: by making it possible to bring home tools that were once only available in the office.
This won’t close the door to commerce. If anything, it means that there’s more scope for entrepreneurs to create new products, without worrying about whether they infringe on the business models underlying Google, Microsoft, Meta or anyone else. We’re headed into a time of pervasive disruption in technology – and size doesn’t seem to confer many advantages.
The monsters are on the loose. I reckon that’s a good thing. ®