Machine learning (ML) is an inherently disruptive technology because the algorithm architectures are evolving so fast and are very compute intensive, requiring innovative silicon for acceptable performance. This blog looks at where we’ve been and where ML is going – into another market ready for disruption.

ML started in the data center

In the early days of the ML explosion – a mere 8 or 9 years ago – all the action in the world of ML was in the data center. Data scientists continuously discovered new network architectures and trained ever larger workloads in the cloud. At first, they used the available generic cloud computing CPU nodes, but rapidly moved to higher performance Graphics Processing Unit (GPU) cards for ML training.

Shortly afterwards a new breed of ML-specific silicon both for training and inference, such as Google’s Tensor Processing Unit (TPU), started to appear in the data center. In total, over a span of a few short years, the emergence of ML dramatically reshaped both the architecture of silicon for data centers and the roster of silicon providers in the space.

ML appears in mobile phones

Machine learning did not stay trapped in the cloud for long, however. For a host of reasons, ML workloads migrated into a variety of devices and endpoints. The most prominent example has been the mobile phone. The dynamics of the mobile phone industry are such that the major silicon players spin new versions of their flagship platforms every year. Benchmark performance of CPUs and GPUs was, for many years, the litmus test of performance leadership, especially GPU performance on game rendering. But then ML algorithms started to appear more frequently in mobile phones in 2014-15, mostly in computational photography algorithms that improve cell phone photos and videos.

In 2017, the first dedicated ML accelerator appeared in an Apple iPhone in the A11 SoC. In 2018, a second-generation neural processing unit (NPU) appeared in the iPhone platform, along with the software APIs needed to open that NPU to the large application developer community. The response was overwhelmingly positive, and the race was on among mobile phone SoC developers to push ever more ML performance into phone handsets.

Fast forward a mere four years to the fall 2022 unveiling of the most recent A-series mobile phone processor, and the now 17 TOPS NPU appeared in die photos to consume more silicon area than the entire GPU subsystem. In that short five-year window, the driving forces of mobile phone SoC design had been completely upended by the sudden emergence of machine learning.

What’s the next big semiconductor segment for ML?

First the data center silicon market changed. Then the mobile phone SoC. Several other small niche silicon markets have also been radically reshaped by the performance demands of machine learning. But what is the next big semiconductor segment that will feel the impact of ML?

Chances are very good that you are reading this blog right now on a laptop or desktop PC. 91% of the worldwide PCs shipped in 2021 were powered by x86 processors, none of which have dedicated NPUs like the phone SoCs mentioned previously. Two and three decades ago, users and buyers of desktop and laptop PCs paid exquisite attention to the feeds and speeds of the silicon powering a new machine. Those of you old enough to remember the GHz wars will recall the ever-escalating processor speeds advertised on websites, in stores, and in commercials.

But that was an entire generation of humans ago. If you are reading this today on a company-issued work laptop, chances are that if it suddenly died tomorrow you would neither care nor pay any attention to the specifications of the main processor in the new laptop that the corporate IT department delivered to you. You would simply be thrilled that a new machine showed up quickly, with everything running seamlessly because all of your settings and files are backed up in the cloud. If your IT team was top-notch, you could go to lunch and return to a new machine that simply worked, and never even look at the processor specs.

When was the last time you carefully read the specs for the number of cores, GHz, and storage in your work laptop? If you are an average business or managerial user, chances are it’s been a decade or more since you paid attention to anything beyond storage capacity and battery life. These platforms are ubiquitous and necessary productivity tools, but the underlying silicon has been quite boring. That is very likely about to change.

Generative ML models – such as Stable Diffusion 2.0 and DALL-E – are poised to radically shake up the established, boring, ubiquitous platforms powering desktops, laptops, and most tablets. This new generation of generative machine learning for image creation and enhancement has exploded in popularity within the past nine months. These new algorithms can either create entirely new images based on text input descriptions or modify existing images (add, delete, or blend objects). Fanciful, implausible ideas such as “astronaut on a horse,” or “Cal Bears football team wins the national championship trophy,” can be easily created by novices without artistic skills or Photoshop experience.

(Image from Wikimedia Commons, created by Stable Diffusion 1.0)

Today these tools are almost within reach of the everyday average business user, but they require significantly more compute than what is available in the standard $2000 business laptop. For example, Stable Diffusion 2.0 running on today’s top-of-the-line $1500 desktop GPU add-in card in a high-performance PC on a 512×512 image with a Sample Step of 200 has a runtime of 2 minutes. That same workload takes more than 30 minutes on a standard-issue “business laptop.” A typical user won’t have the patience to wait 30 minutes, let alone iterate several times. If modifying a single still image consumes an hour or more of the workday, then editing a short video is impractical! That user could access the tools online through various cloud compute resources, but that can be expensive if used frequently. [OpenAI – the ChatGPT creators – are in the news this month raising expansion funding at a $30B enterprise value. Investors are bidding up the company value in expectation of continued meteoric growth in usage and revenue!]
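To make the arithmetic behind that impatience concrete, here is a minimal back-of-the-envelope sketch in Python. It uses only the two runtimes quoted above (2 minutes on the desktop GPU card, and 30 minutes as a lower bound for the business laptop, since the text says "more than 30"). The function names and the choice of three iterations are hypothetical, picked purely for illustration.

```python
# Runtimes quoted in the text, in minutes per 512x512 image at 200 sample steps.
GPU_MINUTES_PER_IMAGE = 2      # top-of-the-line desktop GPU add-in card
LAPTOP_MINUTES_PER_IMAGE = 30  # "more than 30 minutes", so treat as a lower bound


def total_minutes(minutes_per_image: float, iterations: int) -> float:
    """Wall-clock minutes spent if the user regenerates the image `iterations` times."""
    return minutes_per_image * iterations


def slowdown(slow_minutes: float, fast_minutes: float) -> float:
    """How many times longer the slow platform takes than the fast one."""
    return slow_minutes / fast_minutes


if __name__ == "__main__":
    iterations = 3  # hypothetical: a user refining a prompt three times
    print("Desktop GPU:", total_minutes(GPU_MINUTES_PER_IMAGE, iterations), "minutes")
    print("Business laptop: at least",
          total_minutes(LAPTOP_MINUTES_PER_IMAGE, iterations), "minutes")
    print("Laptop slowdown: at least",
          slowdown(LAPTOP_MINUTES_PER_IMAGE, GPU_MINUTES_PER_IMAGE), "x")
```

Under these assumptions, three refinements of a single still image cost at least an hour and a half on the laptop versus six minutes on the GPU card, a gap of at least 15x.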

We predict that it will be but a blink in time before these generative image – and video – enhancement capabilities make their way into the common everyday tools used by hundreds of millions of business and managerial workers. As a business manager and marketing content creator, I can attest to the countless times when I wanted to communicate something in slides that could best be implemented by an animation, image, or image/video sequence, but the project wasn’t big enough to warrant the expense and delay of going to an outside creative agency to create the ideal artwork. If the trusty and ubiquitous PowerPoint had a series of generative image/video tools built in, the number of occasions in which I would attempt to create novel imagery would skyrocket. Or think of the myriad ways in which technical documentation (block diagrams, flow charts) could be enhanced to communicate more information as animated sequences. Today’s word processing tools already have significant predictive text capability, and published reports in early January suggest that Microsoft is considering integrating OpenAI’s ChatGPT into the Office suite of tools to massively expand the autogenerative writing capability of those products.

The opportunities are numerous for ML to enrich the standard business software toolset, but today’s silicon platforms are grossly lacking in the ML inferencing compute horsepower needed to run such tooling in a time-effective manner on a laptop while sitting in the aisle seat of an aircraft at 30,000 ft. If the experience from the mobile phone semiconductor market is an appropriate guide, then we might expect to see the rapid emergence of heavy-duty, highly programmable machine learning compute power in silicon designed for the business, consumer, and educational laptop markets. Indeed, the start of the new year saw announcements from AMD of next-generation data center silicon that integrates AI acceleration with x86 compute, along with promises that the merged AI-centric architecture will migrate down into the laptop later in 2023. Similarly, Intel’s next Meteor Lake platform is widely rumored to include ML acceleration in the chipset.

Much like the rapid flowering of a variety of ML design approaches witnessed in the mobile phone market, one might expect that the SoC architectures for PCs, laptops, and tablets of all shapes and performance levels will soon experience a renaissance of design variety that changes the segment from “boring” to “very exciting” in the blink of an eye.

Steve Roddy

Steve Roddy is the chief marketing officer at Quadric.io. Previously, he was vice president of the Machine Learning Group at Arm, and before that he served as vice president for IP licensing businesses at Tensilica (acquired by Cadence), and Amphion Semiconductor. He also held product management roles at Synopsys, LSI Logic, and AMCC.

Source: https://semiengineering.com/the-next-disruption/