
Amazon just massively expanded the availability of its cloud AI ecosystem

  • Mike Pearl
5/5/2022

Too small to afford a dedicated GPU to run your site’s AI? Go Serverless.

Amazon is expanding a powerful new tool for businesses and institutions that want to deploy machine learning — just not all the time.

At AWS’s Global Summit, Amazon announced an expansion of Serverless Inference for its SageMaker AI ecosystem. The option, previously available to users in Oregon and six other regions worldwide, is now officially available everywhere.

Launched in 2017, SageMaker was designed as an all-in-one environment within Amazon Web Services where AI engineers and data scientists can prepare data, then build, train, and deploy machine learning models. Per Amazon, the serverless option "enables you to easily deploy machine learning models for inference without having to configure or manage the underlying infrastructure." It is, according to the company, "ideal for applications with intermittent or unpredictable traffic."

As with all of the Seattle-based tech giant’s cloud services, the idea at the outset was to make complex infrastructure available to many, many more people and organizations. Serverless Inference could be yet another way to further that same expansion.

"SIs [serverless inference endpoints] are a very core component of democratizing machine learning and making it accessible across all our customers. They are a very important part of our go-to-market motion," AWS vice president Bratin Saha told Computer Reseller News.

Serverless Inference, first announced in December 2021, relieves SageMaker users whose throughput and latency requirements allow it of the need to attach a machine learning endpoint to an expensive dedicated CPU or GPU instance just to run a model.

Instead, they can select the Serverless option and pay for that costly computing power only while requests are actually being served.
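In practice, the switch comes down to the endpoint configuration: instead of specifying an instance type and count, a production variant carries a `ServerlessConfig` with a memory size and a concurrency cap. The sketch below builds such a request body as it would be passed to SageMaker's `CreateEndpointConfig` API (for example via boto3's `create_endpoint_config`); the model and config names are placeholders, and the specific memory and concurrency values are illustrative assumptions.

```python
# Sketch: a CreateEndpointConfig request body using a serverless variant.
# Parameter names follow the SageMaker API; "my-model" and
# "my-serverless-config" are hypothetical placeholders.

def serverless_endpoint_config(model_name: str, config_name: str) -> dict:
    """Build a CreateEndpointConfig request that uses ServerlessConfig
    rather than pinning the model to a dedicated instance type."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "serverless",
                "ModelName": model_name,
                # No InstanceType / InitialInstanceCount here: capacity is
                # provisioned on demand and billed only while requests run.
                "ServerlessConfig": {
                    "MemorySizeInMB": 2048,  # allowed range: 1024-6144 MB
                    "MaxConcurrency": 5,     # concurrent invocations cap
                },
            }
        ],
    }

request = serverless_endpoint_config("my-model", "my-serverless-config")
print(request["ProductionVariants"][0]["ServerlessConfig"])
```

With a dedicated endpoint, the same variant would instead name an `InstanceType` (say, a GPU instance) that bills around the clock; here the absence of that field is exactly what makes the endpoint serverless.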

For high-volume retailers or other online services that field requests all day long, the Serverless option will feel underpowered: endpoints must "cold start" and scale up when requests arrive, rather than sitting ready at all times. But Serverless Inference could be a godsend for intermittent applications, such as, to use the example from the AWS website, a payroll processing company's chatbot, since users are most likely to have payroll questions at the end of the month.

The SageMaker Serverless Inference pricing page now lists 21 supported regions, up from seven when the preview was first announced in December.