Published: 2019-11-15 | Originally published at AWS Blog
Today, we’re very happy to announce that Amazon Personalize now supports batch recommendations.
Launched at AWS re:Invent 2018, Personalize is a fully-managed service that allows you to create private, customized recommendations for your applications, with little to no machine learning experience required.
With Personalize , you provide the unique signals in your activity data (page views, sign-ups, purchases, and so forth) along with optional customer demographic information (age, location, etc.). You then provide the inventory of the items you want to recommend, such as articles, products, videos, or music: as explained in previous blog posts, you can use both historical data stored in Amazon Simple Storage Service (Amazon S3) and streaming data sent in real-time from a JavaScript tracker or server-side.
Then, entirely under the covers, Personalize will process and examine the data, identify what is meaningful, select the right algorithms, train and optimize a personalization model that is customized for your data, and is accessible via an API that can be easily invoked by your business application.
However, some customers have told us that batch recommendations would be a better fit for their use cases. For example, some of them need the ability to compute recommendations for very large numbers of users or items in one go, store them, and feed them over time to batch-oriented workflows such as sending email or notifications: although you could certainly do this with a real-time recommendation endpoint, batch processing is simply more convenient and more cost-effective.
Let’s do a quick demo.
Introducing Batch Recommendations
For the sake of brevity, I’ll reuse the movie recommendation
solution
trained in this
post
on the
MovieLens
data set. Here, instead of deploying a real-time
campaign
based on this solution, we’re going to create a batch recommendation job.
First, let’s define users for whom we’d like to recommend movies. I simply list their user ids in a JSON file that I store in an S3 bucket.
{"userId": "123"}
{"userId": "456"}
{"userId": "789"}
{"userId": "321"}
{"userId": "654"}
{"userId": "987"}
Then, I apply a bucket policy to that bucket, so that Personalize may read and write objects in it. I’m using the AWS console here, and you can do the same thing programmatically with the PutBucketAcl API.
Now let’s head out to the Personalize console, and create a batch inference job.
As you would expect, I need to give the job a name, and select an AWS Identity and Access Management (IAM) role for Personalize in order to allow access to my S3 bucket. The bucket policy was taken care of already.
Then, I select the solution that I want to use to recommend movies.
Finally, I define the location of input and output data, with optional AWS Key Management Service (AWS KMS) keys for decryption and encryption.
After a little while, the job is complete, and I can fetch recommendations from my bucket.
$ aws s3 cp s3://jsimon-personalize-euwest-1/batch/output/batch/users.json.out -
{"input":{"userId":"123"}, "output": {"recommendedItems": ["137", "285", "14", "283", "124", "13", "508", "276", "275", "475", "515", "237", "246", "117", "19", "9", "25", "93", "181", "100", "10", "7", "273", "1", "150"]}}
{"input":{"userId":"456"}, "output": {"recommendedItems": ["272", "333", "286", "271", "268", "313", "340", "751", "332", "750", "347", "316", "300", "294", "690", "331", "307", "288", "304", "302", "245", "326", "315", "346", "305"]}}
{"input":{"userId":"789"}, "output": {"recommendedItems": ["275", "14", "13", "93", "1", "117", "7", "246", "508", "9", "248", "276", "137", "151", "150", "111", "124", "237", "744", "475", "24", "283", "20", "273", "25"]}}
{"input":{"userId":"321"}, "output": {"recommendedItems": ["86", "197", "180", "603", "170", "427", "191", "462", "494", "175", "61", "198", "238", "45", "507", "203", "357", "661", "30", "428", "132", "135", "479", "657", "530"]}}
{"input":{"userId":"654"}, "output": {"recommendedItems": ["272", "270", "268", "340", "210", "313", "216", "302", "182", "318", "168", "174", "751", "234", "750", "183", "271", "79", "603", "204", "12", "98", "333", "202", "902"]}}
{"input":{"userId":"987"}, "output": {"recommendedItems": ["286", "302", "313", "294", "300", "268", "269", "288", "315", "333", "272", "242", "258", "347", "690", "310", "100", "340", "50", "292", "327", "332", "751", "319", "181"]}}
In a real-life scenario, I would then feed these recommendations to downstream applications for further processing. Of course, instead of using the console, I would create and manage jobs programmatically with the CreateBatchInferenceJob , DescribeBatchInferenceJob , and ListBatchInferenceJobs APIs.
Now Available!
Using batch recommendations with
Amazon Personalize
is an easy and cost-effective way to add personalization to your applications. You can start using this feature today in all regions where
Personalize
is available.
Please send us feedback, either on the AWS forum for Amazon Personalize , or through your usual AWS support contacts.
— Julien
Julien Simon is the Chief Evangelist at Arcee AI , specializing in Small Language Models and enterprise AI solutions. Recognized as the #1 AI Evangelist globally by AI Magazine in 2021, he brings over 30 years of technology leadership experience to his role.
With 650+ speaking engagements worldwide and 350+ technical blog posts, Julien is a leading voice in practical AI implementation, cost-effective AI solutions, and the democratization of artificial intelligence. His expertise spans open-source AI, Small Language Models, enterprise AI strategy, and edge computing optimization.
Previously serving as Principal Evangelist at AWS and Chief Evangelist at Hugging Face, Julien has authored books on Amazon SageMaker and contributed to the open-source AI ecosystem. His mission is to make AI accessible, understandable, and controllable for everyone.