Now available: Batch Recommendations in Amazon Personalize

Published: 2019-11-15 | Originally published at AWS Blog

Today, we’re very happy to announce that Amazon Personalize now supports batch recommendations.

Launched at AWS re:Invent 2018, Personalize is a fully-managed service that allows you to create private, customized recommendations for your applications, with little to no machine learning experience required.

With Personalize , you provide the unique signals in your activity data (page views, sign-ups, purchases, and so forth) along with optional customer demographic information (age, location, etc.). You then provide the inventory of the items you want to recommend, such as articles, products, videos, or music: as explained in previous blog posts, you can use both historical data stored in Amazon Simple Storage Service (Amazon S3) and streaming data sent in real-time from a JavaScript tracker or server-side.

Then, entirely under the covers, Personalize will process and examine the data, identify what is meaningful, select the right algorithms, train and optimize a personalization model that is customized for your data, and is accessible via an API that can be easily invoked by your business application.

However, some customers have told us that batch recommendations would be a better fit for their use cases. For example, some of them need the ability to compute recommendations for very large numbers of users or items in one go, store them, and feed them over time to batch-oriented workflows such as sending email or notifications: although you could certainly do this with a real-time recommendation endpoint, batch processing is simply more convenient and more cost-effective.

Let’s do a quick demo.

Introducing Batch Recommendations
For the sake of brevity, I’ll reuse the movie recommendation solution trained in this post on the MovieLens data set. Here, instead of deploying a real-time campaign based on this solution, we’re going to create a batch recommendation job.

First, let’s define users for whom we’d like to recommend movies. I simply list their user ids in a JSON file that I store in an S3 bucket.

{"userId": "123"} {"userId": "456"} {"userId": "789"} {"userId": "321"} {"userId": "654"} {"userId": "987"}

Then, I apply a bucket policy to that bucket, so that Personalize may read and write objects in it. I’m using the AWS console here, and you can do the same thing programmatically with the PutBucketAcl API.

Now let’s head out to the Personalize console, and create a batch inference job.

As you would expect, I need to give the job a name, and select an AWS Identity and Access Management (IAM) role for Personalize in order to allow access to my S3 bucket. The bucket policy was taken care of already.

Then, I select the solution that I want to use to recommend movies.

Finally, I define the location of input and output data, with optional AWS Key Management Service (AWS KMS) keys for decryption and encryption.

After a little while, the job is complete, and I can fetch recommendations from my bucket.

$ aws s3 cp s3://jsimon-personalize-euwest-1/batch/output/batch/users.json.out - {"input":{"userId":"123"}, "output": {"recommendedItems": ["137", "285", "14", "283", "124", "13", "508", "276", "275", "475", "515", "237", "246", "117", "19", "9", "25", "93", "181", "100", "10", "7", "273", "1", "150"]}} {"input":{"userId":"456"}, "output": {"recommendedItems": ["272", "333", "286", "271", "268", "313", "340", "751", "332", "750", "347", "316", "300", "294", "690", "331", "307", "288", "304", "302", "245", "326", "315", "346", "305"]}} {"input":{"userId":"789"}, "output": {"recommendedItems": ["275", "14", "13", "93", "1", "117", "7", "246", "508", "9", "248", "276", "137", "151", "150", "111", "124", "237", "744", "475", "24", "283", "20", "273", "25"]}} {"input":{"userId":"321"}, "output": {"recommendedItems": ["86", "197", "180", "603", "170", "427", "191", "462", "494", "175", "61", "198", "238", "45", "507", "203", "357", "661", "30", "428", "132", "135", "479", "657", "530"]}} {"input":{"userId":"654"}, "output": {"recommendedItems": ["272", "270", "268", "340", "210", "313", "216", "302", "182", "318", "168", "174", "751", "234", "750", "183", "271", "79", "603", "204", "12", "98", "333", "202", "902"]}} {"input":{"userId":"987"}, "output": {"recommendedItems": ["286", "302", "313", "294", "300", "268", "269", "288", "315", "333", "272", "242", "258", "347", "690", "310", "100", "340", "50", "292", "327", "332", "751", "319", "181"]}}

In a real-life scenario, I would then feed these recommendations to downstream applications for further processing. Of course, instead of using the console, I would create and manage jobs programmatically with the CreateBatchInferenceJob , DescribeBatchInferenceJob , and ListBatchInferenceJobs APIs.

Now Available!
Using batch recommendations with Amazon Personalize is an easy and cost-effective way to add personalization to your applications. You can start using this feature today in all regions where Personalize is available.

Please send us feedback, either on the AWS forum for Amazon Personalize , or through your usual AWS support contacts.

— Julien