Good afternoon, everybody, and thank you very much for inviting me today. It's a pleasure to be here. My name is Julien, and I'm a tech evangelist with AWS. I've been working with them for about two and a half years, and at the moment, I'm the AI evangelist for EMEA. It's good to be in Paris, even though I get to do this in English again. Maybe one day I'll get to do it in French in Paris.
Why am I standing in front of you today? Why do I think I have a good story to tell? You may have heard of Amazon.com, which started in 1995. Very early on, Jeff Bezos, the founder of Amazon, understood that personalizing and customizing the e-commerce experience was crucial. The point of having a website was to show customers products we thought they would like. This is an actual screenshot of the website in 1995, probably rendered in Netscape, if you remember that. Some of you were born then, yeah? I was too. Over time, we kept working on this, and a really cool research paper came out last year. If you're into recommendation systems or machine learning in general, it covers two decades of recommendations at Amazon.com. It's a great history of the evolution of recommendation, one of the main use cases for machine learning at Amazon. Interestingly, it goes back to 1998, when Amazon filed its first patent for recommendations, a time when many of today's large companies didn't even exist. We were already doing it.
It's not just on the website; literally every part of Amazon now runs on machine learning. You've probably seen those robots in the fulfillment centers. We have tens of thousands of robots that pick up shelves and carry them to people who prepare your orders. They're not remote-controlled, by the way. The Amazon Echo is also based on AI and deep learning, handling natural language processing, text-to-speech, and speech-to-text. The device itself is just a device, but it's connected to the cloud and AWS services that enable it to interact with us using natural language.
You may have seen this one as well; it made the news recently. It's a store in Seattle called Amazon Go. What's special about it? You use your QR code in the mobile app to get in, pick stuff from the shelves, put the products in a bag, and walk out with no cashier or lines. You still have to pay, though. You can try running away, but there's a security guy at the front. Within minutes, you get your bill in the mobile app. It's now open to the general public, so if you're visiting Seattle, please drop by and try it for yourself. The way it works, as you'll see if you go there, involves hundreds of cameras on the ceiling doing image recognition and image processing to track what people are picking and doing with the products. It's a pretty cool use case.
This one is not a joke. When it was announced, people thought it was an April Fool's joke. It's a drone delivery service, still under test but coming your way. It's heavily based on machine learning as well. These are examples of Amazon using AI for 20 years internally and for customer purposes. However, I work for AWS, and our role is to build services that all of you can use. We want to ensure every developer, data scientist, startup, small company, or student can use the same AI tools as Amazon and Netflix. You can go and open an account on AWS and use those same services.
Today, we have three layers of services. The application services are super high-level; you don't need to know anything about deep learning to use them. You just call an API and get the job done. If you want to do image recognition, text-to-speech, speech-to-text, or NLP, including translation, you just call an API and get your result. Most of these are real-time. We have a lower layer of services for people who need to train on their own datasets, build their own models, and tweak every hyperparameter. We introduced a service called Amazon SageMaker at re:Invent, our tech conference, last December. This is the one I'll focus on today because I believe it might be interesting to you.
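As an illustration of that "just call an API" experience, here is a minimal sketch of image recognition with Amazon Rekognition through boto3, the AWS SDK for Python; the bucket and image names are placeholders:

```python
import boto3

# Amazon Rekognition does image recognition behind a single API call:
# no model to train, tune, or deploy.
rekognition = boto3.client("rekognition", region_name="eu-west-1")

response = rekognition.detect_labels(
    Image={"S3Object": {"Bucket": "my-bucket", "Name": "photos/cat.jpg"}},
    MaxLabels=10,        # return at most ten labels
    MinConfidence=80.0,  # drop anything the model is unsure about
)

for label in response["Labels"]:
    print(label["Name"], label["Confidence"])
```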
At the lowest level, you can still fire up EC2 instances. We provide an Amazon Machine Image (AMI) pre-installed with all the deep learning libraries, NVIDIA drivers for GPU training, the Anaconda distribution, etc. You just fire up your instance, wait a few minutes, and you can get to work. Nothing would be interesting if people didn't actually use it. We believe more AI is running on AWS than anywhere else. These are just a few references: big ones like Netflix and Zillow, and startups like Duolingo. The list goes on, and it's growing. We have more references than anyone else, which shows that what we build is what customers want.
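A minimal sketch of firing up such an instance with boto3; the AMI ID and key pair name are placeholders, since Deep Learning AMI IDs vary by region and release:

```python
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")

# Launch a single GPU instance from the Deep Learning AMI.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder: look up the current Deep Learning AMI
    InstanceType="p3.2xlarge",        # one NVIDIA V100 GPU
    KeyName="my-key-pair",            # placeholder SSH key pair
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])
```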
Last year, customers asked us to focus on making the machine learning process simpler. You know the steps: framing the problem, collecting data, cleaning it, integrating it into your platform, experimenting, building features, training, evaluating results, and going back to the whiteboard. Once you're happy with the model, you deploy it in production for inference to serve predictions to apps. You need to do monitoring and debugging, and periodically retrain with new data. This is what the machine learning process looks like.
When you're a data scientist fresh out of school, you think you'll spend your days doing data science and building cool visualizations. In reality, you spend most of your time managing servers, training with large datasets, and dealing with infrastructure. Once you have the model, you need to manage infrastructure to deploy it and serve predictions, ensuring high availability, scalability, and security. These are DevOps topics more than data science topics. Customers asked us to build something that simplifies training and deployment and lets them focus on understanding, cleaning, and preparing data, and getting the best results possible.
To address this, we built Amazon SageMaker. I'll do a quick demo afterward. When you start working with SageMaker, the first thing you do is create a notebook instance. Since the world is using Jupyter these days, we make it super easy to start a notebook instance. One click or one API call, wait a few minutes, and you have a pre-installed EC2 instance with Jupyter, ready to go. No setup, no messing with Python packages. It's all ready for you.
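That one API call, sketched with boto3; the instance name and IAM role ARN are placeholders:

```python
import boto3

sm = boto3.client("sagemaker", region_name="eu-west-1")

# One call creates a fully configured Jupyter notebook instance.
sm.create_notebook_instance(
    NotebookInstanceName="my-notebook",
    InstanceType="ml.t2.medium",
    RoleArn="arn:aws:iam::123456789012:role/MySageMakerRole",  # placeholder
)

# A few minutes later, the instance is in service and Jupyter is reachable.
status = sm.describe_notebook_instance(NotebookInstanceName="my-notebook")
print(status["NotebookInstanceStatus"])
```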
Some customers said they don't want to write machine learning code because maybe Amazon can do it better, they don't have the skills, or they don't have the budget. So, we provide built-in algorithms that you can use out of the box. For example, factorization machines are popular for recommendations. We provide our own implementation of factorization machines that scales linearly: whether you train on 10, 20, or 50 machines, doubling the number of machines halves the training time. We trained it on a one-terabyte ad tech dataset, and our implementation is competitive in both accuracy and classification speed.
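A sketch of what using this built-in algorithm looks like with the SageMaker Python SDK; the instance count, instance type, and hyperparameters here are illustrative, not recommendations:

```python
import sagemaker
from sagemaker import FactorizationMachines

role = sagemaker.get_execution_role()  # works inside a SageMaker notebook

fm = FactorizationMachines(
    role=role,
    train_instance_count=10,           # scale out; training time drops accordingly
    train_instance_type="ml.c4.2xlarge",
    num_factors=64,
    predictor_type="binary_classifier",
)

# record_set() packages numpy arrays in the protobuf format the
# algorithm expects and uploads them to S3 before training starts.
# fm.fit(fm.record_set(train_features, labels=train_labels))
```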
Another example is LDA for topic modeling in natural language processing. We trained it on two datasets, and our implementation is faster. You can use this without writing a single line of machine learning code. I also wanted to show sequence-to-sequence, a popular method for machine translation. This is available off the shelf, based on an open-source project called Sockeye, implemented with the Apache MXNet deep learning library. You can do multi-GPU training and machine translation. We trained it on an English-German translation dataset using three different instance types. Even with the smallest instance, we converge to state-of-the-art scores.
For those who want to tweak everything, you can bring your own training code, TensorFlow, MXNet, scikit-learn, PyTorch, and integrate SageMaker with Spark ML pipelines. The main benefit is getting rid of infrastructure tasks. Using the SageMaker SDK, you can fire up as many instances as needed for training and shut them down once training is complete, paying only by the second.
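For example, with TensorFlow, a sketch of the workflow: the script name, bucket, and hyperparameters are placeholders, and the estimator handles all the provisioning and teardown:

```python
import sagemaker
from sagemaker.tensorflow import TensorFlow

role = sagemaker.get_execution_role()

# Your own training script, largely unchanged; SageMaker provisions the cluster.
tf_estimator = TensorFlow(
    entry_point="train.py",            # placeholder training script
    role=role,
    train_instance_count=4,            # distributed training
    train_instance_type="ml.p3.2xlarge",
    hyperparameters={"epochs": 10, "batch_size": 256},
)

# Instances start, train, and are shut down when the job completes;
# you pay only for the seconds the job actually runs.
tf_estimator.fit("s3://my-bucket/training-data/")
```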
We're working on a feature called hyperparameter optimization (HPO). Typically, the most difficult part of deep learning is figuring out the right hyperparameters. Using HPO, we can find near-optimal hyperparameters for your training job with a limited number of training runs, taking away the guessing game and improving accuracy.
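This later shipped in the SageMaker SDK as automatic model tuning; here is a sketch using the HyperparameterTuner class, with illustrative ranges and a placeholder objective metric (the metric regex must match what your training script actually logs):

```python
from sagemaker.estimator import Estimator
from sagemaker.tuner import (ContinuousParameter, HyperparameterTuner,
                             IntegerParameter)

# Any SageMaker estimator can be tuned; this generic one uses placeholders.
estimator = Estimator(
    image_name="<training-image-uri>",                      # placeholder
    role="arn:aws:iam::123456789012:role/MySageMakerRole",  # placeholder
    train_instance_count=1,
    train_instance_type="ml.p3.2xlarge",
)

tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name="validation:accuracy",
    hyperparameter_ranges={
        "learning_rate": ContinuousParameter(0.0001, 0.1),
        "mini_batch_size": IntegerParameter(32, 512),
    },
    metric_definitions=[{"Name": "validation:accuracy",
                         "Regex": "accuracy=([0-9\\.]+)"}],
    max_jobs=20,          # total training jobs across the search
    max_parallel_jobs=4,  # how many run at once
)

# tuner.fit("s3://my-bucket/training-data/") launches the search.
```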
Once you have a working model, you want to deploy it into production to see how it performs with real traffic. This is often where machine learning projects fail. Using the SageMaker SDK, you can deploy the exact same model you trained in just one API call and invoke it using an HTTP API.
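That one API call is estimator.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge') in the SDK; once the endpoint is up, any application can invoke it through the runtime API. A sketch with boto3, using a placeholder endpoint name and payload:

```python
import boto3

runtime = boto3.client("sagemaker-runtime", region_name="eu-west-1")

# Invoke a deployed model over HTTPS; the payload format depends on the model.
response = runtime.invoke_endpoint(
    EndpointName="my-endpoint",               # placeholder
    ContentType="application/json",
    Body='{"instances": [[1.0, 2.0, 3.0]]}',  # placeholder payload
)
print(response["Body"].read())
```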
A few months ago, we introduced a new instance type called P3, supporting the latest NVIDIA V100 GPUs. You can fire up instances with one, four, or eight GPUs, the most powerful available today. We also introduced a new family of CPU-based instances called C5, based on the Skylake architecture, with more cores, memory, and bandwidth. The C5 family supports the AVX-512 instruction set, which significantly speeds up machine learning and deep learning tasks.
Let's do a quick demo. If you have an AWS account, you can open SageMaker, available in four regions, including Ireland. You can create a notebook instance, see information about running jobs and hosted models. Here's the notebook instance I just opened. Once you open it, you get into Jupyter notebooks. We provide a large number of examples and you can find them on GitHub. We have notebooks showing how to use built-in algorithms, MXNet, TensorFlow, and Spark.
Let's look at our first example. I decided to show how to use an algorithm called BlazingText, which computes word embeddings, a prerequisite for many NLP tasks. We need to download the dataset, a tiny subset of Wikipedia. I need to copy my data to S3, select my algorithm, and configure my training job. I'll use two instances for distributed training, select the hyperparameters, and summarize everything before creating the job. I call the SageMaker API to create the training job, which fires up the instances, pulls the Docker container, injects the parameters, points to the dataset in S3, and starts training. Once it's over, I can see my vectors and even display them using t-SNE. Even if you're not a machine learning expert, you can do this.
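Behind the notebook, the demo boils down to a single create_training_job call; here is a sketch with boto3, where the container URI, role, and bucket names are placeholders (the BlazingText image URI differs per region), and the hyperparameters are illustrative:

```python
import boto3

sm = boto3.client("sagemaker", region_name="eu-west-1")

sm.create_training_job(
    TrainingJobName="blazingtext-demo",
    AlgorithmSpecification={
        # Placeholder: the BlazingText container image for your region.
        "TrainingImage": "<account>.dkr.ecr.eu-west-1.amazonaws.com/blazingtext:latest",
        "TrainingInputMode": "File",
    },
    RoleArn="arn:aws:iam::123456789012:role/MySageMakerRole",  # placeholder
    HyperParameters={"mode": "batch_skipgram", "vector_dim": "100"},
    InputDataConfig=[{
        "ChannelName": "train",
        "DataSource": {"S3DataSource": {
            "S3DataType": "S3Prefix",
            "S3Uri": "s3://my-bucket/wikipedia-subset/",  # placeholder
        }},
    }],
    OutputDataConfig={"S3OutputPath": "s3://my-bucket/output/"},
    ResourceConfig={
        "InstanceCount": 2,               # distributed training on two instances
        "InstanceType": "ml.c4.2xlarge",
        "VolumeSizeInGB": 50,
    },
    StoppingCondition={"MaxRuntimeInSeconds": 3600},
)
```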
Now, let's see how to use SageMaker with your existing machine learning or deep learning code. I'll show sentiment analysis using MXNet. We download a movie review dataset, copy it to S3, and use a vanilla MXNet training script. I create an MXNet object in the SageMaker SDK, specify the training code, use four instances for distributed training, set hyperparameters, and call fit to start training. I see the training log, and then I deploy the model using the SageMaker SDK. One line of code fires up an EC2 instance, pulls the MXNet container, injects the training script and model, and creates an HTTP endpoint. I can query it using the SageMaker SDK, which looks like a function call but is actually an API call. Let's try it. My binary classifier is working. Let's try another one. "I hate it." This is classified as a negative sentiment, which is exactly what I think of this movie.
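A sketch of that MXNet flow with the SageMaker Python SDK; the script name, bucket, and hyperparameters are placeholders:

```python
import sagemaker
from sagemaker.mxnet import MXNet

role = sagemaker.get_execution_role()

estimator = MXNet(
    entry_point="sentiment.py",        # vanilla MXNet training script (placeholder)
    role=role,
    train_instance_count=4,            # distributed training
    train_instance_type="ml.p2.xlarge",
    hyperparameters={"epochs": 10, "batch_size": 64},
)

estimator.fit("s3://my-bucket/movie-reviews/")

# One line: fires up an instance, pulls the MXNet container, injects the
# training script and model, and creates an HTTPS endpoint.
predictor = estimator.deploy(initial_instance_count=1,
                             instance_type="ml.m4.xlarge")

# Looks like a function call, but it's an API call to the endpoint;
# the exact input encoding depends on the training script.
print(predictor.predict(["I hate it"]))
```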
You can also bring your own container if you want to use a library like Caffe or have custom training and prediction code in C++. You build a Docker container that complies with our simple specification, push it to a repository in AWS, and use it in the same way. You can go end-to-end from notebook to training to deployment, or pick whatever you like. You can work locally on your machine, call the SageMaker SDK to create training and prediction clusters, or use AWS for training and deploy on your infrastructure.
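A sketch of plugging in your own image with the generic Estimator class; the ECR image URI and role ARN are placeholders:

```python
from sagemaker.estimator import Estimator

# Any image that follows the SageMaker container specification works here.
byo = Estimator(
    image_name="123456789012.dkr.ecr.eu-west-1.amazonaws.com/my-caffe:latest",
    role="arn:aws:iam::123456789012:role/MySageMakerRole",  # placeholder
    train_instance_count=1,
    train_instance_type="ml.p3.2xlarge",
)

byo.fit("s3://my-bucket/training-data/")
predictor = byo.deploy(initial_instance_count=1, instance_type="ml.c5.xlarge")
```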
Here are a few resources before I go. The top-level website for machine learning at AWS provides descriptions of services and customer use cases. We have an AI blog with code, real-life examples, and more. The SageMaker page, the SageMaker SDK on GitHub, and the SageMaker Spark integration on GitHub are also available. All the notebooks I showed and many more are on GitHub. If you're interested in different ways to use SageMaker, I recorded a YouTube video that goes through notebooks in detail. I also blog on Medium about deep learning, MXNet, and more.
Thank you very much. Thanks again for inviting me. If you have any questions, I'll be happy to answer them.
Are there any questions following this brilliant presentation from Julien? They want coffee. I can stay for coffee, but then I have to run. It's not the end of my week, unfortunately.
Hi, what's the average cost of what you just presented, for example? You mean my salary? It's more than anybody could afford, unfortunately. Seriously, the only thing you'll pay for is the notebook instance you use. I'll find the pricing page, because it changes all the time. You'll pay by the second for the notebook instance, but you don't have to use one. You can work on your laptop in your own Jupyter and make SDK calls to SageMaker. You need an AWS account, but you can skip the notebook-instance cost entirely if that's not your preferred way of working.
For example, if you use the small notebook instance, which is more than enough, you pay 4.6 cents per hour. You can stop it when you don't need it and stop paying. In Ireland, you pay per second. Depending on the instance type you use for training, you'll pay a certain price. For example, the ml.c4.2xlarge instance costs 63 cents per hour, and my training lasted maybe 10 minutes, so it cost me about ten cents. Training instances stop automatically when the job is done, unlike long-running Hadoop clusters that stay up all the time. It's a very competitive way of managing infrastructure. The pricing is similar to EC2, but it includes the service. You pay for what you use and stop paying when you switch things off.
Do you get spot pricing on SageMaker? Spot pricing is a way to buy unused EC2 capacity at a deep discount, usually 70-80% or more. The catch is that if the market price exceeds your bid, you have two minutes before losing the instance. It's not a problem for stateless tasks like web servers or workers. We don't have Spot integration yet, but it's high on the list. For training jobs, it would make a lot of sense. If you could get 100 spot instances at a 90% discount, it would be the same cost as 10 on-demand instances, and the job should run 10 times faster. The chance of losing instances is low, especially if you distribute the training. Some customers use Hadoop on spot instances and recover from losing a few nodes. I'm definitely hoping to see spot in SageMaker.
Thank you very much, Julien. Thanks again.