AWS AI/ML Podcast, Episode 17: AWS News

May 01, 2020
In this episode, I go through our latest announcements on Amazon Augmented AI, Amazon SageMaker Studio, Amazon SageMaker, and PyTorch.

⭐️⭐️⭐️ Don't forget to subscribe to be notified of future episodes ⭐️⭐️⭐️

For more content:
* AWS blog: https://aws.amazon.com/blogs/aws/auth...
* Medium blog: https://medium.com/@julsimon
* YouTube: https://youtube.com/juliensimonfr
* Podcast: http://julsimon.buzzsprout.com
* Twitter: https://twitter.com/@julsimon

Transcript

Hi everybody, this is Julien from Arcee. Welcome to episode 17 of my podcast. I hope you're still doing okay in these strange times, and I hope you're safe wherever you are. Please don't forget to subscribe to my channel to be notified of future videos. This week, lots of exciting AWS news on high-level services, SageMaker, and PyTorch. So let's not wait, let's get started.

As usual, let's start with the high-level services. The big news is the general availability of Amazon Augmented AI. Amazon Augmented AI was launched in preview at re:Invent, and now everyone can use it. So what is this service? It lets you build human review workflows for Amazon Rekognition, Amazon Textract, or a custom workflow. Basically, it's a way to have a human in the loop examine predictions that have a low confidence score. This is what it looks like in the console; you'll find it in the SageMaker console, and it's a little bit similar to SageMaker Ground Truth. First, you need to create a review workforce. This could be a Mechanical Turk workforce, a private workforce, or a vendor workforce. Then you create a workflow, which, as mentioned, can be based on Textract, Rekognition, or a custom model. The way this works is you push data to, let's say, Textract, and if the confidence score for the Textract prediction is below a certain threshold that you define, the sample is sent to your workforce for human review. In a way, you get the best of both worlds: you can automate prediction with a high-level service or with a custom workflow, and you can make sure that low-confidence predictions are reviewed by humans. This is useful because no machine learning model will ever get to 100% accuracy. So this is a really cool service, and you should definitely try it out. You'll find a short code sketch below.

What else do we have? One of my favorite services: Transcribe Medical now supports custom vocabulary. As you know, Transcribe Medical is a speech-to-text service specialized for medical vocabulary, and custom vocabulary works exactly the same as in Transcribe. Basically, you just create a text file with your words and upload it with the create vocabulary call; Transcribe Medical will then transcribe those custom words exactly right. If you have specific vocabulary that needs to be transcribed precisely, such as drug names, custom vocabulary is a good way to improve the accuracy of your transcriptions, so that's pretty nice. Again, there's a quick sketch below.

Moving on to Translate: batch translation is now available in Europe (London), so good news for UK and Ireland customers. Lex is now also available in London, Frankfurt, and Asia Pacific. Lex is the chatbot service, so now you can use it in additional regions. That's always good to know, as it cuts down on latency, and if you have data in those regions, it's easier to work with a local version of the service.
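To make the Augmented AI flow described above a little more concrete, here's a minimal sketch using the built-in Textract integration with boto3. It assumes you've already created a flow definition (which bundles your workforce, the worker task template, and the confidence conditions that trigger a review); the bucket, document, and ARN below are made-up placeholders.

```python
import boto3

textract = boto3.client("textract")

# Hypothetical flow definition, created beforehand in SageMaker / Augmented AI.
# It defines the workforce, the worker UI, and the activation conditions
# (e.g. "send for review if confidence is below 90").
flow_def_arn = "arn:aws:sagemaker:us-east-1:123456789012:flow-definition/my-review-flow"

response = textract.analyze_document(
    Document={"S3Object": {"Bucket": "my-bucket", "Name": "scans/invoice-001.png"}},
    FeatureTypes=["FORMS"],
    HumanLoopConfig={
        "HumanLoopName": "invoice-001-review",
        "FlowDefinitionArn": flow_def_arn,
    },
)

# If the prediction met the activation conditions, Textract started a human
# loop and reports it in the response; otherwise the field is absent.
activation = response.get("HumanLoopActivationOutput")
if activation:
    print("Sent for human review:", activation["HumanLoopArn"])
else:
    print("Confidence was high enough, no human review needed.")
```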
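And here's a hedged sketch of the Transcribe Medical custom vocabulary workflow with boto3. Vocabulary, bucket, and job names are placeholders, and note that the vocabulary has to reach the READY state before you can reference it in a job.

```python
import boto3

transcribe = boto3.client("transcribe")

# The vocabulary file is a plain-text list of words and phrases, uploaded to S3.
transcribe.create_medical_vocabulary(
    VocabularyName="drug-names",
    LanguageCode="en-US",
    VocabularyFileUri="s3://my-bucket/vocab/drug-names.txt",
)

# Wait for the vocabulary to be READY (poll get_medical_vocabulary in practice),
# then reference it when starting a transcription job.
transcribe.start_medical_transcription_job(
    MedicalTranscriptionJobName="consult-42",
    LanguageCode="en-US",
    Media={"MediaFileUri": "s3://my-bucket/audio/consult-42.wav"},
    OutputBucketName="my-bucket",
    Specialty="PRIMARYCARE",
    Type="CONVERSATION",
    Settings={"VocabularyName": "drug-names"},
)
```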
The really big news is the general availability of SageMaker notebooks, the notebook element in SageMaker Studio, and SageMaker Studio is now available in additional regions. This was probably the number one question I was getting these days: when do we get SageMaker Studio outside of us-east-2? Well, that's it. Now you can use Studio in Ohio (us-east-2), North Virginia (us-east-1), Oregon (us-west-2), and Ireland (eu-west-1). Here's a SageMaker Studio instance that I created in eu-west-1. It looks the same, with a few extra things. For example, if I open this notebook, the collaboration feature is now available. This is one of the things we discussed at re:Invent: the ability to take notebook snapshots and share them. This is now actually available. We also have different compute environments. During the preview, you could only use the smallest compute environment, but now you can actually use different ones, CPU and GPU. If I look at the full list, you can see a long list of compute environments that are available, so you can find the exact one that works for you. I'm quite sure there are a few more bells and whistles that I haven't caught yet, but anyway, that's really good news: notebooks are now GA and more stable, and Studio is available in three additional regions. If you've never tried SageMaker Studio, the time is right, I think. Of course, you'll find tons of videos on my YouTube channel showing how to do that.

Let's keep going with another SageMaker announcement, and I think it's pretty important: you can now use Inferentia on SageMaker, with Inf1 instances. Let me explain. Inferentia is a custom chip built by AWS to provide high-throughput, low-cost inference for customers who really need to scale their prediction infrastructure beyond what's possible with GPU instances. The Inferentia chip is available in EC2 Inf1 instances, and we've built a specific SDK that lets you compile your deep learning models for Inf1 instances, deploy them, and so on. But so far, you had to do this on EC2 and build the compilation and deployment pipeline yourself. Now it's available in SageMaker. I wrote the blog post for it, so I'll include the link in the description. It is a super simple integration: you take the model that you trained and compile it with SageMaker Neo, a pre-existing capability designed to compile models for specific hardware architectures, and inf1 is now one of those target architectures. You just use the Neo API to compile the model, and then you just call deploy, with an inf1 instance type. This is as simple as it gets, and it's one of the reasons why I like SageMaker so much: doing this on EC2 is a much more involved process, and here it's literally two lines of code to deploy a hardware-optimized model on a really fast custom chip. This is a cool feature, so check out the blog post; there's also a small sketch below.

When it comes to frameworks, we also have a pretty big announcement. We worked with Facebook on a model server for PyTorch called TorchServe, which is really the equivalent of TensorFlow Serving for TensorFlow. This was a pretty big gap for PyTorch users: PyTorch is really great for experimentation and is very flexible, but when it came to deploying models, it was missing a production-grade model server to serve predictions at scale. This gap is now filled by TorchServe. Again, I wrote a blog post for this, and you can go and read all about it. It's really easy to install, and of course it's open source, so you can go and grab it from the repository. It supports single-model and multi-model serving, HTTPS, monitoring capabilities, and more: all the production features you would expect from a model server. So again, really good news, because I think this was a strong ask from the PyTorch community, and I'm really happy AWS worked on this and contributed code to the project. So there you go, you can read all about it.
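Here's a minimal sketch of the compile-and-deploy flow with the SageMaker Python SDK. It assumes you already have a trained estimator in hand; the framework, input name and shape, and S3 path are illustrative and depend on your own model.

```python
# 'estimator' is assumed to be a SageMaker Estimator that has already been
# trained (e.g. a TensorFlow estimator); everything below is illustrative.
compiled_model = estimator.compile_model(
    target_instance_family="ml_inf1",        # compile for Inferentia with Neo
    input_shape={"data": [1, 3, 224, 224]},  # input name/shape of your model
    output_path="s3://my-bucket/compiled/",
    framework="tensorflow",
    framework_version="1.15.0",
)

# Deploy the compiled model to a real-time endpoint backed by an Inf1 instance.
predictor = compiled_model.deploy(
    initial_instance_count=1,
    instance_type="ml.inf1.xlarge",
)
```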
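And a quick client-side sketch for TorchServe, assuming you've already archived a model and started the server along the lines of the TorchServe README; the model name, file names, and image are placeholders.

```python
# Assumes TorchServe is already running locally with a model registered, e.g.:
#   torch-model-archiver --model-name densenet161 --version 1.0 \
#       --model-file model.py --serialized-file densenet161.pth \
#       --extra-files index_to_name.json --handler image_classifier \
#       --export-path model_store
#   torchserve --start --model-store model_store --models densenet161.mar
import requests

# The inference API listens on port 8080 by default.
with open("kitten.jpg", "rb") as f:
    response = requests.post("http://127.0.0.1:8080/predictions/densenet161", data=f)

print(response.json())  # e.g. top predicted classes and probabilities
```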
That's it for this week. I hope you learned a few things. Again, don't forget to subscribe to my channel to be notified of future videos. I'll see you soon with more news. Until then, long live the zombie apocalypse and keep rocking!

Tags

AWS, SageMaker, Machine Learning, Amazon Augmented AI, TorchServe