AWS AI Machine Learning Podcast Episode 12 AWS news and demos

March 03, 2020
In this episode, I go through our latest announcements on Amazon Transcribe, Amazon Rekognition, Amazon Forecast and the Deep Learning Containers. I do a couple of demos (redacting personal information in text transcripts, and extracting text from videos). Finally, I share a couple of SageMaker videos that I recently recorded.

⭐️⭐️⭐️ Don't forget to subscribe to be notified of future episodes ⭐️⭐️⭐️

Additional resources mentioned in the podcast:
* Amazon Transcribe blog post: https://aws.amazon.com/blogs/aws/now-available-in-amazon-transcribe-automatic-redaction-of-personally-identifiable-information/
* Amazon Rekognition Video code: https://github.com/juliensimon/aws/blob/master/AmazonAI/rekognitionvideo/textinimage.py
* DeepAR research paper: https://arxiv.org/abs/1704.04110
* Train with Amazon SageMaker on your local machine: https://www.youtube.com/watch?v=K3ngZKF31mc
* Amazon SageMaker Studio Deep Dive: https://www.youtube.com/watch?v=pGhn8Ax8QmQ

This podcast is also available in audio at https://julsimon.buzzsprout.com.

For more content, follow me on:
* Medium https://medium.com/@julsimon
* Twitter https://twitter.com/@julsimon

Transcript

Hi everybody, this is Julien from AWS. Welcome to episode 12 of my podcast. Don't forget to subscribe to my channel to be notified of future videos. In this episode, I'm going to go through the latest announcements on services like Transcribe, Forecast, and a couple more things. Of course, I will do some demos and I will share some additional resources at the end. So let's not wait, let's do the news!

Let's start with Amazon Transcribe, our speech-to-text service. You may remember my episode 1 demo on profanity filtering. If you haven't seen that, I recommend it. Here, we added the capability to automatically redact personally identifiable information. The use case for this is, of course, if you have customer calls or customer discussions that contain PII, you may not want those files to be stored as is. You may have to remove PII from the sound files or from the transcripts. So one way or another, you need to locate this information in the file and remove it. And this is exactly what this feature does. I wrote the blog post for this. I will include the link in the video description.

And let's do a quick demo here. So we can see all the information that Transcribe will detect and remove: social security numbers, any credit card information, banking information, and of course, names and email addresses, etc. I recorded a short file. Let's listen to this.

"Good morning, everybody. My name is Julien Simon. And today I feel like sharing a whole lot of personal information with you. Let's start with my social security number. One, two, three. My credit card number is 65280559 and my CVV code is 666. My bank account number is 888-005-6298. My email address is julien at..."

Okay, so obviously it's all fake. Don't worry about this. Well, I guess my CVV might just be 668. But I guess I need to check that. So that's my sound file. I put it in an S3 bucket. And then I use the StartTranscriptionJob API, which is available in our SDKs.
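The demo makes this call from PHP. As a rough equivalent, here is a minimal Python sketch of a StartTranscriptionJob request with PII redaction turned on, assuming a boto3 Transcribe client is passed in. The helper names, the job name and the bucket and file names are hypothetical placeholders, not the code used in the demo.

```python
def build_redaction_request(job_name, media_uri, output_bucket, language_code="en-US"):
    """Build the parameters for a Transcribe job that redacts PII.

    All argument values are supplied by the caller; nothing here is
    taken from the demo itself.
    """
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "Media": {"MediaFileUri": media_uri},
        "OutputBucketName": output_bucket,
        # Ask Transcribe to replace detected PII in the transcript.
        "ContentRedaction": {
            "RedactionType": "PII",
            "RedactionOutput": "redacted",
        },
    }


def start_redacted_transcription(transcribe_client, **kwargs):
    """Submit the job with any client exposing start_transcription_job."""
    request = build_redaction_request(**kwargs)
    return transcribe_client.start_transcription_job(**request)
```

With credentials configured, passing `boto3.client("transcribe")` as the first argument would submit the job; the redacted JSON transcript then lands in the output bucket.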
And I wrote that bit of PHP code, which seemed to cause a lot of distress among my colleagues. Because it's a well-known fact, I have no love for PHP. But hey, you guys are using it. So I should try and use it too. Call this API, wait for a little bit. And of course, we can see stuff in the console. And then we can grab the output from that command. It's a JSON file, no surprise. And it has a whole bunch of information that you would normally find in Transcribe. And of course, it has the transcription with PII redacted. So every bit of PII is actually replaced automatically by a PII tag. And you have timestamps. So if you want to go and do additional audio editing on top of this to actually remove that information from the sound file itself, you can absolutely do that using the timestamps and audio editing software. So that's it for Transcribe, and it's available pretty much everywhere. So that's pretty cool. Nice little feature.

Right, let's talk about Rekognition now. So Rekognition added yet another capability, which is detecting text in videos. And that's super useful because you may want to look for news headlines, company names, or any kind of information, subtitles, why not? And that's going to come in handy. You can also restrict the area of the video where you want to extract text because maybe if you're looking for subtitles, obviously they'll be located in a very specific part of the video. So you can do that and ignore text that would show up somewhere else in the video.

So let's do a quick demo of this. Okay, so this is the Rekognition console. And I've uploaded a bunch of videos here. So, well, I have this one. Let's look. Woo! All right. All right. Thanos for the win. So it's a short video. There isn't a lot of text in it. Probably just the logo at the beginning here. So it's already uploaded in S3. And let's try and run a bit of code to see if we can pick up the text here. So let me show you the code. It's super simple.
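The script shown on screen is the one linked in the description. As a stand-in on this page, here is a minimal Python sketch of the same flow, assuming a boto3 Rekognition client is passed in. The function names, the polling loop and the bucket and video names are hypothetical, and result pagination is left out for brevity.

```python
import time


def start_text_detection(rekognition_client, bucket, video_name, region=None):
    """Start an asynchronous text detection job and return its job ID."""
    params = {"Video": {"S3Object": {"Bucket": bucket, "Name": video_name}}}
    if region is not None:
        # Restrict detection to part of the frame, e.g. the subtitle area.
        # The bounding box uses ratios of the frame size (Left, Top, Width, Height).
        params["Filters"] = {"RegionsOfInterest": [{"BoundingBox": region}]}
    response = rekognition_client.start_text_detection(**params)
    return response["JobId"]


def wait_for_text_detection(rekognition_client, job_id, delay_seconds=5):
    """Poll until the job leaves IN_PROGRESS (an SNS notification avoids polling)."""
    while True:
        response = rekognition_client.get_text_detection(JobId=job_id)
        if response["JobStatus"] != "IN_PROGRESS":
            return response
        time.sleep(delay_seconds)


def summarize_detections(response):
    """Return (timestamp in ms, type, text, confidence) for each detection."""
    rows = []
    for detection in response.get("TextDetections", []):
        text = detection["TextDetection"]
        rows.append(
            (detection["Timestamp"], text["Type"], text["DetectedText"], text["Confidence"])
        )
    return rows
```

Each row is either a `LINE` or a `WORD`, so the same text can show up once as a line and again as individual words at the same timestamp.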
This is Python because it's enough PHP for a lifetime now. And we can use the StartTextDetection API to get everything going. So just pass the location of your video, the bucket name, the video object name, and that's it. You get a response with a job ID. Then, to know when the video has been processed, you can just wait for a bit or you can use an SNS notification. Rekognition supports that: it will send an SNS message once the video has been processed. And then you can call the GetTextDetection API, passing the job ID, and extract the information. So I've done this before and you would use it like this, right? Start text detection with the location of the video and then just get the text detection results.

Okay, and this is what you get. Analysis is very fast. It really takes no time at all. It's a short video, but it was just a few seconds, really. I was surprised how fast it was. And so we can print out some information. We have the timestamp, we have the detected text, so Marvel Studios, the confidence, which is very high, and then we'll tell you if it's a line of text or a single word. We give you both. I commented out this bit here that shows you the bounding box, because the output gets really noisy, but you also get the exact location of that line or that word. So if you're looking for specific words and they're part of a line, then you know exactly where that word is, which is what we see here: the Marvel Studios line and then information for each word. And of course, these are present at multiple timestamps, so they will appear multiple times. So this is super simple. And I think it's going to come in handy in a lot of use cases for customers. Okay, that's it for Rekognition.

Let's talk about Amazon Forecast. So Amazon Forecast is another high-level service that lets you build time series prediction models from your own data.
So this is a very, very complicated problem, but I think Forecast makes it very simple. Just upload your time series data to S3 and then either use AutoML to select an algo or you can go and pick your favorite algo and tweak it if you know what you're doing. And then a model is trained and is deployed, and everything happens on Forecast's fully managed infrastructure. So Forecast is very nice.

And Forecast will work from your data set, so your time series data, but you can also inject additional metadata. So if you're trying to forecast inventory or sales, then you could add metadata on the items themselves on top of just the stock or the sales value you want to predict. And another piece of metadata you can inject is whether a given day of the year is a public holiday or not. And this is really useful because obviously this will massively impact the behavior of your model. If it's Christmas Day or if it's New Year's Day, these are really special days and so maybe they're high demand days or maybe they're low demand days depending on your business case. Telling the model that these days are special holidays and that the behavior of the model should be different is useful information. So there's actually a parameter for this when you create the predictor, so when you create the model itself, you can pass it a supplementary parameter which is down there. Yes, supplementary features. And for now, there's only one supported, and this is the list of... actually it's the country code you want to build a model for, and this will factor in the list of holidays for that specific country. So now we can support up to 30 countries, including France, which has lots of holidays. We never really work here. And so now you can just add that extra information, that extra metadata to your models. So that's pretty cool.

Still on Forecast, we extended the DeepAR+ algo. So let me explain.
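Both the holiday calendar and the choice of algorithm are set when the predictor is created. Here is a minimal Python sketch of those CreatePredictor parameters, assuming the boto3 Forecast client. The helper name, predictor name, dataset group ARN, horizon and frequency are hypothetical placeholders.

```python
# ARN of the DeepAR+ algorithm in Amazon Forecast.
DEEP_AR_PLUS_ARN = "arn:aws:forecast:::algorithm/Deep_AR_Plus"


def build_predictor_request(name, dataset_group_arn, horizon, frequency,
                            country_code=None, algorithm_arn=None):
    """Build CreatePredictor parameters, optionally with a holiday calendar."""
    input_config = {"DatasetGroupArn": dataset_group_arn}
    if country_code is not None:
        # The only supplementary feature mentioned in the episode:
        # the holiday calendar of one country, selected by country code.
        input_config["SupplementaryFeatures"] = [
            {"Name": "holiday", "Value": country_code}
        ]
    request = {
        "PredictorName": name,
        "ForecastHorizon": horizon,
        "InputDataConfig": input_config,
        "FeaturizationConfig": {"ForecastFrequency": frequency},
    }
    if algorithm_arn is None:
        # Let Forecast pick the algorithm.
        request["PerformAutoML"] = True
    else:
        # Or pick the algorithm yourself, e.g. DEEP_AR_PLUS_ARN.
        request["AlgorithmArn"] = algorithm_arn
    return request
```

The resulting dictionary would be passed as `boto3.client("forecast").create_predictor(**request)`; for example, `country_code="FR"` with `algorithm_arn=DEEP_AR_PLUS_ARN` asks for a DeepAR+ model that factors in French holidays.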
Like I said, when you train a model on Forecast, you can either use AutoML and let Forecast pick the right algo for you, or you can pick the algo yourself. One of those algos is DeepAR+, which is an Amazon-invented algo that was published. I will add the link in the video description. And DeepAR lets you build a model using a large number of time series. So if you want to train a single model on multiple time series, this is a good algo to use. And the basic idea here is that, using deep learning, DeepAR will extract hidden patterns that are present in your multiple time series. There is some relationship between those time series, and of course, the human eye cannot see it, but DeepAR will find those patterns and build a model accordingly. So this is a really, really important and powerful algo.

So what did we add here? We added hyperparameters. The first one lets you average multiple models over a single training. So kind of an ensemble technique, I guess, where you can train a number of models and then you can average the predictions from those models. And ensemble prediction is a powerful technique because the theory is that every model will make slightly different mistakes. So if you have multiple models predicting and then you average the result, you tend to average out the big mistakes. And the team of models does a better job than any single model would do.

The second thing is the ability to change the learning rate over time. So that's something we're used to doing with deep learning models, scheduling the learning rate over epochs. And so you can do the same. You can have learning rate decay, so gradually decrease the learning rate over time to train more precisely over time.

And the last one is an obscure one. And if you know exactly what this is, you probably don't need me to explain it. So there's a new likelihood function.
And the likelihood function is basically the function that injects uncertainty in the prediction because time series are noisy and unpredictable. To factor that in, based on the distribution of your data, certain functions work better than others, and well, this is one of them. Okay, so if you can't sleep tonight, read about piecewise linear likelihood functions. Fascinating stuff. Again, if you know what you're doing, this is going to come in handy.

Oh yes, we have Deep Learning Containers, one of my favorite topics. So by now, you should know we have a nice collection of Deep Learning Containers that package TensorFlow, PyTorch, MXNet, and a few more things. And they're off the shelf. You can grab them from Amazon ECR, our Docker registry service. You can run them on your own machine, you can run them on container services, ECS, EKS, you can run them on EC2, and of course, you can run them on Amazon SageMaker. So basically, you know, we're catching up with the latest versions, and I'm really happy to see that we have TensorFlow 2.1, which is the very latest version, until the next one, but we'll keep catching up.

All right, that's it for the news. Now let's share some resources. I recorded a couple of videos that you might like. So the first one is actually a very popular request, and that's how do I use SageMaker on my local machine? Now, don't get me wrong, SageMaker is really about training and deploying at scale on fully managed infrastructure. But in the early stage of your project, when you're debugging your code, testing your code, you want to work locally, right? Because you just go faster, you iterate faster, you don't have to create managed infrastructure, you don't have to pay for it, you don't have to worry about any setup there. So this video will show you how to take existing code, existing notebooks, and just adapt them to run them on your local machine.
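In the SageMaker Python SDK, that switch mostly comes down to the instance type and the data location given to the estimator. Here is a minimal Python sketch of that choice, not the code from the video; the helper name, the default managed instance type and the paths are hypothetical. The value `"local"` is the SDK's local mode setting, and the estimator argument is named `instance_type` in SDK v2 (`train_instance_type` in v1).

```python
def training_settings(run_locally, local_data_dir, s3_data_uri,
                      managed_instance_type="ml.m5.xlarge"):
    """Pick the instance type and training input for an estimator.

    Local mode trains in a Docker container on this machine and can
    read data from a local directory through a file:// URI.
    """
    if run_locally:
        return {
            "instance_type": "local",
            "inputs": "file://" + local_data_dir,
        }
    # Managed training: a SageMaker instance reading data from S3.
    return {
        "instance_type": managed_instance_type,
        "inputs": s3_data_uri,
    }
```

A framework estimator would then be created with `instance_type=settings["instance_type"]` and trained with `estimator.fit(settings["inputs"])`, so the rest of the notebook stays the same in both modes.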
And, well, you'll watch the video if you're interested, but in a nutshell, this means using an IAM role for your notebook, and it means having local data, although you could absolutely train on S3 data. You need local Docker on your machine because you're going to pull Docker containers to your local machine. And you need to set the SageMaker estimator to train on your local machine. And I think that's about it. So it's very simple. You can take any notebook and very, very easily adapt it and run it on your local machine. The only restriction here is that this will only work for frameworks, so TensorFlow and PyTorch, MXNet, Scikit-Learn, etc., and for your own containers. So if you're training with built-in algos like DeepAR or the other ones, it's not going to work because those containers are not available outside of AWS. But if you're using frameworks or your own containers, this will absolutely work. So again, this is what took me so long. This is something that you've been asking for a long, long time, lots of you, so here it is.

And I also gave, just a couple of days ago, an AWS webinar on SageMaker Studio, where I try to show you as much as I can in 47 minutes, going from the IDE to running notebooks, to hyperparameter optimization, to SageMaker Autopilot for AutoML, to SageMaker Model Debugger and SageMaker Model Monitor. It's a packed session, and well, I got good feedback on it and it's on YouTube. So now you can watch it too and learn about all the latest features.

Okay, this is it for this episode. I hope you liked it. Don't forget to subscribe to my channel to be notified of future episodes, and I'll see you soon with more content. Until then, keep rocking.

Tags

Amazon Transcribe, PII Redaction, Amazon Rekognition, Text Detection in Videos, Amazon Forecast

About the Author

Julien Simon is the Chief Evangelist at Arcee AI, specializing in Small Language Models and enterprise AI solutions. Recognized as the #1 AI Evangelist globally by AI Magazine in 2021, he brings over 30 years of technology leadership experience to his role.

With 650+ speaking engagements worldwide and 350+ technical blog posts, Julien is a leading voice in practical AI implementation, cost-effective AI solutions, and the democratization of artificial intelligence. His expertise spans open-source AI, Small Language Models, enterprise AI strategy, and edge computing optimization.

Previously serving as Principal Evangelist at Amazon Web Services and Chief Evangelist at Hugging Face, Julien has helped thousands of organizations implement AI solutions that deliver real business value. He is the author of "Learn Amazon SageMaker," the first book ever published on AWS's flagship machine learning service.

Julien's mission is to make AI accessible, understandable, and controllable for enterprises through transparent, open-weights models that organizations can deploy, customize, and trust.