AWS AI ML Podcast Episode 20 pre re:Invent

Transcript

Hi everyone. This is Julien from Arcee. Welcome to episode 20 of my podcast. Don't forget to subscribe to be notified of future videos. As you probably guessed, this is a special episode focusing on everything that happened before re:Invent. re:Invent is just a few hours away. The keynotes will certainly announce some new services. I'm not going to talk about that, but I can give you a recap of all the major AI and ML announcements that took place in November so that you can catch up and be ready for the onslaught of new launches coming this week. Let's get started.

First things first, Amazon SageMaker. The really good news is that SageMaker Studio is now available in 22 AWS regions. So chances are you can use Studio in your local region now and use it with the data you're hosting there. In addition to this, you can now use multi-GPU built-in environments in Studio. Until now, you could only use CPU and single-GPU local environments. If you work with heavy-duty deep learning models that you want to train directly in Studio, it's going to be much easier with those larger environments. Of course, you can still use managed infrastructure with the full scope of EC2 instances for full-scale training and deployment. But for local experimentation, you can get a little more power now.

Let's move on to DeepComposer. You certainly remember DeepComposer from last re:Invent. It's the combination of a musical keyboard and an AWS service that generates music based on a melody that you play. You could already do this with a couple of different deep learning architectures, and we added a third one based on transformers. Transformers are a really powerful architecture, increasingly popular for natural language processing and, more generally, for machine learning problems that deal with sequences. Generating music notes is a sequence problem, so you can experiment and try out transformers in DeepComposer.
To help you understand how these networks work, we also added what we call a learning capsule, which is a kind of tutorial introducing you to the technology and how it's different from the previous ones. Musicians, you can now start your transformers and make some music.

Amazon Lex is our chatbot service, and it now supports additional languages, which I know a lot of you were waiting for: French, Canadian French, Italian, German, Spanish, and Latin American Spanish. Now, if you already have English-language bots, you can localize them for all these countries and languages. If you never worked with Lex because it only supported English, well, this is your chance. French developers, Italian developers, German developers, and everyone else, it's time to learn about Lex and build some cool voice-based apps.

Another feature we added is context management. Before this, you would define the list of utterances that your bot would try to detect. In a conversation, certain utterances happen at the beginning, and as the conversation unfolds, you'd like to enable additional utterances, helping the bot focus on the current stage of the conversation. This is exactly what context management is: dynamically enabling utterances at specific points in the conversation. This will help you build higher quality bots and user experiences.

Amazon Polly is our text-to-speech service. As you probably remember, it includes two different engines: the standard engine and the neural text-to-speech engine, which generates the waveform using a deep learning model. We added a new voice for Australian English and a new British voice for the newscaster style. The newscaster style applies the speaking style of a news anchor, something you might hear on TV news or radio news. Now, if you want your speech output to sound like the BBC or something close to that, you can give that British English newscaster voice a try.

Moving on to Amazon Textract.
Textract is our OCR service, and the big announcement in November is that it can now detect handwriting, which is great because a lot of documents, forms, etc., contain handwriting. Printed information is good to detect, but handwritten information is even better. We can now do this, so go and give it a try. In addition, Textract now supports five new languages: Spanish, German, Italian, Portuguese, and French. You can process more documents in local languages. And last but not least, Textract is now integrated with KMS. So you can specify the AWS KMS encryption key that you want to use to encrypt the extraction results written to S3. For some use cases, encryption is mandatory, and now it's available in Textract.

Next is Translate. The Translate team has added 16 new languages: Armenian, Catalan, Gujarati, Haitian, Icelandic, Kinyarwanda, Kazakh, Lithuanian, Malayalam, Macedonian, Maltese, Mongolian, Sinhala, Telugu, Uzbek, and Welsh. I think we have a total of 71 languages supported in Translate, which means a crazy number of language pairs. You can pretty much translate from any of these to any of these. The second feature I want to mention is Active Custom Translation, which allows you to customize translation without having to train a model, just by providing some sentence pairs. You can customize how Translate translates your text, a feature that has been requested a lot. Finally, we have a feature called "do not translate," which lets you tag specific parts of the input text that shouldn't be translated. If you have, for example, English quotes in French text, you can say, "Don't translate this bit; I want to maintain the original language."

Amazon Transcribe is the companion service to Amazon Polly. Polly is text-to-speech, and Transcribe is speech-to-text. You can use it in two ways: batch mode, to transcribe sound files uploaded to Amazon S3, or real-time transcription.
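
As a rough illustration of the batch mode just described, the sketch below assembles the parameters for a transcription job on a file stored in S3. The helper name `build_transcription_request`, the job name, the bucket, and the file path are hypothetical. The parameter names are those of the boto3 `start_transcription_job` call, and the set of accepted formats is an assumption based on the documented `MediaFormat` values. The request is only built here, not sent.

```python
from urllib.parse import urlparse

# Assumption: file formats accepted for batch jobs, taken from the
# documented MediaFormat values of start_transcription_job.
SUPPORTED_FORMATS = {"mp3", "mp4", "wav", "flac", "ogg", "amr", "webm"}


def build_transcription_request(job_name, media_uri, language_code="de-DE"):
    """Build keyword arguments for a batch transcription job (hypothetical helper)."""
    parsed = urlparse(media_uri)
    if parsed.scheme != "s3":
        raise ValueError("batch mode expects a file uploaded to S3 (s3://...)")

    # Derive the media format from the file extension, e.g. ".flac" -> "flac".
    path = parsed.path
    extension = path.rsplit(".", 1)[-1].lower() if "." in path else ""
    if extension not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported media format: {extension!r}")

    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "MediaFormat": extension,
        "Media": {"MediaFileUri": media_uri},
    }


# Hypothetical job name, bucket, and key, for illustration only.
request = build_transcription_request(
    "demo-job", "s3://my-example-bucket/audio/interview.flac"
)
print(request)

# With AWS credentials configured, the request could then be submitted:
#   import boto3
#   boto3.client("transcribe").start_transcription_job(**request)
```

Keeping the request construction separate from the API call makes it easy to check the parameters locally before any job is started.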
We have additional language support for streaming with German, Italian, Brazilian Portuguese, Japanese, and Korean. You can now stream audio content that's in OGG or FLAC format. Originally, we just supported PCM, so lots of customers had to convert audio content from those formats to PCM. Now you can use them natively. FLAC is a really cool format: lossless compression and very high quality.

Let's do a quick demo of this. I want to try that German real-time transcription. My German is very rusty, so I used Translate to translate from English to German, and now I'm going to try and transcribe this automatically. Let me open a new window here and bring up Transcribe. Hello together, my name is Julien and I live in France. Iceland is a great place for a visit, and I would like to return there soon. Okay, so my first name is wrong, but that's okay. Julien is just impossible for anyone to pick up unless you're French. The rest is pretty good. So congratulations, Transcribe. If you can understand my German accent, you can probably do exactly what you want. So perfect. Very good job.

If you want to stay in touch, you can check the What's New page. Something tells me it's going to be quite busy in the next few hours and days with all the major announcements. And of course, we have plenty of blog posts coming for you. So check them out and get started with all the new services that are coming this week. Lots of really exciting stuff, and I'll be back to tell you a little more about it. Enjoy re:Invent, have fun, learn a lot, stay safe in these crazy times, and keep rocking.
