Amazon SageMaker Ground Truth: Creating a workforce (part 1)

December 17, 2019
In this video, I introduce you to Amazon SageMaker Ground Truth, a fully managed data labeling service, and I show you how to create a private workforce.
https://aws.amazon.com/sagemaker/groundtruth/
https://aws.amazon.com/blogs/aws/amazon-sagemaker-ground-truth-build-highly-accurate-datasets-and-reduce-labeling-costs-by-up-to-70/
Follow me on:
* Medium: https://medium.com/@julsimon
* Twitter: https://twitter.com/juliensimon

Transcript

Hey, Julien from Arcee here. In this video, I'd like to show you how to get started with Amazon SageMaker Ground Truth, our dataset labeling service. The first step is to go to the SageMaker console, and the first thing we want to do is define the workforce for the labeling job.
A workforce is a group of people who will work on data annotation. You can create three types of workforces. The first one is the private workforce, which is the one I'm going to use. A private workforce consists of people you know, such as people from your company, whom you can identify by email, for example. You can also work with a vendor workforce, which is made up of people working for a third-party company that we've approved and integrated on the platform. You can find more about available vendors in the AWS Marketplace. The third option is to use Amazon Mechanical Turk, which is the one to use if you need to scale to thousands or even tens of thousands of workers because you have a very large dataset to work on.
Here, I'm going to create a private workforce. Let's just click on this, give it a name, and invite workers by email or import them from a Cognito pool. Cognito is one of our authentication services. In a real-life, enterprise setting, you will probably already have users managed by Cognito, so that's probably the one you'd want to use. But here, I'm going to keep it simple and invite myself by email, provide an organization name, and a contact email. I'll put my email again; this contact email is where workers would send questions, which is really important, especially when starting to work on a new dataset. You'll provide instructions to workers, and it's crucial that they can ask questions and provide feedback. Instructions might not be 100% clear, or some data samples might be ambiguous or weird, and workers may not know how to label them. This feedback will help improve the quality of your labeling jobs.
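For readers who prefer scripting over the console, the Cognito option described above can be sketched with boto3's `create_workteam` call. This is a minimal sketch, not the flow shown in the video: it assumes a Cognito user pool, user group, and app client already exist, and every identifier in it (team name, pool ID, group, client ID) is a hypothetical placeholder. The helper names `build_private_workteam_request` and `create_private_workteam` are made up for illustration.

```python
from typing import Any, Dict


def build_private_workteam_request(
    team_name: str,
    description: str,
    user_pool_id: str,
    user_group: str,
    client_id: str,
) -> Dict[str, Any]:
    """Assemble keyword arguments for the SageMaker CreateWorkteam call."""
    if not team_name or not description:
        raise ValueError("team_name and description are both required")
    return {
        "WorkteamName": team_name,
        "Description": description,
        # One entry per Cognito group whose members should join the team.
        "MemberDefinitions": [
            {
                "CognitoMemberDefinition": {
                    "UserPool": user_pool_id,
                    "UserGroup": user_group,
                    "ClientId": client_id,
                }
            }
        ],
    }


def create_private_workteam(request: Dict[str, Any]) -> str:
    """Send the request to SageMaker. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3

    response = boto3.client("sagemaker").create_workteam(**request)
    return response["WorkteamArn"]


if __name__ == "__main__":
    # All values below are placeholders (assumptions), not real resources.
    request = build_private_workteam_request(
        team_name="my-private-team",
        description="Demo labeling team",
        user_pool_id="us-east-1_EXAMPLE",
        user_group="labelers",
        client_id="exampleclientid",
    )
    print(request)  # create_private_workteam(request) would call AWS
```

Keeping the request builder separate from the AWS call makes it possible to inspect or unit test the parameters without credentials. The contact email and organization name entered in the console are not part of this sketch.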
We could have an SNS topic to notify workers that work is available, but let's not do that here. Just create the private team. As you can guess, this will send me an email that I need to approve. I can see here that the invitation was sent, and I need to change my password. Let me do that offline and I'll be back in a minute.
Okay, I received an email from Ground Truth with a temporary password and a login URL. Let's sign in and change the password. Now, we have to come up with a strong password. Let's see if we can get this right. This is my worker console, and here I don't see any available work. Let's leave this window open.
Now I'm registered as a worker. If I go back and reload this page, it should show that I am verified. Yes, it shows I am verified, and my team is ready to get some work done. Let's move on to the next step. In the next video, I'm going to show you how to create a labeling job. See you in a minute.
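The SNS notification mentioned above is skipped in the video, but it could be attached to an existing team afterwards. The sketch below shows one way to do that with boto3's `update_workteam` call, under stated assumptions: the team name and topic ARN are hypothetical placeholders, and the helper names `build_notification_update` and `apply_notification_update` are made up for illustration.

```python
from typing import Any, Dict


def build_notification_update(team_name: str, topic_arn: str) -> Dict[str, Any]:
    """Assemble keyword arguments for the SageMaker UpdateWorkteam call."""
    # An SNS topic ARN has six colon-separated fields:
    # arn:partition:sns:region:account-id:topic-name
    parts = topic_arn.split(":")
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "sns":
        raise ValueError(f"not an SNS topic ARN: {topic_arn!r}")
    return {
        "WorkteamName": team_name,
        "NotificationConfiguration": {"NotificationTopicArn": topic_arn},
    }


def apply_notification_update(request: Dict[str, Any]) -> str:
    """Send the update to SageMaker. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3

    response = boto3.client("sagemaker").update_workteam(**request)
    return response["Workteam"]["WorkteamArn"]


if __name__ == "__main__":
    # Placeholder values (assumptions), not real resources.
    request = build_notification_update(
        team_name="my-private-team",
        topic_arn="arn:aws:sns:us-east-1:123456789012:labeling-work-available",
    )
    print(request)  # apply_notification_update(request) would call AWS
```

The ARN check is only a basic shape test to catch obvious mistakes early; it does not confirm that the topic exists or that SageMaker is allowed to publish to it.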

Tags

Amazon SageMaker, Ground Truth, Private Workforce, Data Labeling, AWS Marketplace

About the Author

Julien Simon is the Chief Evangelist at Arcee AI, specializing in Small Language Models and enterprise AI solutions. Recognized as the #1 AI Evangelist globally by AI Magazine in 2021, he brings over 30 years of technology leadership experience to his role.

With 650+ speaking engagements worldwide and 350+ technical blog posts, Julien is a leading voice in practical AI implementation, cost-effective AI solutions, and the democratization of artificial intelligence. His expertise spans open-source AI, Small Language Models, enterprise AI strategy, and edge computing optimization.

Previously serving as Principal Evangelist at Amazon Web Services and Chief Evangelist at Hugging Face, Julien has helped thousands of organizations implement AI solutions that deliver real business value. He is the author of "Learn Amazon SageMaker," the first book ever published on AWS's flagship machine learning service.

Julien's mission is to make AI accessible, understandable, and controllable for enterprises through transparent, open-weights models that organizations can deploy, customize, and trust.