ML Fridays India: Start your AI/ML projects right (May 2021)
May 28, 2021
Broadcasted live on 28/05/2021. More information on ML Fridays at https://pages.awscloud.com/ml-fridays.html
In this session, we’ll see how you can put your AI/ML project on the right track from the get-go. Applying common sense and proven best practices, we’ll discuss skills, tools, methods, and more. We’ll also look at several real-life projects built by AWS customers in different industries and startups.
Transcript
So in this session, we're going to focus on best practices, no-nonsense advice to put your AI ML projects on the right track. There's a lot of confusion and buzzwords, and it's really not easy to figure out how to get started, right? By the way, please feel free to ask all your questions. I'm trying to keep an eye on the chat, so I'll try to answer questions as we go. And obviously, we will have plenty of time after the session to go through questions. Okay? So, don't be shy. Ask everything you want, and we'll do our best to answer as many as we can.
Okay, let's get started now. The first question is: Is AI real? Is it a trend? Is it a buzzword? What's in it for IT practitioners? Does AI have a massive future in the IT industry? Will it grow? Will it keep going? Yes. This is my prediction. Machine learning is about prediction, so this is mine. If you have another question like that, insert another coin; honestly, you know as much as I do on this. I really don't know about 2030 or 2050 or 2070. I honestly don't think anyone knows. But people like to make those long-term predictions. To be honest, this is not really useful. The better question is: how do we, the builders, developers, project managers, architects, and data scientists, get there? How do we get from "my organization wants to implement AI and machine learning" to having projects in production that run well, solve business problems, and make us happy?
IT has been around for a long time. I always find it a bit surprising that we tend to reinvent the wheel a lot. We tend to forget that our ancestors have been building IT projects for literally 70 years now. It's almost a person's life, right? A lifespan. So it's a long time. If we want to know the future, we should look at the past and see how we've collectively understood and implemented disruptive technologies. If you want to see how AI adoption will look, how good we will be at adopting it, it's useful to see how we've done in the past.
Some of you will remember the 2000s, when we were trying to build web-scale applications, websites, etc. The recollection I have from those days is that a lot of projects sunk and died an awful death because we had no idea what we were doing. A few years later, everyone rushed to add e-commerce to their websites. Again, many of those projects ended up underwater because it was so new, and we had no reference points. We didn't have a lot of best practices and were just trying to build the best we could. Unfortunately, a lot of the projects went bad.
Fast forward five years, and mobile commerce ended up pretty much the same. There was a lot of pain and frustration because of new tools, processes, technologies, business models, and everything. We couldn't get it done right. My favorite was big data. People rushed to spend tons of money on expensive Hadoop clusters, building teams around that, and piling up data all the way to the moon, but not really delivering business value. I made all those mistakes as well, so I got eaten by the shark again and again. Hopefully, you had a better experience with it.
The first conclusion is that Jaws is not really a movie about sharks; it's about managing IT projects. It's the terrifying truth about tech projects. We have confused stakeholders: remember, in the movie, the mayor and business leaders of the town want to keep the beaches open. They don't understand the danger and just want business to move along. But it's not that easy. We have business pressure: keeping tourists coming to the island is critical, business has to move on, and we can't spend too much time thinking or preparing. We have an unprepared team: the sheriff and his deputies have no idea how to deal with shark attacks. We have inadequate tools: the equipment in the movie is rudimentary, and things don't go well. And we have improvised tactics: they literally make things up as they go, and random acts of bravery eventually get the job done, but it's luck more than anything else. We've all been on projects where stakeholders don't know what they want, put a lot of pressure on teams, and new technology makes it hard to figure out how to build. Sometimes we get to the end, but it's frustrating, we burn ourselves out working long hours, and the quality isn't what it should be.
People will always tell you that new technology is different, it's the AI revolution, and old rules don't apply. A lot of people still keep saying that, and I don't want to point fingers. You can make up your own opinion on who they are. I think this is why we keep failing or delivering very painfully every time we deal with new technology—because we forget we've been there before. Even if you're junior in the IT industry or fresh out of school, you can work with people who've been there before and should pay attention to what they're saying. It's not just old people ranting; they've seen this stuff before, even if they've never done AI. There are certain ways of doing things right.
Hopefully, 2020 and beyond is not going to look like that. It's the whole point of this presentation: to look at best practices and ideas that will hopefully save you from the shark. Insanity is doing the same thing over and over again and expecting different results. A lot of people say Albert Einstein said this, but it could have been Mark Twain. Either way, both are geniuses, and whoever said it first is absolutely right. We're all tired of being shark food, right? We want to move away from those negative things on the left. We're tired of that. AI and machine learning is a new cycle in the IT industry, a chance to get away from shark bait and do it better.
Instead of having confused stakeholders, we want to set expectations. We want to replace business pressure with clear metrics showing progress, incremental progress. We don't want to be unprepared. Maybe we don't have all the skills, but we need to know where we are. We need to assess the skills and then pick the right tools for the job according to those skills. We want to use best practices. Traditional best practices for software engineering also apply to ML projects. Instead of running around fighting randomly, we want a clear methodology: iterate, iterate, iterate.
Let's look at all these points one by one. The first one is critical: setting expectations. What is this project about? What are we trying to deliver? You'd be surprised. No offense to our customers, but I still talk to a lot of people who are not quite sure what they're trying to achieve. They want to invest in machine learning for good reasons, but they're not super clear on the question. It's not as easy as it seems. It should be very crisp, literally one sentence on the whiteboard. Nothing more. It shouldn't be five pages of text or a 60-slide presentation. It should be simple, a clear sentence that everyone in the company can understand. It needs to be quantifiable. If your statement is, "We'd like to improve customer retention," well, yes, but by how much? 5%? 60%? Clear, quantified goals.
Machine learning is about data. It's not about whether we will build a model or not. We may not need to build a model; we may reuse models. But what's the data like? Do we have enough data? Do we have any data at all? This is an important fact. Everyone has data. You'll say, "Oh, we have databases, backends, S3, all that good stuff. Tons of data." But how good is it? How relevant to the problem is it? It's not about quantity; it's about quality, making it better over time, curating it. If you don't have enough data or not enough clean data, what's the cost of getting more? Sometimes it's as easy as collecting more weblogs; sometimes it's as hard as labeling complex pictures for computer vision applications.
Another important point is involving everyone early on. Machine learning is too serious to be left to data scientists. It's about solving business problems. You need business stakeholders to educate them on how you're going to build the project and what kind of results they can expect. Setting expectations. You need domain experts to help you understand the problem. Sometimes the problem is straightforward enough that engineers and developers can figure it out. But if you have complex problems in healthcare, chemistry, or finance, you could be a very strong software engineer, but you don't know all the finer points of the domain. You definitely need help here. Of course, you'll need IT because they'll be building, deploying, and monitoring your apps. And data science, if you have a data science team, etc. Everyone around the same table trying to figure out the problem, the best question to answer, and a good metric.
Here are some examples of terrible whiteboard sentences. "We want to see what the technology can do for us." Red flag, alarm, awful. This is guaranteed disaster because you'll be fooling around with software and servers and won't build anything. Even if you build a POC, what is it good for? It's a toy example. You need a solid business problem to work on. "We have tons of relational data. Surely you can do something with it." Yes, maybe, maybe not. It's not about the data; it's about the business question. Data is just there to help you answer the question. "I read this cool article about FUBAR ML. We ought to try it." This is terrible too because it's not about tools or having fun with expensive toys. It's about answering business problems. If you find yourself being handed a goal like that, stand your ground, say no, and dive deeper into the problem to figure out the business question. Why are we even embarking on this project? I cannot overstate how important this is. This is the number one mistake people make.
Once you have a rough business question, or maybe three or four, you need to define metrics. What is the business metric showing success? This is important not only to show success to the team making progress but also to show success to the company and business stakeholders, saying, "Hey, machine learning is actually quite good at solving this problem. We had a positive impact." Technical metrics are nice, but it's about improving business outcomes. It's also very important to understand the baseline. A lot of the time, you'll be improving or sometimes replacing an existing system, which could be human-based or another IT application. You want to do better than the existing process. It's important to understand where you are: what's working well, what's not working well. Once you understand the baseline, you need to understand what's a reasonable but still significant improvement. This can widely vary.
For example, if you want to automatically classify common support tickets using natural language processing and assign tickets to the right support representative based on the topic, you could say, "We have a human baseline, and it's a little bit random because people don't understand the domain very well. Maybe we only get 70% or 80% accuracy on assigning to the right team or person." You could say, "We want to do better than that." If your baseline is 80%, 81% is not a reasonable improvement. It's not enough to justify launching a project. 99% is not a reasonable improvement because you won't get to 99% short term. Find something impactful for the company that shows a positive outcome but is still reasonable. You could say, "We want to improve ticket classification by 5% every quarter." Start, get some results, and keep improving.
There are some red flags to be mindful of. Machine learning is just like every other domain, with a lot of jargon. It's easy to define business or machine learning metrics that are super difficult to relate to. For example, coming back to the support ticket classification, your data science colleagues might say, "The confusion matrix has significantly improved." If you're into machine learning, you know what that means. If you're a software engineer, it's not obvious. If you're a business person, you're completely confused. Is that good? Is that bad? A confusion matrix? How could something called confusion be good? You could say, "Sorry, I didn't understand what you said." They might say, "P90 time to resolution is now under 24 hours." Is that good? Is that bad? They might say, "Misclassified emails have gone down 5.3% using the latest model." Now, you're starting to relate. We want to classify those support emails correctly. If we do a better job, that's interesting. But at the end of the day, you want to hear something like, "The latest support survey shows that very happy customers are up 9.2%." Now, that's really good because you see business impact. Maybe that's your metric, or one of the metrics you're keeping an eye on. That's what you want to see: happy customers who get their problems solved quickly so they can enjoy the service or product they bought. Pay attention to that. It's easy for tech teams to come up with tech metrics that aren't useless in themselves but don't tell the business story. Make sure to have those business metrics as well.
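To make that jargon a little more concrete, here is a minimal sketch, my own illustration rather than anything shown in the session, of what a confusion matrix looks like for the ticket-classification example, using scikit-learn. The team labels and predictions are made up.

```python
# Hypothetical ticket-routing example: which team should handle each ticket?
# Rows of the confusion matrix are the actual teams, columns the predicted teams.
from sklearn.metrics import confusion_matrix, accuracy_score

y_true = ["billing", "billing", "tech", "tech", "tech", "shipping"]      # ground truth
y_pred = ["billing", "tech", "tech", "tech", "shipping", "shipping"]     # model output

labels = ["billing", "tech", "shipping"]
print(confusion_matrix(y_true, y_pred, labels=labels))
print(accuracy_score(y_true, y_pred))  # the simpler "how often are we right" number
```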
The next step is figuring out what you need and what your skills are when it comes to building the project. It's about assessing needs, not wants. I'm an engineer. I love to play with technology and get my hands dirty with all sorts of tools. But is this the right thing to do for the project and the company? Am I shooting myself in the foot by using tech that's too complicated or over-engineered for the problem? Ask yourself, and be super honest: we understand the business problem and the metrics, but can we build a data set describing the problem? Do we have the data? Is it costly to get more if we need more? Do we have a big data or data platform that lets us quickly and efficiently clean and curate that data and build data sets? Is this something we're good at and need for this problem?
For example, if you want to do computer vision on medical images, detecting conditions on medical images, this is a very specific problem. It's not something that's likely to be available off the shelf or that AWS services can do out of the box. You need to collect images from hospitals and medical sources, prepare those images, and keep them anonymized. There are so many problems, but you need to do that because it's very specific data. If you want to build a fun application for children where they upload animal images and you recognize what animal it is, giving them fun facts and educational facts on animals, you could say, "That's got to be available somewhere. Why should I take thousands of pictures of thousands of different animal species?" You need to understand how specific the problem is, how readily available data could be, and what it would look like on your end to work with that data and how much work would be involved.
Once you have a sense of the data you need or don't need, ask yourself if you can write and tune machine learning algorithms. Is this something you know how to do? Is this something you should be doing for that problem? It's not about doing it because it's fun and looks good on your resume. What looks good on your resume is delivering projects that create business value. You could have a long list of technologies, but if you never had a successful project, it doesn't look good. It's about showing business impact. Do you have to do it? If you want to recognize everyday life images or do sentiment analysis on everyday natural language, do you really think it's worth writing your own algo? Those problems are already solved, and there should be limited machine learning work here. If you want to do complex image processing on industrial parts, mechanical parts, or semiconductor wafers, or satellite images, maybe here you need to build a custom solution. Be critical, do your homework, do your research, and find the quickest route to success.
Infrastructure is crucial because data, training models, and deploying models all require infrastructure. Do you need to do it? Do you want to do it? No, you don't want to do it. Trust me, you don't want to manage servers. No company ever wins customers because it has the best-run Docker cluster or the most amazing EC2 instances. It's not about that. It needs to run, scale, and be fast, but it's not the core problem. The core problem is understanding data to answer the business question and building a model that does that right. Infrastructure, not so much. There's a whole spectrum of solutions. At one end, you could say, "I need a fully managed solution. My problem is generic enough, and it's a well-understood problem. I can probably find off-the-shelf APIs or models, and I definitely shouldn't be managing infrastructure." At the other end, you might have a crazy, innovative, totally new problem. You probably need your own data set and algorithms, so you need to find that balance. Be super honest about what the business needs are and then what the technical needs are, and match that with your skills.
Once you know that, pick the best tool for the job, which sounds obvious. I'm sure you've seen this before. It used to be called the Iron Triangle of project management, and here's my version for machine learning: cost, time to market, or accuracy. You get to pick two. For example, if you want a cost-effective and fast option, it's probably not going to be the most accurate. That can be fine if you want to go fast, build a POC, and show some business value to your stakeholders early on. It's enough to get started, but maybe it's not good enough for production. At least you can discover the problem and learn more about it. One important thing to know is that improving accuracy takes increasingly more time and money. It's about diminishing returns. It's reasonably easy to get to 80% accuracy, more work to get to 90%, and a lot of work to get to 95%. Beyond that, every decimal is increasingly difficult and expensive. Some problems won't require 99% accuracy, especially when you look at the associated costs. It's about understanding when to stop, when it stops making sense to improve accuracy with respect to the amount of time and resources invested. It's a gray area, and you need to figure it out.
Another problem I see sometimes is people getting super excited about state-of-the-art models. They read blog posts and research papers and think, "We can use this. It's amazing and this crazy new model from whoever." I would give you a word of caution because state-of-the-art is amazing, but it's hard to work with, understand, and can be complex and costly to tune and live with. Focus on what I call actionable state-of-the-art—stuff you can really use daily. Techniques like transfer learning with pre-trained models, AutoML, are reasonable and actionable tools. Be wary of super fancy models that are probably overkill and too complicated unless you have very strong machine learning and data science skills.
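As an illustration of that actionable state-of-the-art, here is a minimal transfer-learning sketch, assuming TensorFlow/Keras and an image-classification problem with ten classes. This is my example, not code from the session; the pre-trained model, input size, and dataset are placeholders.

```python
import tensorflow as tf

# Start from a network pre-trained on ImageNet and freeze its weights
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet"
)
base.trainable = False

# Add a small classification head for our own ten classes
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# model.fit(train_dataset, epochs=5)  # train only the new head on your own data
```

Because only the small head is trained, this kind of setup typically needs far less data and compute than training a model from scratch, which is exactly why it is practical for everyday projects.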
Best practices are critical. Things are not different this time. AI ML is software engineering. It's not something that lives in an ivory tower or a dungeon where it's a totally different world and none of the best practices apply. Unfortunately, I hear this a lot, although it's been improving lately, probably because more developers are getting into machine learning and bringing in all the knowledge and best practices they've learned over time. Dev environment, test environments, QA, documentation, agile methods, versioning, etc., are all good things that help deliver high-quality projects. It's not okay to say, "Oh, well, give me the data, and I'll talk to you in six months, and maybe I will have a model." Even if they do, you get a black box model, and you have no idea how it works. If you're lucky, you get some results and say, "On my test set, this performs very well." But what about real-life data? It can't work like that. You need to standardize workflows. Machine learning is still a fairly new field with lots of different tools and ways to build and deploy models. Standardizing as you go, as your ML practice grows, is really important, just like standardizing development workflows over time. Onboard all teams. This is not just about data science and machine learning; it's about embedding models into your IT applications, deploying them, monitoring them, measuring their business impact, etc. It's about IT in general, not just data science.
A very important thing is that a lot of machine learning projects tend to be tested in sandboxes, with test data sets, or with A/B testing. That's all good and an important starting point, but the truth is in production. Think of shark hunting in the movie: the barrel idea might have looked good on paper or in the harbor, but once you're out in the ocean and actually start using it, how does that go? Not so well. The same goes for machine learning models. You need to get them into production early and as often as needed, because you need to evaluate those models on real-life data. Real-life data is always different from the data in your training and test data sets. Real-life data is never clean and never exactly right, and there are so many things that can go wrong. You need to figure them out as early and as often as needed to keep your models operating at high quality. To do that, you need continuous integration, continuous deployment, and automation as soon as you can start building them. Generally, all DevOps practices apply to machine learning. If people want to call that MLOps, that's okay, but it's really DevOps all over again. If you've done that before, good. If you're a DevOps engineer, you have a very nice career path into ML. Just learn how to deploy machine learning models and how to monitor them, and you can transition into ML very easily. MLOps is a super hot topic right now because it's still very new, and people realize how important it is to get that stuff right.
Can we get to the poll now? Aditya, can you bring it in? Folks, we will wait for 20 seconds for you all to respond to this poll, and then Julien will continue after the poll. That's a very good question. It's difficult because you want to tick all the boxes. If you ask your boss or give that poll to your stakeholders, they will tick all the boxes. We want all of that. It's your job to say, "I'm sorry, my friends, but it's not that easy. I wish we could do that." Over time, you learn how to get better and do better on those three things. But early on, you have to pick your battles. That's precisely why we thought about this particular poll. What we've learned working with customers across the globe is that the answer to this question will differ from use case to use case. You cannot have an org-level answer for this question; it has to be at the use-case level. That's the thing. There are no hard rules here.
For example, if you're trying to detect early cancer, accuracy is paramount. You don't want to tell people they have a high chance of having cancer if they're perfectly okay, and it's even worse to tell someone they're absolutely fine when they are actually ill. In life-critical scenarios like autonomous driving and healthcare, accuracy is crucial. On the other hand, in the current situation, how quickly vaccines can reach healthcare centers is very important, so speed can matter most. For startups, there's a first-mover advantage in new markets, so you might say, "I want to deploy something next month because we have competition and want to move faster and collect customer feedback faster." Time to market is super important. Accuracy is not so important in the short term. Again, this is why those questions are generally tricky. I'm very wary when people come with pre-built answers to everything. I much prefer to help people realize what the questions are and what they should be looking at, and then they come up with the answer that's right for them. They know their business, and you have to be very humble here and help people understand what they should be thinking about. A lot of people think accuracy is the most important, and I would agree. No one wants to build models that don't predict right. But as you iterate, those things could differ. Initially, you might prioritize time to market for the first few iterations, then work on accuracy a little more.
The last thing is how, once you've got all those things figured out, you actually run the project. Iterate, iterate, iterate, also known as Boyd's Law. It's not new; it dates back to the 1960s, before I was even born. It says that speed of iteration beats quality of iteration. Keep it simple; simple usually wins.
Once you have a model with interesting accuracy, start running A-B tests or real-life data tests. Capture real-life data, inject it into your model, and see how it performs. Talk to domain experts to understand prediction errors and identify missing features in your dataset. Observe prediction errors regularly and decide if you need to fix the dataset, collect more data, tweak the algorithm, or try a different approach. Iterate until accuracy gains become irrelevant to the business problem and the costs of further improvement outweigh the benefits.
The machine learning lifecycle starts with the business problem. Frame the problem, gather and prepare data, engineer features, train and tune models, and measure accuracy. If the initial accuracy is not sufficient, go back to the drawing board and improve. Once you achieve the desired accuracy, deploy the model, serve predictions, and monitor performance. Over time, retrain the model to account for new data. This process involves multiple cycles and requires a cross-team effort, including business stakeholders, domain experts, data engineers, data scientists, machine learning engineers, and IT.
Amazon has been using machine learning for over 20 years. Recommendations on Amazon.com account for about 30% of page views, and logistics, Echo devices, and Prime Air all rely heavily on machine learning. AWS offers a machine learning stack with three layers: AI services, SageMaker, and foundational building blocks. AI services are the easiest to use, providing pre-trained models and APIs for tasks like image and video analysis, speech-to-text, and natural language processing. SageMaker allows you to bring your own code and manage the entire machine learning lifecycle without worrying about infrastructure. Foundational building blocks include optimized machine learning frameworks and EC2 instances for those who want more control.
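To give a feel for how little code the AI-services layer requires, here is a minimal sketch calling a pre-trained natural language processing API, Amazon Comprehend, with boto3. It's my illustration, not code from the session; the region and input text are placeholders, and it assumes your AWS credentials are already configured.

```python
import boto3

# Call the pre-trained sentiment analysis API; no model to train or host
comprehend = boto3.client("comprehend", region_name="ap-south-1")  # placeholder region
response = comprehend.detect_sentiment(
    Text="The delivery was late and the package was damaged.",
    LanguageCode="en",
)
print(response["Sentiment"])       # e.g. NEGATIVE
print(response["SentimentScore"])  # confidence per sentiment class
```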
Over 100,000 customers use machine learning on AWS, ranging from large enterprises to startups, NGOs, and universities. Common use cases include improving customer experience, making business operations smoother, and driving innovation. For example, media intelligence involves extracting insights from images and videos, and document processing involves digitizing and analyzing documents. Predictive maintenance, fraud detection, and forecasting are also popular use cases.
Coinbase uses SageMaker for fraud detection, specifically for authenticating ID documents to prevent fake accounts. PayU, a leading payment gateway in India, uses SageMaker for credit scoring to decide whether to extend credit to customers without traditional credit ratings. These examples illustrate the diverse applications of machine learning across industries.
For predictive maintenance, AWS offers services like Monitron, which includes sensors and a gateway to monitor equipment and detect anomalies. Lookout for Equipment allows you to use your own sensors and data for similar purposes. For more specific needs, you can use SageMaker to build custom models. The key is to start simple, use AI services where possible, and tackle more complex problems with SageMaker.
When dealing with large datasets, AWS provides a range of data services like EMR for data preparation and Glue for ETL workflows. Choose the tools that align with your team's skills and preferences to effectively mine and analyze data. This approach ensures you can efficiently identify important parameters and correlations within your data. It's better to use the right tool for the job and combine tools. Focusing on SageMaker, there's a capability called SageMaker Processing, which lets you run batch jobs on managed infrastructure. It's very simple, just a few lines of code, and you can bring your own code in Python and PySpark. If you use PySpark, you can do distributed processing automatically. This is the simplest option because you don't need to build EMR clusters. You can apply your feature engineering code, and so on. You can also run Jupyter notebooks connected to an EMR cluster for interactive exploration. These are probably the three things I would look at: EMR and Spark if that's your technical culture, Glue for completely automated end-to-end processing workflows, and SageMaker Processing if you prefer to do it within the SageMaker environment, especially if you want to automate the workflow completely, from data preparation to training and deploying on SageMaker. We have a service called SageMaker Pipelines that lets you build these automated pipelines as well.
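As a rough idea of what those few lines of code look like for SageMaker Processing, here is a minimal sketch based on the SageMaker Python SDK. It's my illustration, not code shown in the session; the IAM role, S3 paths, and preprocess.py script are placeholders you would replace with your own.

```python
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.processing import ProcessingInput, ProcessingOutput

# Managed infrastructure is spun up for the job and torn down afterwards
processor = SKLearnProcessor(
    framework_version="0.23-1",
    role="arn:aws:iam::123456789012:role/MySageMakerRole",  # placeholder role
    instance_type="ml.m5.xlarge",
    instance_count=1,
)

# Run your own Python feature-engineering script against data in S3
processor.run(
    code="preprocess.py",  # your script: read the input dir, write the output dir
    inputs=[ProcessingInput(source="s3://my-bucket/raw/",
                            destination="/opt/ml/processing/input")],
    outputs=[ProcessingOutput(source="/opt/ml/processing/output",
                              destination="s3://my-bucket/prepared/")],
)
```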
But again, all three are a good starting point. Use what you know, build on your existing skills, and then figure out if you need to switch to something different. Be pragmatic about this. One thing I love is being aware of the capabilities within your organization because there's a lot of information flowing around in the industry. You need to be cognizant of what is most relevant for your ecosystem and set your goals accordingly. It's very important not to get blown away by what you read or see in the media. Don't just listen to what you hear. Come up with your own conclusions, do your own homework, run your own tests, and factor in your skills and business environment. It's not because you read a blog post or even listened to me that you should follow it blindly. Think for yourselves and build a solution that works in your context, considering business, technical, skills, and cost factors. You need to experiment and figure out what works best for you. This is a good learning in IT in general and certainly applies to machine learning as well.
So with this, we have come to the close of today's ML Fridays session. Julien, thanks a lot for sharing your knowledge with the participants who joined today. I really thank all the participants who took time from their schedules to invest one and a half hours listening to us and how we look at the overall ecosystem and how you should be building your ML ecosystem. I would request you all to please fill out the feedback form. If you have any further questions or want to explore how to implement machine learning within your organization, feel free to reach out to us via the emails you received for this webinar. We'll be more than happy to engage with you and help you on your machine learning journey. So with this, once again, thanks a lot, and I'll close the seminar for the day. Thanks a lot, everyone. Have a good day. Bye-bye.
Julien Simon is the Chief Evangelist at Arcee AI, specializing in Small Language Models and enterprise AI solutions. Recognized as the #1 AI Evangelist globally by AI Magazine in 2021, he brings over 30 years of technology leadership experience to his role.
With 650+ speaking engagements worldwide and 350+ technical blog posts, Julien is a leading voice in practical AI implementation, cost-effective AI solutions, and the democratization of artificial intelligence. His expertise spans open-source AI, Small Language Models, enterprise AI strategy, and edge computing optimization.
Previously serving as Principal Evangelist at Amazon Web Services and Chief Evangelist at Hugging Face, Julien has helped thousands of organizations implement AI solutions that deliver real business value. He is the author of "Learn Amazon SageMaker," the first book ever published on AWS's flagship machine learning service.
Julien's mission is to make AI accessible, understandable, and controllable for enterprises through transparent, open-weights models that organizations can deploy, customize, and trust.