SageMaker Fridays Season 4 Episode 3 Detecting cancer cells in medical images

August 24, 2021
Broadcast on 20/08/2021. In this episode, we'll show you how to build a cancer cell detection model using medical images.
More episodes: https://pages.awscloud.com/SageMakerFridays
Notebook: https://github.com/aws/amazon-sagemaker-examples/tree/master/use-cases/computer_vision

Transcript

Hi everybody and welcome to this new episode of SageMaker Fridays, Season 4. My name is Julien and I'm a Principal Developer Advocate focusing on AI and Machine Learning. Once again, please meet my co-presenter. Hi everyone, my name is Ségolène and I'm a Senior Data Scientist working with the AWS Machine Learning Solutions Lab. My role is to help customers get their ML projects on the right track in order to create business value as fast as possible. All right, great. Thanks again for your time. Thanks for helping us prepare this. So this is, once again, a long demo, right? And we have a pretty cool use case today. If you have questions, please ask all your questions. We have friendly and expert moderators who are waiting to help. So please ask all your questions and there are no silly questions. Don't be shy and make sure you learn as much as possible. Okay, that's our only purpose today. So Ségolène, what are we doing today? Last week and the week before, we covered recommendation and fraud detection. I think today we are zooming in a little bit on healthcare and life sciences. So tell us more. Yes, so this week we are going to work on classifying medical images in order to detect cancerous cells. Okay, so pretty serious topic. Pretty serious data sets. And of course, healthcare and generally life sciences are areas where AI and machine learning are making a big difference. Exactly. So it's really cool to get a chance to talk about that today. This is the notebook we're going to use on GitHub. So go and take a look. We'll show it again at the end of the session. So don't worry if you didn't have time to catch this one. Let's dive into the demo and the use case right away. So let me show the actual notebook. So Ségolène, what are we doing today? We're using medical images and we try to predict whether there's a problem in there, but tell us more. Yeah, exactly.
So we are going to use some images, medical images, and we are going to apply a deep learning model on top of these data sets. We are going to be able to detect if there is a metastasis or not in the image. The data set we are going to use comes from the Camelyon16 challenge. The raw data provided by the challenge has been processed into 96-pixel tiles. The original dataset is over 6 gigabytes of data. Okay, so that's pretty big. But in order to easily run this demo, the dataset has been pruned to the first 14,000 images. And of course, it comes included in the repo with this notebook for convenience. We can see some images here on screen. Don't worry, we'll look at the notebook in detail, but just to give you a clear picture, we're trying to learn if those pictures show normal cells or if they show cancer cells. Exactly. And these color images were extracted from histopathologic scans of lymph node sections. As you can see here, each image is annotated with a binary label indicating the presence of metastatic tissue. So label one, metastasis. Probably bad news, unfortunately. And label zero, you're good. So hoping we have lots of zeros, but unfortunately, not always the case. These are small images, by the way. Yeah, you mentioned 96 by 96. So not a lot of pixels. But still, let's see how well we can do. And of course, working with small images is always better when it comes to training time and how much compute you need. Intuitively, you would think, oh, we need high-res, 4K by 4K images. And for some applications, yes, you do need that. Satellite images, very high-res images. But in fact, for most computer vision applications, we can actually get good results with small images and get them quicker. So again, let's see how well we're doing here. So let me go back to maybe the first few cells here. We import some libraries. We download the data set. As you can see, this extract is about 350 megs... 69 megs, actually. So not huge, right?
We download it locally, we extract it. So we see exactly what you said, 14K images, 96 by 96. And of course, three colors, three channels, because red, green, blue, right? So that's the shape of that data. And of course, we have the labels, zero or one. Okay. And then we upload that stuff to S3, right? And we delete the local copy just to save space. But of course, you can explore those images locally if you'd like. So in fact, here, we don't do any data prep. But let's step back. Because of course, here the images are already labeled. In real life, it's not always the case. So let's imagine we work for a hospital or a research lab. We would get plenty of patient images, and then doctors or researchers would actually look at those and say they're okay or no, unfortunately, they're not okay, they show cancer cells, and so they need to be treated as such. Here we have 14,000 images, which is a small dataset for computer vision, but if we had to label all of them ourselves, that's a lot. Who wants to look at 14,000 images and say 0, 1, 0? It's a lot of work. So how can we help customers if they have to do that, if they're working with their own data and they have to label it? Is there a service we can use? Yeah, I think we should use SageMaker Ground Truth. So let's jump just for a second to the SageMaker console, and we see this capability called Ground Truth. And Ground Truth is, let's just go and enter this one. It's been specifically built to help you label images. We're not going to do the demo right now because there's a lot of stuff we want to cover today and we'd like to focus more on training and tuning the model. We'll actually do that later. Probably we'll show you that in a future episode when we revisit the operations aspect of machine learning. And I guess we'll go back to this example. So really quickly, the 30-second whirlwind tour of Ground Truth would be: you can create a job. And as you would expect, your data needs to be in S3, our storage service.
So you would upload your images, your raw images to S3. And then you can create the actual job. We have some built-in task types. We have images and text and video. And yeah, video, why not? And we can also do point clouds, which is really important if you're doing complex things like autonomous driving. 3D point cloud, super cool. But not our purpose today. When it comes to images, we can do image classification. So we can create a task for a team of workers that we create. It could be a private team, employees of the company, or labeling partners, so companies that are AWS partners specialized in labeling. And it could also be Amazon Mechanical Turk if you needed to scale your labeling efforts to maybe millions. In this case, you probably need experts. Expert labelers, right? Because it's not as simple as saying, well, this is basketball or soccer, or yes, there's a human in the picture. This is really very complex stuff, and you don't want to label them wrong. The consequences could be bad. So in this case, you would have a team of doctors or medical students, all the experts you could appoint, and you create that team and you distribute the work. They would go and visualize individual pictures and label them. In the case of classification, you can apply single label, multi-label. So if you're looking for multiple pathologies, you could actually say, I'm seeing this and this and that. You could do bounding boxes if you wanted to do tumor detection, for example, just flagging the bit in the picture where it's not good. You could do segmentation if you wanted to actually identify all pixels of a particular area, etc. So super simple. We'll probably do a demo in a future episode, but that's the kind of tool you would use for labeling. You just need to define your team, upload your data, define the task type, provide some basic instructions.
And then the whole labeling team gets access to a portal where they actually start seeing images and labeling them. And then you get that output in S3. And that's the labeled dataset you can use. Okay. So yeah, very cool thing. But we'll probably come back to that, okay? SageMaker Ground Truth. So we don't need to do this here. We have the labeled images and we don't really need to do data prep. So what we probably need to do is to split, right? To split into training and validation. So we just do that pretty easily using this API. So we have 11,000 images for training, 2,000 for validation, and we keep a test set for benchmarking, 1,000. Okay, so now we have those three datasets. I guess the next question is, how do we feed those images to the algo that we're using? We could very well pass them as a file tree, so to speak, an object tree in S3. So that's possible. But it's a lot of files, right? And for convenience and performance reasons, we don't want to have to deal with thousands of files. And again, this is a small dataset. Imagine millions of images, right? Copying, moving them around, making sure none is missing. The more you have, the more painful it is. Exactly. So one step we're going to take here is we're actually going to pack those images into a single file, a single file per dataset. So we'll have one large file for training with all the images in there, same for validation, same for testing. And that makes it more convenient to move around, to archive, to version, if you're working with different versions of your data. And it makes it easier and more efficient to send to the training instances and maybe to split if you use distributed training, which we might be doing later. If you have one big file, you can split it in chunks and send the chunks to the different training instances instead of sending individual files, where you get a lot of IO overhead working with thousands, potentially millions of files. So this is what we're doing here.
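As a rough illustration of the three-way split mentioned earlier in this passage, here is a minimal sketch using only the Python standard library. It is not the API used in the notebook; the function name `three_way_split` and the seed are assumptions, and the counts are the rounded figures quoted in the episode.

```python
import random

def three_way_split(num_items, num_val, num_test, seed=0):
    """Shuffle item indices and cut them into train/validation/test sets."""
    indices = list(range(num_items))
    random.Random(seed).shuffle(indices)  # fixed seed for a repeatable split
    test = indices[:num_test]
    val = indices[num_test:num_test + num_val]
    train = indices[num_test + num_val:]
    return train, val, test

# 14,000 images in total, with the rounded counts quoted in the episode
train_idx, val_idx, test_idx = three_way_split(14_000, num_val=2_000, num_test=1_000)
print(len(train_idx), len(val_idx), len(test_idx))
```

Splitting indices rather than the images themselves keeps the images and their labels aligned: the same index lists can be applied to both arrays.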
And so we're using a file format called RecordIO, which is actually part of Apache MXNet. There's an equivalent format in TensorFlow. If you're familiar with TensorFlow, you've certainly seen TFRecord files, same story. We pack small instances into one single file. So that's what we're doing here. We're reading all our images in each dataset, as we can see here, training, validation, and test. And we write them in RecordIO format into a single file. We can set extra options like quality and compression. If we have really large images, we can shrink them, etc. And we save images in JPEG format. So it looks a little complex, but that's really what we're doing. Reading images, writing them into a record-structured file. And actually, we get two files. Once we've done that, we can see them here. So for example, we see that we have an IDX file, which is an index file. This is actually a text file that gives you the offset of each image inside the bigger RecordIO file. So that's simple. We see all our training images. If we want to go and grab one particular image, we know where it sits. And of course, the RecordIO file is a binary file with all those images packed into it. So this is generic code. As long as you're feeding it NumPy arrays, it works. NDArray is just the equivalent of a NumPy array in MXNet. But that's pretty compatible. So you can reuse that code as is, I think. Okay. So we do this three times. So now we have our three RecordIO files. We upload them to S3. And of course, the labels are in there as well. And now we're ready. Compared to previous episodes, you know, we don't do data prep, we don't do feature engineering. We just package the images to make them easier to work with. So let's get to the really important stuff now. So which algo are we going to use here? As you see here in the code, we are going to use a built-in algorithm of SageMaker, which is the image classification algorithm.
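To make the idea of a packed binary file plus a text index more concrete, here is a toy sketch in plain Python. It is not the actual RecordIO byte layout and does not use MXNet; the length-prefix scheme and the names `pack_records` and `read_record` are assumptions made purely for illustration.

```python
import io
import struct

def pack_records(records):
    """Pack byte strings into one binary blob plus a text index of offsets."""
    blob = io.BytesIO()
    index_lines = []
    for key, payload in enumerate(records):
        offset = blob.tell()
        # 4-byte little-endian length prefix, then the payload itself
        blob.write(struct.pack("<I", len(payload)))
        blob.write(payload)
        index_lines.append(f"{key}\t{offset}")
    return blob.getvalue(), "\n".join(index_lines)

def read_record(blob, index_text, wanted):
    """Use the index to jump straight to one record without scanning the blob."""
    offsets = {}
    for line in index_text.splitlines():
        key, offset = line.split("\t")
        offsets[int(key)] = int(offset)
    start = offsets[wanted]
    (length,) = struct.unpack_from("<I", blob, start)
    return blob[start + 4:start + 4 + length]

# Stand-ins for JPEG-encoded images
images = [b"fake-jpeg-0", b"fake-jpeg-one", b"fake-jpeg-2"]
packed, index_text = pack_records(images)
print(index_text)
print(read_record(packed, index_text, 1))
```

Because every record has a known offset, a single image can be fetched directly, and the file can be cut at record boundaries when chunks need to go to different training instances.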
This built-in algo is a supervised learning algo that supports multi-label classification. It's called the image classification algorithm. It's a good name. It's going to use a convolutional neural network, specifically ResNet, which can be either trained from scratch or trained using transfer learning when a large number of training images are not available. So, yes, it can be run in two modes: full training and transfer learning, and we will see that. In full training mode, the network is initialized with random weights and trained on user data from scratch. In transfer learning mode, the network is initialized with pre-trained weights and just the top fully connected layer is initialized with random weights. So we're using ResNet, which is a well-known, efficient image classifier. It's been around for a few years now. It works very well. People understand it pretty well. But in this case, do we train from scratch or do we start from pre-trained weights? Because we have two classes, right? 0 and 1, and we have 14,000 images. So I haven't checked how balanced they are. But we have thousands of samples for each class. And the images are not really big. They're not super, super complex. So in this case, we will be starting from a pre-trained model and see if it works. Because if we look at the images once again, we see basic geometrical shapes. We see circles, we see diagonals. Obviously, there's complexity in there. And you could say, well, medical images, these are really, really different from everyday life images that you would find in the ImageNet dataset, which is usually what we use for pre-training. But again, the basic patterns that we see here, like diagonals, circles, ovals, textures, color gradients, these are probably learned by the pre-trained model. So it's worth a shot. Let's do transfer learning. Let's just train for a small number of epochs. Let's see what kind of accuracy we get.
If it's not too good, then I guess we could start from scratch. And again, for demo purposes, we have a small number of images. But if we used millions of images, maybe we could train from scratch. But it's a balance. It's a balance between training time and everything. So here we're going to try. So let's go back to this. And so we use that image classification algo. As you mentioned, it's ResNet. It's built-in. So we just retrieve the container, because training and, for that matter, deployment activity in SageMaker is based on Docker containers. But in this case, it's already there. So the only thing we need to do is grab its name. We need to define the number of training samples that we have, the number of classes, because obviously these are hyperparameters. So let's take a look at those hyperparameters. The first one is the number of layers. And of course, they are available in the doc. So you can go and read them, right? Some are required. Some are optional. So my advice, always the same. Start with only the required ones, get your baseline, and then start tweaking. And then if you don't want to tweak, use model tuning. But we'll see that later. So what did we set here? So 50 layers for ResNet. ResNet 50. It's kind of the mid-size model. You could do 18, I think, is the smallest. And the biggest is 152, I think. Let's check. Number of layers, number of layers. Is it the first one? No, it's not going to be the first one. Come on. I missed, yeah. 50, 152, or 200. Okay, yeah. For transfer learning, we can do from 18 to 200. 200. Okay. And for data with small image size, they suggest selecting 20, 32, 44, 56, 110. Okay, fine. So we're doing transfer learning. So we're kind of in the middle. 18 is going to train much faster, but accuracy could be lower because you have fewer parameters to learn. 200 is going to be much slower, but potentially gives you better accuracy. So let's go with 50, which is a reasonable first attempt. Let's see what we get. To create a baseline.
Yeah, for now, we're trying to get a baseline and then see if we can use classification on this problem. We're trying to get a quick sense of whether there is a machine learning solution to this. And if we get good accuracy, then we can go crazy with the model and iterate again. And iterate on larger networks. So we use a pre-trained model. Yes, it's set to one. So zero means start from scratch. One means weights are pre-initialized. Augmentation type. So tell us about that. What does that do? This hyperparameter in the image classification algorithm is super interesting because the input images can be augmented in multiple ways. What does it mean here, crop_color_transform? You're going to have some random transformations such as rotation, shear, etc. So we're actually creating new samples that are slightly distorted, altered. Exactly, and this will help the algorithm to generalize better, especially since we are dealing with medical images. I think it's a good idea to have this one here because color and lighting can vary, and the area of interest is not always dead in the center. Depending on how that picture was taken, the area of interest could be somewhere in the image. They come from machines, etc. Maybe they've been cropped, maybe that's the interesting bit, or maybe that's the interesting bit. And it's not always going to be nicely in the center for you to examine. So image augmentation is a good way to create slightly more hostile samples that help your model learn better. You make the model train harder, right? Okay. So good one. Image shape, we've already discussed. Number of classes is pretty obvious. Number of training samples, we already know. And then we have a few machine learning parameters. Batch size, so how many do we have in the training set? 11,000. Okay, so 64, that's fine. It gives us hundreds of batches per epoch. It's a starting point. But what if you used 32 or 128? Epochs, you know, five, which is a reasonably low number.
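The "hundreds of batches per epoch" remark is simple arithmetic, sketched below. It assumes the rounded figure of 11,000 training images quoted in the episode, counts a final partial batch as one batch, and ignores how batches are shared between instances in distributed training; `batches_per_epoch` is a name introduced here for illustration.

```python
import math

def batches_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once (last batch may be partial)."""
    return math.ceil(num_samples / batch_size)

NUM_TRAINING_SAMPLES = 11_000  # rounded figure quoted in the episode

for batch_size in (32, 64, 128):
    print(batch_size, batches_per_epoch(NUM_TRAINING_SAMPLES, batch_size))
# 32 344
# 64 172
# 128 86
```

Halving the batch size doubles the number of weight updates per epoch, which is one reason batch size and learning rate are usually tuned together.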
We're doing transfer learning, so we shouldn't be training for long. Maybe 10 is the max, but just a handful of epochs. Learning rate, the hardest of all parameters to pick, so we'll stick with this. Let's check the default value in the doc. Learning rate. I think it's 0.1. Okay. So we actually apply a slightly lower learning rate, which makes sense for transfer learning, right? You don't want to make huge adjustments. You just want to specialize. And precision type, so we're training with 32-bit weights for maximum precision. We could speed up training and reduce model size with 16-bit, but again, not a huge concern. And again, there are plenty of other parameters, but these are the ones that are really important. So now we need to configure the training job. And we're going to use that estimator object. If you've seen previous episodes or if you're already working with SageMaker, you know it's really central. Here we're using the generic estimator for built-in algorithms, but we have specialized estimators for TensorFlow, PyTorch, etc. So we pass hyperparameters. We pass the location of the training container that we found. The role is for permissions, access to S3, access to the container, etc. Infrastructure requirements. So here we are doing distributed training because why not? So we're using two single-GPU instances. P3 is a GPU instance family, and 2xlarge is the entry-level size. So they have one GPU, but we have multi-GPU instances as well. So two instances will collaborate. Volume size is how much storage each training instance gets. By default, you only get five gigs, which might be a little small, so I don't think we need 100, but okay. It's just for safety. Because we need to copy the dataset. We need to save the model. You know, we could be using checkpointing. So we have different checkpoints for the model, etc. 100 gigs is probably a little too much. Max run is the maximum training time. So 10 hours. Again, that's...
We're not going to need this. And then the output path. Where do we save the model? So, algo hyperparameters, estimator parameters for the training job. And we could train right away. So, we could call image_classifier.fit, passing the location of the training set, the validation set, and we would train a single job. And we've done this in the previous episode. But coming back to my batch size and learning rate, we know from experience these have a huge impact, on the image classification algorithm especially. So, feel free to train a single job and see what accuracy you get. And then how do you do better? So you do better with this hyperparameter tuning capability in SageMaker, which is one of my favorite features because it's super easy to use. So for lazy people like me, it's awesome. It does the work for you. You can fire up your tuning job. Pretend you're working and then you can report really good results and you haven't done anything. SageMaker's already done the work for you. So the way this works is super simple. So you define hyperparameters that you want to optimize on, right? And you define ranges. So for example, here I'm saying, hey, I'd like to explore batch size between 16 and 128. So these are discrete values between 16 and 128, integer values. And I want to explore the learning rate, which is a floating-point value. So we use a continuous parameter. And we'd like to explore from 0.001 to 2.0, which are, again, reasonable values when you're fine-tuning. And I can specify that this should be explored using a logarithmic scale, because I'm not really interested in trying 0.0002. I really want to try the different orders of magnitude. So I'm really saying, hey, go and explore those different orders of magnitude, don't go and try 0.0015 and 0.0017, you know, it probably doesn't make sense, especially early on.
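To show what exploring a range on a logarithmic scale means in practice, here is a small sketch that compares log-uniform and plain uniform sampling over the learning rate range quoted above. It is not how SageMaker selects candidates internally; the helper names `sample_log_uniform` and `share_below` are assumptions for illustration, and only the Python standard library is used.

```python
import math
import random

def sample_log_uniform(low, high, rng):
    """Draw a value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

def share_below(samples, threshold):
    """Fraction of samples that are smaller than the threshold."""
    return sum(s < threshold for s in samples) / len(samples)

rng = random.Random(0)
LOW, HIGH = 0.001, 2.0  # learning rate range as quoted in the episode

log_samples = [sample_log_uniform(LOW, HIGH, rng) for _ in range(10_000)]
linear_samples = [rng.uniform(LOW, HIGH) for _ in range(10_000)]

# Share of candidates that land in the smallest decade, 0.001 to 0.01
print("log scale:   ", share_below(log_samples, 0.01))
print("linear scale:", share_below(linear_samples, 0.01))
```

On a linear scale well under 1% of the candidates would fall between 0.001 and 0.01, so the small learning rates would hardly be tried. On a logarithmic scale each order of magnitude gets a comparable share of the attempts (roughly 30% for this decade).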
I want to get a quick sense here of what kind of learning rate I need to use, and then we can keep tweaking, of course, and we can run successive tuning jobs where we zoom in on very high-performance parameter ranges. But here we're just trying to get a sense. And then we define the tuner where we pass the estimator that we created just above, the metric we would like to optimize for, so validation accuracy. Which is the ratio of the number of correct predictions to the total number of predictions made. Parameter ranges, how many jobs I want to run. So here I'm only running 20. No, that's not too much. I'll get back to that. We don't need much more. I'm going to run them two by two to speed things up. You could run them sequentially, but you can speed up a little bit. And a prefix, a base name for the tuning jobs. So before we actually go and run it, let's talk about the strategy here. So what really happens here? So we're going to train those first two jobs. And of course, initially, for the first job, we randomly select values here. And we see what kind of accuracy we get. And then we apply machine learning optimization, so Bayesian optimization, to pick the next set of hyperparameters to try. So we get a second data point, right? And here the data point is really, you know, what accuracy do I get given that pair of batch size and learning rate values, okay? And so you apply optimization to that and you pick another set of parameters. And now you have three data points and four and five. And so gradually you can estimate where to look next. And this is super efficient and generally, it converges quickly. There are other techniques for hyperparameter optimization. There's one called grid search where you explore systematically, you divide those parameter ranges in grids, and you pick values in all the grids, and you systematically explore. So that works, but it usually takes 10x, right?
You can argue on that one, but probably 10x the number of jobs. You get similar results. So it takes longer, costs more. Because I'm not only lazy, I'm very cheap. And so should you be, because it's your money you're spending here. I don't pay my bills, as you know. I pay the other ones. There's another technique called random search where we pick parameters at random. And you think, oh, that sounds completely stupid, except this has been proven to work better than grid search. And one of the authors of that paper is Yoshua Bengio. If the name means nothing to you, he won a Turing Award for AI contributions not so long ago. So I will not argue with that, okay? Not today, not tomorrow. But random search, again, generally takes more jobs. So in a nutshell, Bayesian works pretty nicely. You can try random. There's an extra parameter here that sets the strategy. By default, it uses Bayesian, but you can use random, and that's a good baseline as well. So if you don't buy my "Bayesian works better" pitch, try random, and then go try Bayesian and then ping me. And if random still works better, yeah, ping me anyway. We'll figure out why. So that's how you do it. And then we define our training inputs and then we train. Right. But don't go too fast. I just want to show you that actually running a tuning job is not more complicated than running a standalone job. Configure the estimator, define your ranges, define those few things here. So that's the only extra stuff you need to do. And then you call fit, and then we'll see what happens. Now, yes, you are right. When we set our training inputs, compared to previous examples in the past few weeks, we have this new parameter that says input mode, Pipe. So what does that mean? Default is File mode. File mode copies the training set to the training instances. So whatever we have, we copy to the training instances and they start training. So that's fine.
But if you have a big dataset, even if you have a fast network, if you have multi-gigabyte, tens of gigabytes, you need some time to copy. You need to provision space on the instances, as we've seen with that volume size thing earlier. And let's say you have a terabyte dataset. Do you want to copy one terabyte over the network? No. Especially to multiple instances, because if you have a one terabyte dataset, I'm guessing you're going to distribute it. And do you want to provision one terabyte of storage on each training instance, even though it's not super, super expensive? No. So what you want to do is stream. So you don't download, you stream the data to the training instance. So lots of good things about that. First, we waste no time copying. So training pretty much starts immediately. And even if we have super huge datasets, technically, if you have a petabyte dataset, it works, right? Because you don't need any storage on your training instance. So there's no limitation there. And the dataset doesn't need to be loaded in memory. So you kind of decouple dataset size from storage size and memory size on the training instances. So you can scale out to lots of instances, even smaller instances, because they're only getting small chunks at a time. This is Pipe mode. Yeah. So Pipe mode is super cool. Once you start moving to, let's say, gigabyte-scale datasets, it's a good technique. And RecordIO makes it very simple to split because we have this record structure in the file. So it makes it easy to split and create the chunks and send them to instances. See? It all works. We thought about this. So we call fit. What now? Lots of dots. And yeah, an exclamation mark that says I'm done. So what happened there?
So what happened there is we started running those training jobs two by two, like I said, initially picking random values for hyperparameters and then using the clever optimization techniques to figure out the next ones, and hopefully over time converging to better and better models. So where do you visualize results? You don't see a training log here because keep in mind we did train 20 jobs. So when you have a standalone training job, you get your training log. When you run a tuning job, you don't see the training log in the notebook. It would be super confusing to see those 20 logs. So you can easily use this hyperparameter tuning job analytics, which is a very long name, but a simple object, to actually extract all the results as a Pandas data frame. And here I sorted them by descending objective value. We see that the top job reached 0.90-something validation accuracy, which is a very good starting point. And we can see this is job 13, right? Out of 20, or probably, yeah, it's the 14th, I guess. I don't know. Do we start at zero? Yeah, we start at zero. Okay. So it's the 14th job. So it took a little while to converge. And then the next ones didn't get any better than that. And if we look at the first few jobs, for example, job zero was 87%. And job one was 88%. Job two was, again, close to 88%. Job three, 87%. So exploration, you know, and five, six. Oh, this is better, okay? So exploration. Six got better. Seven was a bad bet. We explored too far in that hyperparameter space. Not so good. Eight, a little better. Nine, a little better. Ten, not great. Eleven got better. Twelve, not so good. But see, step by step. And of course, it's not a linear process. It's just, you know, we kind of explore and take bets on those parameter values. But, you know, we got to 90-plus percent. Always iteration, right? And if we trained a little more and explored and tweaked more hyperparameters, we could certainly do better. Then, of course, we could add more data.
We could try a deeper ResNet. But training time here was, as you can see, very short. It's about four minutes, something like that. Yeah, about four minutes per model. Yeah, you see the seconds here. Three, four minutes. So that's okay. For experimentation, it's pretty good. And of course, we see the winning learning rate, so to speak, was this. And the batch size was this. And of course, we would have never thought of using that, right? I don't think so. I don't think so. Okay. So anyway, we can see the super tiny learning rate doesn't really work. The worst one actually has the lowest learning rate of all. So yeah, we can see, okay, this is too tiny. So maybe we could restrict the range and run another tuning job to explore maybe a more meaningful range. We see the top three models are kind of in the same area. So we may want to explore it. You can also go to the SageMaker console and you can see hyperparameter tuning jobs. And okay, so we see the 20 jobs here. We see the best one. Of course, same results. And so we see all the hyperparameters. And if we want to see the actual training log for this one, you know, we'll find it. Okay. So these are the logs that you would normally see in the notebook, and we see two logs because we have two training instances. And of course, here you can see the log. Checkpointing was actually running automatically. So that's good. So super, super easy to use. So we're almost at the end. So to sum things up, this is how you can easily train a classification model. Summing things up, what did we really do here? We started from a bunch of images, okay? So already labeled. But again, in a future episode, we'll show you labeling with SageMaker Ground Truth. Okay. So no processing in here. With real-life images, probably you would preprocess them. I'm guessing those original images coming out of the medical equipment are high-res. So you could just resize them automatically. And then you split the data.
We use the RecordIO format because we think it's cool. We think it's efficient. It's easier to move the files around. It makes it easy to split them. You could work with your image hierarchies if you wanted. We put them in S3. Configure the estimator with that built-in algo, with the hyperparameters that we saw. And then we could train a single model, but we go with model tuning, and the tuning job runs for about an hour, by the way. And we can go and do something else. Of course, we'll work. We're not slacking. Tuning is just an opportunity to do other interesting things. Exactly. That's who we are. And then we see results. So really not difficult. So if you have your own medical images out there, you know, I'm really hoping we have machine learning engineers and data scientists who work in healthcare and life sciences organizations and companies: go and try it. Start with that notebook, tweak it, tear it apart, add your own images, and see in a couple of hours what kind of results you can get. As you can see, it's really not super difficult. And maybe you can come up with a very impactful result. And yeah, we certainly hope that you achieve that, especially for that kind of application, which is so important for everybody. So let me show you, and yeah, I forgot to say, of course, we could deploy the model, we could automate all of that, and we'll come back to that in future episodes. Let me show you that slide again. Here we go. In a few weeks, we'll come back to automation and we'll show you how to run complete automation of this. Exactly. And we'll talk about deployment and predicting and other things. So once again, this is the notebook we used today. I hope you got your questions answered. And we'll see you next week for another cool use case on recommendation again, but this time for the retail industry. So we hope you're going to join us next time. And Ségolène, thanks again. Thanks again. A really good one, I think. A really important one.
Very important use cases. So go and help people get there. Okay. Until next time, thank you very much for watching and we'll see you soon. Bye-bye.

Tags

Healthcare, Machine Learning, Medical Imaging, SageMaker, Deep Learning

About the Author

Julien Simon is the Chief Evangelist at Arcee AI, specializing in Small Language Models and enterprise AI solutions. Recognized as the #1 AI Evangelist globally by AI Magazine in 2021, he brings over 30 years of technology leadership experience to his role.

With 650+ speaking engagements worldwide and 350+ technical blog posts, Julien is a leading voice in practical AI implementation, cost-effective AI solutions, and the democratization of artificial intelligence. His expertise spans open-source AI, Small Language Models, enterprise AI strategy, and edge computing optimization.

Previously serving as Principal Evangelist at Amazon Web Services and Chief Evangelist at Hugging Face, Julien has helped thousands of organizations implement AI solutions that deliver real business value. He is the author of "Learn Amazon SageMaker," the first book ever published on AWS's flagship machine learning service.

Julien's mission is to make AI accessible, understandable, and controllable for enterprises through transparent, open-weights models that organizations can deploy, customize, and trust.