Building, training, and deploying machine learning models with Amazon SageMaker (July 2020)

July 09, 2020
Talk @ AWS Africa Virtual Day, 9/7/2020. An up-to-date overview of all SageMaker capabilities, with an end-to-end demo: building a classifier with XGBoost, using SageMaker Debugger, SageMaker Model Monitor, Real-Time Endpoints, Batch Transform, and Spot Instances in the process.

For more content:
* AWS blog: https://aws.amazon.com/blogs/aws/auth...
* Medium blog: https://medium.com/@julsimon
* YouTube: https://youtube.com/juliensimonfr
* Podcast: http://julsimon.buzzsprout.com
* Twitter: https://twitter.com/@julsimon

Transcript

Hi everybody and welcome to this new session of the AWS Africa Virtual Day. My name is Julien and I'm a tech evangelist with AWS focusing on AI and machine learning. In this session, I'm going to talk about a machine learning service called Amazon SageMaker and how it helps you build, train, and deploy your machine learning models easily and quickly in production. First of all, we need to take a look at the usual machine learning workflow. As you may know, it is pretty complex. It involves a lot of different steps from preparing your dataset, collecting data, cleaning data, building ingestion and ETL workflows, so that you can transform raw data into data that can be used for machine learning. Then, learning from the data and exploring it, you have to select a machine learning algorithm. You're going to use your experience and intuition, but you're also going to experiment a lot. You'll try out a whole bunch of different algorithms that could solve your problem. For example, if you're dealing with a classification problem, you may want to try all kinds of different algorithms for classification, statistical machine learning, deep learning, etc. You need a lot of experimentation to find the algorithms that are most promising. Then you need to train and tune your models. You'll train on the full dataset and hopefully things will work out okay, but sometimes they don't. You may need to debug your models, understand why you're not converging to high accuracy, and even if you are, you want to tune them to squeeze every drop of accuracy from the model. Over that process, you may train tens or hundreds of different models, so it's not easy to manage them. You may waste a lot of time figuring out where a job you ran three days ago is, or the one with a good hyperparameter combination, etc. You need a lot of infrastructure, tools, and time spent building that is time not spent on the machine learning problem itself. 
By the time you get to a model that performs correctly, you need to deploy it in production. That's where the real problems start because you need to monitor the model, ensure it's predicting correctly and fast enough. If it's connected to business applications or end-user applications and it starts behaving weirdly, problems can escalate quickly. You need to make sure it's all predicting fine and generally keep an eye on your production environment, scale it, manage it, etc., just like you would for a normal web application. As you can see, there's a lot of stuff here. Some of it is data engineering, some is proper data science and machine learning, and a lot is managing infrastructure, building and running training clusters and prediction clusters, and making sure everything works. That's all right, but if you're trying to build a machine learning model, you really want to focus on the machine learning task itself. This is why we built Amazon SageMaker. We built SageMaker to help customers get from early experimentation to scalable production as quickly as possible with minimal fuss and minimal time spent on anything that's not machine learning. Over time, SageMaker has grown into a large service with many new capabilities, and in this session, I'm going to give you an overview of all those capabilities. I won't be able to cover all of them because there are so many and more are added every week, but I'll try to give you an overview and share some pointers and resources on where to go next. At the top of this slide, you see a web-based IDE called SageMaker Studio, which was released at the latest re:Invent. It is a web-based IDE for machine learning, and I'll be using it in the demo later on. We have features and APIs to let you go from collecting, preparing data, choosing an algorithm, to training, tuning, deploying, etc. All those steps I talked about earlier are integrated into SageMaker to make it as easy as possible to go from A to Z. 
You can see SageMaker as a modular service and a modular set of APIs. If you want to go from experimentation to production with SageMaker, it's great. But maybe that's not what you need. Maybe you need a scalable solution for training and want to train on SageMaker and deploy elsewhere, like IoT devices. That's fine. You can train on SageMaker, grab the model, and deploy it elsewhere. Conversely, you can import an existing model to SageMaker, a model you're happy with, and deploy it at scale for your users in the cloud. So it's a collection of capabilities. Just pick the ones you need for your business use case and ignore the rest. If you're starting from scratch or looking for an end-to-end solution, you can do that on SageMaker as well. Let's go through those different steps. We'll start with data preparation. Depending on your dataset, preparing data can be a bit of work or a ton of work. If you're working with applications like computer vision or natural language processing, datasets tend to be very large and complex, and you'd be on the tons of work end of the spectrum. For example, in autonomous driving, you need to go through tens of thousands, potentially hundreds of thousands, or more images and annotate them. On the right-hand side, you see annotated images, and this is a process called semantic segmentation where we assign every pixel in the image to a specific instance or class. The pink stuff is the road, and the yellowish-greenish stuff is vegetation, etc. You have to do this by hand, so imagine how much time it would take to annotate one image, multiply that by 10,000 or 100,000, and you see the problem. Annotating datasets for natural language processing, entity extraction, sentiment analysis, etc., is also a lot of work because you're likely dealing with hundreds of thousands of different samples. It's not something you want to do manually if you can avoid it. 
We built a capability called SageMaker Ground Truth, which is fully integrated into SageMaker. You can build workflows where, starting from a dataset in Amazon S3, you distribute samples to be annotated to a workforce, which could be a private workforce (people from your company) or third-party vendors vetted by AWS. You can also use a public workforce through Amazon Mechanical Turk, where you can distribute work to potentially hundreds of thousands of people for large-scale annotation. You can also use a feature called automatic labeling, where, in parallel with human annotation, you train a machine learning model that learns from the human annotations. When the model performs as well as humans, it starts labeling at scale, which is faster and cheaper. The combination of human labelers and machine learning labeling helps you label potentially millions of samples in relatively little time and at a much lower cost. This week, we also released a capability to label 3D data, 3D point clouds, typically used for autonomous driving with LiDAR datasets and generally 3D datasets. So there are lots of possibilities here. Not everyone works with such complex datasets, but everyone has to do some sort of processing on datasets. We added a capability called SageMaker Processing, which makes it very easy to run batch jobs for tasks like feature engineering, data cleaning, model evaluation after training, etc. You can bring your own code and run it on your dataset or models in a fully managed way on fully managed infrastructure. All those jobs that come before or after model training can be run easily on SageMaker Processing. It's a convenient way to handle the different tasks around training. Now, let's talk about building models. The first step is experimentation, figuring out which algorithm is a good candidate for the problem you're trying to solve. A popular way to do this is to use Jupyter notebooks.
SageMaker includes notebook instances, which are fully managed EC2 instances from the very small ones to the really large multi-GPU ones. They come pre-installed with tools like Python, open-source libraries like TensorFlow, PyTorch, Scikit-Learn, MXNet, etc. We also have beta support for R. You can experiment with these fully managed notebooks using data stored in S3 and any AWS SDK or Python library. Security is important, so we have features to run instances inside virtual private clouds, encrypt local storage, deny internet access, etc. You can lock down the security configuration of those instances, and it takes just a few minutes to create one. The newer and preferred way to do the same is to use SageMaker Studio, the machine learning environment I mentioned. It's still based on Jupyter notebooks, but you don't have notebook instances anymore. You go through the quick start for Studio, open your environment, and select different sizes without provisioning and managing instances. It's a smoother experience. It also adds features like collaboration, where you can easily share notebooks with colleagues. You can send a link, and they can open it and jump straight into Studio to see your notebook and give you advice. We also have features for experiment management integrated with SageMaker Experiments, which makes it easy to manage hundreds or thousands of different jobs, whether it's processing data or training models. We have integration with SageMaker Autopilot, an AutoML capability for a no-code, zero-code experience to build a model. Generally, you have GUI integration with many SageMaker capabilities for model monitoring, deployment, etc. Some people prefer APIs and writing every line of code, while others like shortcuts and GUIs. SageMaker Studio lets you do both. Here's a screenshot from Studio. It's based on Jupyter, and you'll be very familiar with it. On the left-hand side, the icon toolbar has integrations with all those extra SageMaker capabilities. 
You can import any library and work the way you're used to, but with a friendlier and easier way to access SageMaker capabilities. When it comes to training and building models, we have different options. Let's go from the easiest to the more advanced. The easiest is to avoid building and training a model altogether. You can save time and money and go straight to production. This is what the AWS Marketplace for machine learning is all about. You may be familiar with the AWS Marketplace for EC2, where we have thousands of third-party software prepackaged and ready to deploy on EC2 instances. We've done the same for models. We have hundreds of models for natural language processing, computer vision, and all kinds of things, including very advanced models. You can find a model that looks like the problem you're trying to solve, deploy it on SageMaker in a few clicks, give it a try, and if it solves the problem, you're done. It's a good place to start your proof of concept and learn more about the problem before deciding if you need to build something custom. The second easiest option is SageMaker Autopilot, an AutoML capability. You bring your data to S3, and we support tabular data for regression and classification problems. You can either click a few times in Studio or call one API in your notebook, and SageMaker Autopilot will automatically inspect the data, prepare feature engineering scripts, build candidate pipelines, launch training jobs, and optimize hyperparameters to get high-accuracy models. If you use Studio with Autopilot, it's a zero-code experience. If you need to build your own model and get more involved, you can select algorithms from three categories. On the left, we have built-in algorithms. We have 17 algorithms today, implemented as Docker containers. You select the algorithms you like, configure the data location, parameters, and train. You don't need to write actual machine learning code; just simple Python code to get everything going. 
You can also use built-in frameworks like TensorFlow, Keras, PyTorch, etc. We have containers for you, which are open source, so you don't need to build and manage them. You bring your own code, like PyTorch or Scikit-Learn, with minor modifications to run inside those containers. A feature called script mode lets you interface your own code with the container, making it super simple. You can train and deploy anything here. Finally, if you're using R, C++, custom Python, or anything else, you can build your own training and prediction containers and integrate them with SageMaker following simple guidelines. As you can see, there are many options. You can select an off-the-shelf model, use AutoML to build one, or get more involved by selecting an algorithm yourself, using a built-in algorithm, bringing your own code, or building your own container. The common groundwork is that whatever you use, the infrastructure is always fully managed. Training infrastructure is created on demand and terminated automatically, so you never leave anything on and doing nothing. You can use spot instances for training and get typically 60-70% discounts on training costs. For reference, here's a list of algorithms. The orange ones are for supervised learning, the yellow ones for unsupervised. You can see a mix of classical machine learning problems like regression, classification, etc., and some computer vision algorithms for classification and detection. We have algorithms for natural language processing, anomaly detection, etc. Chances are your problem is close to one of these. I recommend looking at the built-in algorithms. They can save you a lot of time, especially if you don't have machine learning skills. We have lots of sample notebooks to get you started, which is a good place if you don't have a lot of machine learning experience. When it comes to frameworks, we have built-in containers for training and prediction that are open source. 
You can grab them, build them, run them, customize them, and understand exactly how they're built. You can also use a feature called local mode where you can train and predict on your local machine. This is useful in the early stage of the project when you're experimenting and iterating quickly and don't want to wait for managed infrastructure to come up or pay for it. You can reuse local mode and train on your local machine, which could be your laptop, dev server, or a notebook instance in SageMaker. You wouldn't be firing up or managing infrastructure, and you wouldn't be paying for it. It's very useful in the early stage where you only need to experiment with a fraction of the dataset and can scale out easily to managed infrastructure. I mentioned script mode, which is about bringing your code, adding a few lines to read hyperparameters and save the trained model in the right place. You can run any framework code thanks to script mode, making it very easy to do. I quickly mentioned Autopilot. Autopilot is a recent capability for AutoML. Just bring your tabular data to S3, select the column you want to predict, and that's it. You can tell Autopilot, "This is my CSV file in S3. This is the column you need to learn." If you know what kind of problem you're dealing with, you can specify, "Build me a multi-class classification model." But you can also say, "Just figure it out." Feature engineering, training, model tuning, and all those steps are covered, but it's not opaque. You get full visibility and control because you can read and run auto-generated notebooks on candidates. Candidate definitions, training pipelines, and tuning pipelines are fully visible inside those notebooks, and you can run them yourself to understand exactly how data was processed and how the model was built. You can keep tweaking if you want. Once you have a model, if you used Autopilot or the marketplace, you're pretty much done. 
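As a rough illustration of the script mode idea mentioned above, here is a minimal sketch of the plumbing in a training script: hyperparameters are read as command-line arguments, and the trained model is written to the directory the container designates. The hyperparameter names, the `model.json` file name, and the placeholder "training" step are assumptions made for illustration. `SM_MODEL_DIR` and `SM_CHANNEL_TRAINING` (for a channel named `training`) are the environment variables set inside the SageMaker framework containers; the local fallbacks exist only so the sketch also runs outside SageMaker.

```python
import argparse
import json
import os


def parse_args(argv=None):
    """Read hyperparameters from the command line and locations from the environment."""
    parser = argparse.ArgumentParser()
    # Hyperparameters are passed to the script as command-line arguments.
    # These two names are hypothetical examples.
    parser.add_argument("--epochs", type=int, default=10)
    parser.add_argument("--learning-rate", type=float, default=0.1)
    # Inside the container these environment variables point to the model
    # output directory and the training data channel. The defaults are
    # assumptions that allow a local run.
    parser.add_argument("--model-dir", default=os.environ.get("SM_MODEL_DIR", "./model"))
    parser.add_argument("--train", default=os.environ.get("SM_CHANNEL_TRAINING", "./data"))
    return parser.parse_args(argv)


def train(args):
    """Placeholder training step: records the hyperparameters as the 'model'."""
    model = {"epochs": args.epochs, "learning_rate": args.learning_rate}
    # Save the model where the container expects to find it.
    os.makedirs(args.model_dir, exist_ok=True)
    path = os.path.join(args.model_dir, "model.json")
    with open(path, "w") as f:
        json.dump(model, f)
    return path
```

A real script would call `train(parse_args())` under an `if __name__ == "__main__":` guard and replace the placeholder with actual framework code.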
But if you went for a built-in algorithm, framework, or your own container, you need to train and tune. The SageMaker API is reasonably simple. All training and deployment activities are done using a Python SDK, which we call the SageMaker SDK. I call it a high-level SDK because the objects you deal with are algorithms, training jobs, deployment jobs, etc. You don't deal with servers, VPCs, SSH keys, or infrastructure objects. For experimentation, this is the way to go, and it's what I'll be using in the demo. As a side note, there is another SDK for Spark, which is Python and Scala. It lets you run SageMaker jobs directly from your Spark code, combining the best of Spark and SageMaker for training at scale with your own code. You can also use any language SDK, typically Boto3, the Python SDK for automation and scripting. These are service-level and sometimes infrastructure-level APIs, so I wouldn't recommend them for experimentation because they give you full control but add complexity. The SageMaker SDK, the Python SDK, is what you should use for everyday work, experimentation, training, etc. When you need to tweak every setting or automate and need full control, Boto3 is interesting. SageMaker Experiments is how you manage those hundreds or thousands of jobs. Even a simple project will involve preprocessing data, using SageMaker Processing, processing different versions of datasets, training many jobs, and tuning them. You'll run cross-validation, model evaluation, etc., potentially hundreds of jobs. To make it easier to organize, search, and compare them, we built SageMaker Experiments. You can organize your projects into trials, which are collections of related jobs. You can log all the data associated with those jobs and explore the information using SageMaker Studio or the Experiments SDK. This helps you find that cool job you ran three weeks ago and compare experiments. We've talked about model tuning a few times. 
Model tuning is important because it helps you find the best hyperparameters automatically. If you try to do this manually, you'll waste time and never hit the top spot. Automatic model tuning is very easy to use. You define parameter ranges you want to explore and fire up a tuning job. SageMaker uses machine learning algorithms, like Bayesian optimization and Gaussian process regression, to find the best parameters. This is part of SageMaker Autopilot, but you can also run it yourself. Once you've built your TensorFlow model, for example, you can launch a tuning job to find the optimal parameters. Just define ranges, the metric you want to optimize for, and let SageMaker do its thing. SageMaker Debugger helps you understand what's going on inside your model. Some algorithms are complex, and people complain that models are impossible to explain. Why is my training job going wrong, or why is this model predicting this way? It's difficult to figure out. SageMaker Debugger runs debugging rules that inspect your training job as it goes. For example, is my loss decreasing? Do I have exploding gradients or vanishing gradients? These rules can detect issues that can ruin your training job. You don't need to wait three days to see if it's a bad job. As the training job progresses, SageMaker Debugger applies those rules to inspect the model state and ensure everything is okay. If not, it stops the job and alerts you. We have built-in rules, and you can build your own. It also lets you save model state, like weights and gradients, and every tensor available inside the model. Metrics and losses are saved to S3, and you can inspect them as the job runs. This helps you visualize the data and understand if things are right or what went wrong. Once you have a model you like, it's time to deploy it. The most popular option is to use a real-time endpoint, an HTTPS endpoint backed by fully managed infrastructure that you can invoke with HTTP POST data to get predictions. 
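To make the tuning workflow described above more concrete (define ranges, pick a metric to optimize, launch several jobs, keep the best), here is a minimal sketch. Note the swap: it uses plain random search, which is much simpler than the Bayesian optimization the talk attributes to SageMaker, and each "training job" is just a call to a hypothetical `objective` function. The parameter names in the usage example are illustrative only.

```python
import random


def random_search(objective, ranges, max_jobs=20, seed=0):
    """Try max_jobs random hyperparameter combinations and keep the best one.

    objective: function taking a dict of hyperparameters, returning a score
               to maximize (stands in for a training job's metric).
    ranges:    dict mapping a hyperparameter name to a (low, high) tuple.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(max_jobs):
        # Sample one value per hyperparameter, uniformly within its range.
        params = {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

For example, `random_search(lambda p: -(p["eta"] - 0.3) ** 2, {"eta": (0.0, 1.0)})` returns the sampled `eta` closest to 0.3. A Bayesian approach would use the scores of earlier jobs to choose the next combination instead of sampling blindly.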
Deploying a real-time endpoint is one line of code with the SageMaker SDK. There's an API called deploy, and that's it. You can set up auto-scaling if needed. The endpoint will stay up until you explicitly delete it. Another option is batch transform. For some customers, real-time prediction is not needed. They might need to predict 10 gigabytes of data once a week, so they use batch transform to process that batch of data and put results in S3. This is also one line of code and very easy to do. You can also export the model. If you have a company policy that models need to be deployed to container services, the model is stored in S3 and is a standard model. You can grab it from S3 and deploy it anywhere you like, whether it's a container service or your laptop. All these combinations work at different points in a project. Maybe early on, you want the model on your laptop for testing, and for production, you want real-time endpoints or containers. SageMaker gives you the freedom to do this. The only difference is that if you work with real-time endpoints and batch transform, you're running on fully managed infrastructure, so no worries, one line of code, and all infrastructure work is handled. If you're working with other solutions, you have to deal with infrastructure. The final one I want to mention is Model Monitor. As the name implies, it's a monitoring capability for endpoints. It helps you capture data sent to the endpoint and predictions returned by the endpoint. You can set a sampling percentage, like 10%, and it logs to S3 the data sent and predictions returned by the endpoint. This is useful for running analytics or capturing real-life traffic for testing. You can also enable monitoring to look for violations in data, such as data that doesn't look like the data used for training. This could be buggy data with missing features or mistyped features, or more subtle issues like data drift, where the statistical distribution of a feature differs from what the model saw during training.
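The two kinds of violations just described, missing features and data drift, can be sketched with a deliberately simple check. This is not how Model Monitor works internally: the rule used here (flag a feature when its mean in captured data moves more than `threshold` baseline standard deviations away from the training mean) is an assumption chosen for brevity, and the feature names in the usage note are made up.

```python
import statistics


def find_violations(baseline, captured, threshold=3.0):
    """Compare captured endpoint data against a training baseline.

    baseline, captured: dicts mapping a feature name to a list of numeric values.
    Returns a list of (feature, reason) tuples.
    """
    violations = []
    for name, base_values in baseline.items():
        live_values = captured.get(name)
        if not live_values:
            # The feature is absent from the captured traffic.
            violations.append((name, "missing feature"))
            continue
        base_mean = statistics.mean(base_values)
        base_stdev = statistics.pstdev(base_values)
        shift = abs(statistics.mean(live_values) - base_mean)
        # With a constant baseline, any shift at all counts as drift.
        drifted = shift > 0 if base_stdev == 0 else shift / base_stdev > threshold
        if drifted:
            violations.append((name, "drift"))
    return violations
```

For instance, with a baseline of `{"age": [30, 40, 50]}` and captured data of `{"age": [80, 85, 90]}`, the function reports `("age", "drift")`. A production check would compare full distributions rather than means alone.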
Data drift can break the assumptions made during training and is hard to detect. Model Monitor captures data and keeps an eye on it, generating violation reports and alerting you that the data you're receiving now is not similar to the data you trained on. You can set alerts, etc. Let's do a demo. I'll switch to my browser. Here I'm using Studio. If you're curious how to do that, go to the SageMaker console and select a region where Studio is available. Click on Studio, and there's a quick start process to create your Studio user. It takes a few minutes, and then you can open Studio, which runs in the browser. It's Jupyter as usual, and you can launch notebooks and use many features. Let's run the notebook now. This is a direct marketing dataset with about 41,000 samples, and it's a binary classification problem where we're trying to predict if a customer will accept a marketing offer, yes or no. We download the dataset, extract it, and take a look. It's a CSV file, but it's not pleasant to look at. We can use the pandas library for a nicer visualization. We open the CSV file, display the first few lines, and see features like age, job, marital status, education, etc. We have 21 columns and a little more than 41,000 samples. The last column, called Y, is the label, indicating whether the customer accepted the offer. This is an unbalanced dataset, with more no's than yes's, but we have some yes's. The ratio is almost 8 to 1, so it's unbalanced, which could be a problem. Now we need to do some basic transformations. I'll go a bit faster because we don't want to focus too much on feature engineering. You'll get this notebook, and the link is on the previous slide. I'm removing placeholder values and merging some categories like students, retired, and unemployed, creating a new column called "not working" to help the model understand the relationship.
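Two of the steps above, checking the class ratio and adding the "not working" column, can be sketched in a few lines. The talk uses pandas for this; the sketch below uses plain dictionaries so it has no dependencies. The column names `job` and `y`, the label values `yes` and `no`, and the exact job strings are assumptions about how the CSV is encoded.

```python
from collections import Counter

# Job categories merged into the new "not working" flag (assumed spellings).
NOT_WORKING_JOBS = {"student", "retired", "unemployed"}


def add_not_working(row):
    """Return a copy of the row with a 0/1 'not_working' column derived from the job."""
    updated = dict(row)
    updated["not_working"] = 1 if row["job"] in NOT_WORKING_JOBS else 0
    return updated


def class_ratio(rows, label="y"):
    """Return how many 'no' samples there are for each 'yes' sample."""
    counts = Counter(row[label] for row in rows)
    return counts["no"] / counts["yes"]
```

On a dataset with eight "no" rows for every "yes" row, `class_ratio` returns 8.0, which matches the imbalance mentioned in the talk. The function assumes at least one "yes" row is present.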
I get rid of categorical variables like job and education using one-hot encoding, where a single column is replaced by as many columns as we have different jobs. For example, job admin, job blue-collar, job entrepreneur, etc. We have one column per job type and flag the actual job for that customer. We do the same for marital status and education, increasing the number of features and making the dataset harder to understand for humans but easier for the algorithm. We end up with 66 features, split the dataset into training, validation, and test sets, and save the three splits to CSV files. If you didn't like this data preparation phase, I understand. This is why we built SageMaker Autopilot. If we were using Autopilot, we wouldn't do any of this. We would bring the CSV file to S3, call an API, and say, "Hey, we want to predict the Y column. Go and build me a model." In the repo that contains this example, there's an Autopilot example as well, but I want to show you SageMaker in more detail. Now that we're done preparing the data, we need to upload it to S3. We upload the training set, validation set, and test set to an S3 bucket and get three S3 URIs for those files. Now we can get to training. Using the estimator object from the SageMaker SDK, which is the generic object for training jobs, I configure everything here. We create a new estimator and the first parameter is the algorithm we're going to use, XGBoost, one of the built-in algorithms. We get the container name for XGBoost in the region we're running in and specify XGBoost 1.0. We need a role, which is a collection of permissions allowing SageMaker to read and write objects in S3 and grab the container. When running on notebook instances or SageMaker Studio, you can use the built-in role created when setting everything up. The session can be ignored. This is the input mode, which says, "Please copy the dataset to the training instance." It's a very small dataset, so there's no problem copying it.
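The one-hot encoding step described above can be sketched without any library. In practice a helper such as pandas `get_dummies` would typically do this; the hand-written version below is only meant to show the transformation, and the `column_value` naming of the new columns is an assumption.

```python
def one_hot(rows, column):
    """Replace a categorical column with one 0/1 column per distinct value.

    rows: list of dicts, one per sample.
    """
    # One new column per category, in a stable (sorted) order.
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Keep every other column unchanged and drop the original one.
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            # Flag the category that applies to this sample.
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded
```

Applied to `[{"age": 30, "job": "admin"}, {"age": 45, "job": "blue-collar"}]` with `column="job"`, this yields `job_admin` and `job_blue-collar` columns with a single 1 in each row, which is how a handful of categorical columns grows into dozens of features.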
The alternative to copying the dataset is a feature called pipe mode, where we stream the dataset to the training instance, or instances if using distributed training. Pipe mode is great for large datasets. If you have gigabytes or more, you don't want to copy that data to training instances, so you can start training immediately by streaming data. The output path is where we save the trained model. These are our infrastructure requirements. This is as much infrastructure as you'll deal with in SageMaker. If you're a little scared of instances, VPCs, security groups, SSH keys, and subnets, fear not because they're all gone unless you go into advanced configurations. We train on one m4.xlarge instance, a CPU instance. If we needed distributed training for a larger dataset, we would say, "Hey, give me 10 instances," and SageMaker would take care of everything. No infrastructure work at all. Next, we have spot instances, a well-known technique to optimize costs for EC2 instances, also available for SageMaker training. You tap into unused capacity and get typically 60-70% discounts. The trade-off is that if we need to reclaim that capacity for on-demand instances, you get a two-minute notification, and the instance is terminated. On SageMaker, things are simple. We still reclaim instances if needed, but SageMaker automatically restarts the training job. If you set up checkpointing, SageMaker will restart from the latest checkpoint. TensorFlow checkpoints automatically, and some built-in algorithms checkpoint automatically. XGBoost doesn't, so if you have a checkpoint, we'll restart from the latest one. If not, we just restart the job. We'll save some money here. Finally, we have SageMaker Debugger. The first chunk is data collection, where we ask SageMaker to store metrics and tensors in S3. We'll see training accuracy, validation accuracy, etc., and save at every step. It's not a lot of data, and it's not a complex model, so we can save everything.
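The restart-from-checkpoint behavior described for Spot training can be illustrated with a toy loop. Everything here is a stand-in: the checkpoint is a small JSON file holding a round counter, and the `interrupt_at` parameter is a hypothetical way to simulate capacity being reclaimed. It is not how SageMaker or any framework stores checkpoints.

```python
import json
import os


def train_with_checkpoints(total_rounds, checkpoint_path, interrupt_at=None):
    """Run a toy training loop that resumes from the last saved round.

    Returns the number of rounds completed when the function exits.
    """
    start = 0
    if os.path.exists(checkpoint_path):
        # A checkpoint exists: resume instead of starting from scratch.
        with open(checkpoint_path) as f:
            start = json.load(f)["round"]
    for current in range(start, total_rounds):
        if interrupt_at is not None and current == interrupt_at:
            # Simulate the instance being reclaimed before this round runs.
            return current
        # One round of real training would happen here.
        with open(checkpoint_path, "w") as f:
            json.dump({"round": current + 1}, f)
    return total_rounds
```

Calling the function once with `interrupt_at=4` and then again without it completes the remaining rounds only. Without a checkpoint file, the second call would start again from round 0, which is the "we just restart the job" case.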
If you work with deep learning models, you might cut down on the amount of data saved by Debugger with a longer save interval. We're also asking for feature importance, a cool way to know which features matter most for prediction. XGBoost also supports SHAP values if you're into that. You can save the average SHAP and full SHAP values, but we'll stick to feature importance. The second chunk is rules. We can configure rules to inspect our jobs, and here we're checking for class imbalance. We know the dataset is imbalanced about 8 to 1, which is not terrible, but we set up a rule just to show you how to do it. This is a built-in rule, but you can bring your own rules. All this configuration is where we set up the algorithm, how data is sent to the training instance, how much infrastructure we want to train on, use Spot to save money, collect tensors, and apply rules. We set hyperparameters based on the documentation. We want to build a binary classification model and use the AUC metric, area under the curve, which is good for classifiers. We train for 100 rounds and set early stopping at 10 rounds. If the AUC metric stops improving for 10 rounds, we cut the training to save money and avoid overfitting. We call fit and pass the S3 data parameter, which is the location of our data in S3. We set the training channel and validation channel. The training job fires up, launching the m4.xlarge instance we requested. We also see the debugging job fire up, which inspects the model while it's training. We download the input data to the instance, which takes about two minutes. We download the XGBoost container image to the instance and start training. We see the training log in the notebook and CloudWatch logs, showing metrics improving. We train for a total of 57 seconds but only pay for 12 seconds because of the spot instance. We saved 78.9% by using Spot. If you have short jobs unlikely to be interrupted, using Spot training is mandatory. Look at those costs.
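The early stopping setting described above (train for up to 100 rounds, stop if validation AUC has not improved for 10 rounds) follows a simple rule that can be sketched independently of XGBoost. The function below is an illustration of that rule only; it takes a list of per-round validation AUC values, which in a real job would come from the training log, and the values in the usage note are made up.

```python
def early_stopping_round(validation_auc, patience=10):
    """Return (rounds_trained, best_auc) under an early stopping rule.

    Training stops once the metric has failed to improve on its best
    value for `patience` consecutive rounds.
    """
    best = float("-inf")
    rounds_without_improvement = 0
    for round_number, auc in enumerate(validation_auc, start=1):
        if auc > best:
            # New best score: reset the patience counter.
            best = auc
            rounds_without_improvement = 0
        else:
            rounds_without_improvement += 1
            if rounds_without_improvement >= patience:
                return round_number, best
    # The metric kept improving often enough: all rounds were used.
    return len(validation_auc), best
```

With `validation_auc=[0.70, 0.75, 0.74, 0.73, 0.72]` and `patience=2`, training stops after round 4 with a best AUC of 0.75. Stopping early both shortens the billed training time and limits overfitting, as noted in the talk.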
As we configured SageMaker Debugger, we saved tensors. We can ask for the S3 location of those tensors and use utility code for plotting. We can grab collections and tensors inside a collection and plot them with matplotlib. We can plot the metrics collection, which has two tensors: training AUC and validation AUC. We can also plot feature importance, showing the weight of each feature. The most important feature is job, followed by housing, which makes sense. If you have a good job and a nice house, you're more likely to accept a marketing offer. We can deploy the model very easily by calling the deploy API. We deploy it to one m4.xlarge instance and capture data using SageMaker Model Monitor. We capture 100% of all requests and responses and store them in S3. We wait for the endpoint to come up and predict by calling the invoke endpoint API and sending some CSV data. We get predictions between 0 and 1, and we can set a threshold, like 0.5, to decide yes or no. We can also look at the capture path and see JSON Lines files showing data sent to the endpoint and predictions output by the endpoint. We can set up violation reports, etc., which you can see in the sample notebooks. Batch prediction is just as easy. We call transform and pass the location of the data in S3. We fire up a managed transformer, which crunches through the data and saves predictions to S3. We can copy those results from S3 and view them. We can delete the endpoint when done. This is a first run through SageMaker with the main features. There are many more, but as you can see, it's a simple SDK, and you never worry about infrastructure. You focus on building and tweaking models and understanding what's going on. If you have no machine learning experience, you can get the job done. If you do have machine learning experience, you can save a ton of time by using SageMaker features and focusing on the problem at hand. Where do you get started?
Look at the free tier URL, where you can use SageMaker completely for free under certain conditions. Make sure you're within those conditions to learn about SageMaker for zero cents. If you want to know more about our machine learning services, visit ml.aws, where you'll find all the services and customer stories. The SageMaker page focuses on SageMaker features, customer stories, and use cases. The Python SDK on GitHub is the one I used. The Spark SDK is also available. The collection of SageMaker examples on GitHub has hundreds of notebooks showing built-in algorithms, frameworks, and configurations. This is the best way to understand how SageMaker works. You can also find some of my notebooks on GitLab and plenty of SageMaker and machine learning videos on YouTube, as well as my podcast. I also blog on Medium. I hope this was useful, and I hope you learned a few things. I hope you want to try SageMaker. Please feel free to get in touch via LinkedIn, Twitter, etc., if you have questions or feedback. Thanks again, and enjoy the rest of your day. Bye-bye.

Tags

AWS, Amazon SageMaker, Machine Learning Workflow

About the Author

Julien Simon is the Chief Evangelist at Arcee AI, specializing in Small Language Models and enterprise AI solutions. Recognized as the #1 AI Evangelist globally by AI Magazine in 2021, he brings over 30 years of technology leadership experience to his role.

With 650+ speaking engagements worldwide and 350+ technical blog posts, Julien is a leading voice in practical AI implementation, cost-effective AI solutions, and the democratization of artificial intelligence. His expertise spans open-source AI, Small Language Models, enterprise AI strategy, and edge computing optimization.

Previously serving as Principal Evangelist at Amazon Web Services and Chief Evangelist at Hugging Face, Julien has helped thousands of organizations implement AI solutions that deliver real business value. He is the author of "Learn Amazon SageMaker," the first book ever published on AWS's flagship machine learning service.

Julien's mission is to make AI accessible, understandable, and controllable for enterprises through transparent, open-weights models that organizations can deploy, customize, and trust.