Optimize the Architecture of your Platform by Julien Simon Technical Evangelist @ AWS
June 08, 2016
You want to launch your online platform, and from a technical perspective you're wondering where to start and how to optimize your architecture?
Cloud computing offers several advantages, such as scaling your app or your website whenever you need to. The hardest part is knowing where to begin!
During this 45-minute workshop, Julien Simon will share with you the best practices to scale your platform from 0 to millions of users. He will present:
How to efficiently combine the tools Amazon Web Services provides
How to set up the best architecture for your platform
How to scale your infrastructure in the cloud.
Before joining AWS, Julien worked as CTO of Viadeo and Aldebaran Robotics. He also spent more than 3 years as VP Engineering at Criteo. He is particularly interested in architecture, performance, deployment, scalability and data.
Slides are available here: http://www.slideshare.net/_TheFamily/how-to-optimize-the-architecture-of-your-platform-by-julien-simon
At TheFamily, we believe that anyone can become a great entrepreneur. Find more info here: http://www.thefamily.co/
Transcript
Thank you for the introduction. I did a lot of different things before joining AWS. I'm really excited today to talk about AWS. My name is Julien. I'm a technical evangelist for AWS, based in the Paris office, but I travel a lot. It's pretty cool to be in Paris and talk to French people. Most of you are French. I don't know why I'm doing this in English, but okay. It's a fashion thing.
If at any time during the presentation you want to ask a question, if there's something you don't understand, if my English is broken, or if your English is broken, please raise your hand and interrupt me; this is perfectly okay. And of course, I'll be around at the end of the presentation, so if you want to ask questions in French afterwards, that's fine too.
My goal for today is to make sure you leave this room with a good understanding of all the crazy stuff I'm going to talk about. So please, please ask questions, okay? I changed the title, so I figured, let's change it another time. That's my title for today: Scale Baby Scale. This is really what we're going to do.
Let's get started. You guys are building web applications, web platforms. You'll start in your garage, your basement, or your dad's basement, who knows? On day one, it's only going to be you. Just one user. So day one, user one, new project. You have this nice business idea that you want to grow into a platform. It's quite likely you'll start with a single EC2 instance, a single virtual machine. You'll put everything on it. You won't worry so much about scaling because that's not the point. The point is to get to your MVP or prototype. So you'll put everything on that instance: the web application, the database, all your tools, etc. You won't worry so much about the architecture. It's a single instance with an IP address that's accessible from the internet, hopefully, and with a domain name like mysupercoolstartup.com. My-billion-dollar-startup.com, actually. So day one, this is good.
Of course, you might launch a first version, a private beta, start building your community, and start adding users to that platform. As you get more traffic, the obvious choice, the gut reaction of everyone, is to say, "Oh, I need a bigger server." That's exactly how you've been trained and told to do things in engineering school or university. You need a bigger server, okay, fine. On AWS, it's very easy to have a bigger server. You stop the existing server, select a bigger one, and start it. It takes about a minute to do. So you'll scale up, right? Add a bigger server with a more powerful CPU, more RAM, etc. More I/O. Very natural way to do things. And that's good. We have a nice selection. You can go to crazy high instances, with 36 cores and hundreds of gigabytes of RAM.
Of course, this will end at some point. If your app is successful, at some point, you'll get more traffic than a single instance can handle. So you'll hit the wall. It's okay in the beginning, but don't wait until you hit the wall to consider a different option. You can still go up to maybe a few hundred, maybe a few thousand users with that solution. But on top of performance issues, you have other issues. You have no failover capabilities. If that instance dies, your application goes down, and that's a bad thing. So no redundancy, and you've got all your eggs in one basket. The web app, the database, all your tools, all on one instance. We need to fix that.
Hopefully, by the second day, you realize this, but it could be day 64. It takes a little more time for some people to understand. Let's say you're super clever and by the second day, you already realize this is not a highly available architecture. So first, you'll split the web part from the database part. That's good practice in any platform to separate the web tier from the database tier. You'll keep your web instance for the web app and then consider where your database is going to live. You have a number of options.
You could run your own database. You could start another EC2 instance and install MySQL, Postgres, or whatever database you want. You could install Oracle if you wanted to and bring your own license and manage it completely. That's the self-managed part. It's fine if you want to do that. You can also consider different options where you use managed services. You don't have to start a virtual machine and install a database; you can access a database service running in AWS. A very popular one is RDS, which stands for Relational Database Service. You can run it with MySQL, Postgres, MariaDB, SQL Server, Oracle, and the newcomer Aurora, which is highly scalable. RDS is very popular; we have more than 100,000 customers using RDS, so it's an important service for our customers. These are transactional databases, SQL-based, etc. If you want NoSQL at this stage of your application, it's unlikely you really need NoSQL because you don't have such large traffic. But okay, you could use DynamoDB, our managed NoSQL service. If you want analytics, data warehouse systems, which again, at this time in your startup, it's unlikely you need that. But maybe for later, you can use Redshift, a parallel, extremely efficient data warehouse service. Let's keep it simple. We're a small startup, building maybe a PHP application with a MySQL server. We'll stick with RDS and RDS with MySQL.
So we're growing slowly. We're picking up traffic. We have hundreds of users. We have a web instance and a separate RDS instance, which is a managed service. This is nice because you don't have to worry about backups or other things; this is part of the managed service. You can focus on building your application and growing your business. More traffic, building your community, and maybe you have a problem. Maybe your server went down, and users start to complain. You really need to start fixing that availability issue. So what we'll do here is start a second web instance, a second web server. We'll put a load balancer in front of it because it needs to be completely transparent to your users that you have multiple web servers. You'll start those instances in different availability zones. Availability zones are part of the AWS infrastructure that are completely independent from one another. Should one availability zone go down, that's a really bad event, but it could happen. Then all the infrastructure resources running in different availability zones are still running. These are backed by multiple data centers, different data centers that are apart from one another. If one AZ goes down, the others keep running. So you need to have instances, web instances, and database instances in different availability zones. This is the best way to build high availability in your application, and it's really easy to do.
Simple architecture: load balancing, a few web servers, and a couple of RDS instances, all running in different physical locations. If something bad happens, you're still up. This is the foundation of many applications and architectures. By just growing that, you could go extremely far. It doesn't need to be more complicated than that. Your business is growing, your mobile app is picking up, you're in the top five downloads of whatever app store you like best, and you're growing to thousands, maybe 100,000 users. You can still rely on that same architecture. It doesn't need to get really complicated. The only thing you would do is maybe add more web servers and more database instances. Just scale out instead of scaling up. Scaling out is the process of adding lots of small servers working in parallel, versus scaling up, which is trying to get a bigger server and a bigger server until there is no bigger server. This is the typical scale-out architecture used by thousands of companies. The only things you need here are load balancing, EC2 instances for your web servers, RDS for your database, and multiple availability zones to guarantee the availability of your application.
I'm a simple person; I like simple things because I understand them better, and it's easier to run, manage, and scale. You could go extremely fancy or extremely simple. Trust me, this will scale very, very high. But does it mean it cannot be improved? Of course not. Everything can be improved. Let's look at how to make this even more scalable and efficient. There's a slight problem with that architecture: we're serving everything from the website, including static content. If you're serving images, JavaScript, CSS, videos, it's a shame to go to the web server, the dynamic web application, to serve that content. The first thing we'll do is offload all the static content to CloudFront. CloudFront is our CDN, our content delivery network. The idea is to store this content in S3, the object store in AWS, where you put all your files, images, and videos. It's stored off-instance and served to your users through CloudFront. CloudFront has 53 or 54 locations, called edge locations, all across the world where users can connect and get their content.
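The offloading step described above comes down to storing each asset in S3 with the right content type and cache headers, so CloudFront and browsers can cache it. Here's a minimal sketch, not the talk's actual code: the helper and bucket name are hypothetical, and the S3 client is passed in (in practice it would be `boto3.client("s3")`) so the sketch can be exercised without AWS credentials.

```python
import mimetypes

def upload_static_asset(s3, bucket, key, body):
    """Upload one static asset to S3 so CloudFront can serve it.
    's3' is assumed to be a boto3 S3 client; put_object is its real API."""
    # Guess the Content-Type from the file name (e.g. logo.png -> image/png)
    content_type = mimetypes.guess_type(key)[0] or "application/octet-stream"
    s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=body,
        ContentType=content_type,
        CacheControl="public, max-age=86400",  # let the CDN cache for a day
    )
    return content_type
```

Setting `Cache-Control` is what lets the edge locations keep serving the asset without hitting your origin, which is the whole point of offloading static content.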
So, if you're hosting your platform in Ireland, your web server and database will run on our servers in Ireland. But if you have a user in Marseille and we have a CloudFront point of presence in Marseille, all the static content will be served from Marseille. The heavy stuff, like videos and pictures, will be downloaded locally from Marseille, with much better performance, while the dynamic stuff will be served from Ireland. This is a nice improvement. We're offloading all that stuff to CloudFront, and potentially, we can use smaller web instances, which are less expensive.
The second thing we want to offload is all the session stuff, all the state stuff you need to manage in a web application, like user data and anything you need to store during a user session. Instances come and go, so you don't want too much local storage on those instances. It's a good idea to offload this to DynamoDB, where you can store session data, cookie data, and anything you need to manage. This gives you performance improvements and peace of mind that if your instance dies, you don't lose any data because it's stored off-instance.
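Offloading session state to DynamoDB might look like the sketch below. This is an illustration under assumptions, not the talk's code: the function, table and attribute names are hypothetical, and the table object is injected (in Lambda or on EC2 it would be `boto3.resource("dynamodb").Table("sessions")`). The `expires_at` attribute pairs with DynamoDB's TTL feature so stale sessions expire on their own.

```python
import json
import time
import uuid

def save_session(table, user_id, data, ttl_seconds=3600):
    """Store session state off-instance in a DynamoDB table.
    'table' is assumed to be a boto3 DynamoDB Table resource;
    put_item is its real API."""
    item = {
        "session_id": str(uuid.uuid4()),          # partition key
        "user_id": user_id,
        "data": json.dumps(data),                 # arbitrary session payload
        "expires_at": int(time.time()) + ttl_seconds,  # for DynamoDB TTL
    }
    table.put_item(Item=item)
    return item["session_id"]
```

Because the session lives in DynamoDB rather than on the instance, any web server behind the load balancer can pick up the request, and losing an instance loses no data.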
Once we've done that, we've made our platform much more efficient and probably a little safer. You can go even one step further because CloudFront can manage some of your dynamic content. You could put CloudFront directly in front of your load balancer, and in CloudFront, manage some of the dynamic content as well. The idea is to offload as much traffic as possible from the web instances. Anything you can take care of even before the web servers is a good idea.
This is a better platform. Now, let's say we still have more traffic, and we're getting tired of looking at instances, databases, and managing that. We need something to take care of it automatically, and this is called auto scaling. Auto scaling adds and potentially removes resources on demand based on some metrics. You could monitor CPU load, memory usage, network traffic, and say, "Whenever CPU load on my web server exceeds 75%, I need more instances. If my CPU load goes below 20%, I can remove some web servers." Auto scaling is very easy to deploy and used by thousands of companies. By using auto scaling with the same architecture, you can scale to hundreds of thousands of users. The only difference is the dotted line around the instances, representing auto scaling groups. You define metrics and policies, and it just works automatically. You don't need to worry about adapting your infrastructure to your traffic. It will go up when needed and go lower when needed. If you have less traffic at night, you can reduce your infrastructure spend as well.
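The scaling decision described above can be reduced to a tiny function. This is only an illustration of the idea behind a target-tracking policy, not the Auto Scaling service's exact algorithm, and all the numbers are examples: scale the fleet proportionally so the average CPU load returns to the target, within the group's minimum and maximum size.

```python
import math

def desired_capacity(current, cpu_percent, target=50.0, min_size=2, max_size=20):
    """Illustrative target-tracking math: if average CPU is double the target,
    roughly double the fleet; clamp to the auto scaling group's bounds."""
    desired = math.ceil(current * cpu_percent / target)
    return max(min_size, min(max_size, desired))
```

For example, 4 instances averaging 100% CPU against a 50% target would grow to 8, while a nearly idle fleet shrinks back down to the group's minimum.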
Auto scaling works in the background, and you don't have to worry about it. Huge companies run with this. For example, WOW air, an airline based in Iceland, runs its whole website and booking system on EC2 and RDS, and they scale like crazy. This is a very good architecture, easy to understand, and easy to scale.
We could stop here, but the world is changing a little bit, so we need to look at new technologies. The second part of my talk is about this: "No server is easier to manage than no server." This means if you don't have any servers, life is better. This is a quote from Werner Vogels, the CTO of Amazon.com. He will be at the AWS Summit in Paris on May 31st, where he will deliver the keynote. It's always a good time listening to him, so you should show up.
How could we have no servers? For 15-20 minutes, I've been talking about EC2 servers and RDS instances. How could you have no servers? We use the force. Getting rid of servers means getting rid of EC2, getting rid of virtual machines, and any service that still uses instances, like RDS. We'll try to use only managed services that are fully distributed and completely hide the underlying infrastructure. You just use the service. Of course, I'll talk about AWS Lambda. If you add those two together, you have a serverless architecture.
This might still make no sense to most of you, but I'll show you what some customers are doing. Localytics is an analytics company with decent traffic, 100 billion events per month. They get traffic from the internet, send it to SQS, a message queue, and then process the data, pushing it to Kinesis, a streaming service that fits real-time systems well. Messages are ordered and persisted for up to a week. Lambda functions pull data from Kinesis and invoke microservices. You could build a message queue yourself, but using a managed service that scales automatically and is highly available makes life easier.
Another example is Nordstrom, a US retailer. They wanted to build a recommendation engine. Previously, it took about 20 minutes to deliver a recommendation, which is impractical for an e-commerce website. They built a system using Kinesis, processed by Lambda, pushed to DynamoDB, and then displayed on their website. This is a real-time, highly scalable solution they built themselves using AWS managed services.
AdRoll is an ad tech company with 60 billion events every day. From day one, they decided to use only managed services and Lambda to build and scale their platform. They use Kinesis, Lambda, and DynamoDB, the basis of serverless platforms. They don't have to worry about a single server, which means no monitoring, installing, securing, or fixing. They're closing all their data centers and moving everything to AWS.
Let's look at a real example. I've created an API, a REST API called /prod/logger, to log some events. It goes through the API Gateway, a fully managed service, which calls a Lambda function to write into DynamoDB. A trigger in DynamoDB calls a second Lambda function, which pushes the content to Kinesis Firehose, and finally, it's pushed to S3, ready for consumption by a Hadoop job or anything else.
I'll show you a simple test program: it calls the API with curl a thousand times, sending a JSON document each time. Within a few seconds or a minute, I should see the data showing up in S3. I'll delete the old files and show you the result.
So, I'm calling an API and this should go into the pipeline. I'll show you the steps afterwards. I'm waiting for those files to be delivered as files here. It should take about a minute. There it goes. That's the first example. I can open that file. I get my events, so it's a simple application, calling an API and going through some steps to get the data logged into my S3 object store.
How would you do this in the normal world, not using cloud services? With classical infrastructure, you might log to syslog or log locally and push logs to a central place. That's fine for one server, but what about 200 servers or one million hits per second? How do you get all that data under a minute from the web server to a central location for analytics? One hit per second is easy, but multiply that by 10, 100, and 1000, and it gets tricky.
This is working, and I get a file every minute with my content. How does it work? How many servers? Let's look at the slide again. How many lines of code? Let me show you the different steps.
API Gateway: You can do this in a few clicks or use the CLI. It's a REST API. I created a logger resource and a POST method. My data goes through the API Gateway and to my Lambda function automatically. Lines of code: zero.
Here's my Lambda function. Lambda is about deploying code. When you use EC2, you deploy servers. With Elastic Beanstalk, you deploy packages. With containers, you deploy containers. With Lambda, you deploy functions. This is my function, written in Python. It adds a timestamp to the JSON document and stores the event in my DynamoDB table. Lines of code: three to four lines.
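The first Lambda function might look like the sketch below. This is a reconstruction under assumptions, not the actual code from the demo: the event attribute and table name are hypothetical, and the table is a parameter here (in Lambda it would be created at module load with `boto3.resource("dynamodb").Table("events")`) so the sketch runs without AWS access.

```python
import time

def lambda_handler(event, context, table=None):
    """First Lambda in the pipeline: stamp the incoming JSON document
    and persist it to DynamoDB. 'table' is assumed to be a boto3
    DynamoDB Table resource; put_item is its real API."""
    item = dict(event)                    # copy the posted JSON document
    item["ts"] = int(time.time() * 1000)  # add a server-side timestamp (ms)
    if table is not None:
        table.put_item(Item=item)
    return item
```

API Gateway passes the POST body in as `event`, so the function itself really does stay at a handful of lines.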
For DynamoDB, you create a table and give scaling information for reads and writes per second. You use the Python SDK to write to it. No need to worry about installing, managing, or patching a database. Security is managed through IAM, Identity and Access Management, where you explicitly allow operations.
Next, I have a trigger. Anytime something is written to DynamoDB, the trigger calls another Lambda function. Lines of code for the trigger: zero. The second Lambda function, also in Python, batches the events and sends them to Firehose. Lines of code: four to five lines.
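The second function might be sketched as follows, again under assumptions: the delivery stream name is hypothetical, and the Firehose client is injected (in Lambda it would be `boto3.client("firehose")`). The event shape is the real DynamoDB Streams format, where each new item arrives as a `NewImage` in the record.

```python
import json

def stream_handler(event, context, firehose=None):
    """Second Lambda: triggered by the DynamoDB stream, it batches the
    new items and forwards them to Kinesis Firehose. put_record_batch
    is the real boto3 Firehose API."""
    records = []
    for r in event.get("Records", []):
        if r.get("eventName") != "INSERT":
            continue                              # only forward new items
        image = r["dynamodb"]["NewImage"]         # DynamoDB Streams format
        records.append({"Data": (json.dumps(image) + "\n").encode()})
    if firehose is not None and records:
        firehose.put_record_batch(DeliveryStreamName="events-to-s3",
                                  Records=records)
    return len(records)
```

Batching matters here: the stream hands the function several records at once, and Firehose accepts them in a single call, which keeps the per-event overhead low.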
Firehose is a scalable pipe where you put data and write it to different places within AWS. I'm writing to S3. The only configuration is the bucket, buffer size, and flush interval. One megabyte or 60 seconds, whichever comes first. The data can be compressed and encrypted automatically.
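The "one megabyte or 60 seconds, whichever comes first" buffering rule is simple enough to write down. This sketch is just an illustration of what Firehose does internally, with example thresholds matching the configuration above:

```python
def should_flush(buffered_bytes, seconds_since_flush,
                 max_bytes=1_000_000, max_seconds=60):
    """Firehose's buffering rule in miniature: deliver the buffer to S3
    when either the size limit or the time limit is reached."""
    return buffered_bytes >= max_bytes or seconds_since_flush >= max_seconds
```

That's why a low-traffic pipeline still produces a file roughly every minute, while a busy one produces files as fast as the buffer fills.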
Monitoring shows the data being pushed and stored. We're at the end of the pipe. Refreshing this, I see more files, one every minute. How many servers for Firehose? Zero. How many lines of code? Zero.
So, let's sum that up. How many servers total? None. How many lines of code total? About 10 lines. This is a simple, scalable, and highly available solution built with AWS managed services. Provided you have the right limits, all of them are managed by us, and we guarantee scalability. If you need a thousand Lambda functions running in parallel to store traffic into DynamoDB, it's all transparent. Another benefit of serverless infrastructure is transparent scalability. You should never go down and never have scalability problems. If there is one, it's probably our fault, and we'll fix it.
This is a cool architecture, and you can go very high with it. Some people might say it's hard to build an entire application this way, and they might be right. However, some pieces of your application could fit well here. For example, if you have a mobile app and need to build REST APIs, this is a dream way to do it. For logging or moving data from A to B in a scalable fashion, it's a good approach. The limit is probably how much you're ready to spend. Ten million users? Why not 100 million? This tweet from one of the founders of Supercell is cool: on March 7th, they had 100 million active users for the first time. They manage 45 billion events every day, many terabytes, and rely heavily on Kinesis, EMR, DynamoDB, S3, and Glacier. They don't worry about infrastructure, only about users, quality of service, and gamer experience. AWS takes care of the plumbing. Cheers to Supercell.
Who is using Airbnb? It's okay to admit it. Who's using Tinder? That guy is looking for a girlfriend. Airbnb is a fantastic success. The ops team is only five people, maybe six now, but not 50. They love their weekends and holidays. They'd rather go on Airbnb and rent great places than spend weekends fixing infrastructure. If five people can run Airbnb's infrastructure, what does that tell you about AWS? This is almost the end of my talk, so now it's your turn. If you're thinking about creating a startup, go for it. TheFamily will help you. Create an account on AWS and start using the services. It will cost approximately zero for a long time because many services have a free tier for the first 12 months. You pay only for usage, and if you shut everything down, you stop paying. Don't believe me? Go and play with the services and see what you can build. If you're a developer, you should really look at Lambda. A book on Lambda is coming out, and you can get an early release on the Manning website. It's a great resource.
We'll be at Devoxx next week. Anyone coming to Devoxx? We'll be at dotScale too. More importantly, the AWS Summit is on May 31st in Paris. It's free, with tech and business sessions. If you're a beginner, please show up. If you already know AWS, we have deep dives and cool serverless and IoT sessions. For training, consider the AWSome Day, a free one-day event in Paris every quarter. It covers all the main AWS services. If you want a good understanding of AWS, don't miss it. We also have user groups in Paris, Lille, Rennes, Nantes, Bordeaux, Lyon, and Montpellier. Join the Paris group or the others on meetup.com. We have monthly sessions and a growing community on Facebook. Follow the official AWS Twitter account.
Thank you for listening. You have my email and Twitter account. If you have questions, feel free to ask. Grab the stickers and business cards. Thanks to TheFamily for inviting me. It's always a pleasure to be here.
What are the metrics that we have to monitor when running a web server or a DB server? It's application-dependent. For a general-purpose web application, CPU load is usually the primary metric. If your application is not CPU-intensive, consider network, disk, or I/O load. Start with one metric, like CPU usage. If it stays above 50% for a while, consider scaling. Don't wait for 90% CPU usage, as that indicates peaks and latency. Scale early to avoid losing users due to a slow website. External monitoring is also crucial. Use services like Catchpoint or New Relic to measure the real user experience.
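The "above 50% for a while" rule of thumb can be sketched as a small check over recent CPU samples. This is an illustrative helper, not an AWS API; CloudWatch alarms express the same idea with an evaluation period and a threshold.

```python
def needs_scale_out(cpu_samples, threshold=50.0, sustained=5):
    """Scale early: trigger when CPU has stayed above the threshold for
    the last 'sustained' samples, instead of waiting for 90% peaks."""
    recent = cpu_samples[-sustained:]
    return len(recent) == sustained and all(s > threshold for s in recent)
```

Requiring several consecutive samples avoids reacting to a single spike, while keeping the threshold well below saturation means new instances are ready before users feel the slowdown.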
Start with the smallest instances possible, like T2 micro, especially in the free tier. If your application runs well on T2 micro, there's no need to change. Look at how much time it takes to process a single request and the total throughput. If it takes too long, optimize your code first. If you still have issues, consider a larger instance.
Lambda is a function-as-a-service. You deploy a stateless function that runs in parallel. It's great for microservices, breaking your application into small, independent processing steps. For parts of your platform that require a lot of parallel execution, try building them as Lambda functions.
If you're starting a platform, start on AWS. It's cheap, and you can try various services. For parts that are small processing steps, consider Lambda. For more complex logic, use EC2 or RDS. Try the services and see what works for you.
Life is not always parallel. If you have dependencies between Lambda functions, you can orchestrate them. Lambda A can call Lambda B, or push a message to Kinesis for Lambda B to process. Keep it simple and decide how to link the services.
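The first chaining option, Lambda A invoking Lambda B directly, might look like this sketch. The function name is hypothetical, and the Lambda client is injected (it would be `boto3.client("lambda")` in practice); `invoke` with `InvocationType="Event"` is the real boto3 call for an asynchronous, fire-and-forget invocation.

```python
import json

def call_next_step(lambda_client, payload, function_name="lambda-b"):
    """Chain Lambda functions: A hands its result to B asynchronously.
    The alternative from the talk is to publish to a Kinesis stream
    that triggers B, which decouples the two functions further."""
    lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="Event",   # async: returns without waiting for B
        Payload=json.dumps(payload).encode(),
    )
```

With `"Event"`, A doesn't wait for B to finish; use `"RequestResponse"` instead if A needs B's result synchronously.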
Lambda is a great way to build microservices, especially with the API Gateway. Microservices are about breaking the monolith into smaller, independently scalable pieces.
Route 53 is our DNS management system. You can host your domains within AWS, and it's highly available and scalable. It supports routing policies like round-robin and latency-based routing. If you have infrastructure in the US and Europe, Route 53 can route users to the nearest region. Look into Route 53 for domain management and routing.
I'm still around for a little while. Grab the stickers and ask more questions if you have them. Thanks again.
Julien Simon is the Chief Evangelist at Arcee AI, specializing in Small Language Models and enterprise AI solutions. Recognized as the #1 AI Evangelist globally by AI Magazine in 2021, he brings over 30 years of technology leadership experience to his role.
With 650+ speaking engagements worldwide and 350+ technical blog posts, Julien is a leading voice in practical AI implementation, cost-effective AI solutions, and the democratization of artificial intelligence. His expertise spans open-source AI, Small Language Models, enterprise AI strategy, and edge computing optimization.
Previously serving as Principal Evangelist at Amazon Web Services and Chief Evangelist at Hugging Face, Julien has helped thousands of organizations implement AI solutions that deliver real business value. He is the author of "Learn Amazon SageMaker," the first book ever published on AWS's flagship machine learning service.
Julien's mission is to make AI accessible, understandable, and controllable for enterprises through transparent, open-weights models that organizations can deploy, customize, and trust.