Running BSD on AWS

Transcript

All right. Good afternoon, everyone. I'm Nicolas, and this is Julien from Arcee, just so you won't mix us up. I'm a technical trainer based in Paris, delivering trainings for AWS all over EMEA. Julien is our beloved technical evangelist for the French region, but he's not really in France; he's all over the world. Today, we're going to discuss running BSD, mostly FreeBSD and a little bit of OpenBSD, on AWS. We thought we'd have some fun with these two operating systems. So, this is a little bit about us. I'll start with myself and then hand you the mic. I discovered OpenBSD in the early 2000s, and the Spark station you can see here is the first machine I installed OpenBSD on. This was a long time ago. Shortly after that, I was one of the few French people learning about OpenBSD, so I started the OpenBSD France community, which I ended earlier this year due to lack of time. I've been learning Unix in general, starting from an OpenBSD background. Hi, I'm Julien. I'm an older guy, so I want to tell you when I started with open source. You can guess by the age of my CD-ROM collection. Most of you were probably not even born then, which is a terrible thought. Back in 1996, I think, I translated a book, which is probably in your library, the French version of Kirk McCusick's book on BSD. I know he was here a few days ago, and it's a tragedy that I couldn't meet him because I was traveling. So, hi Kirk. Me again. All right. And... No shit. It's this guy. Stick around, all right? It's the first time we meet. It's a... Wow. I know I cannot do this presentation anymore. I feel so worthless. Oh man. You should be filming this guy. Yeah. I didn't bring it. But that's okay. You made my day. He's a legend. He's a proper legend. All right. So now we should be really good, right? Yeah. Okay. Here's the agenda for now. We're going to talk about the AWS infrastructure briefly, just to show you what kind of architecture we have. Then, Nicolas will talk about instances, virtual machines, and operating systems. We'll start some benchmarks because it's important to see how fast things are running. We'll spend some time discussing how to build Amazon Machine Images (AMIs) with BSD, which is an interesting process. We'll talk about automation quite a bit. Then, we'll look at the results of our benchmarks, which hopefully will be complete. We'll take your Q&A, and we decided to take your questions during the session to make it more interactive. If there's anything that doesn't make sense, anything you disagree with, or if you want to throw stuff at us, that's okay. We can take it. We've seen worse. A word about infrastructure. As you probably know, our infrastructure is spread across 16 regions worldwide. These regions are broken into availability zones, which are infrastructure partitions close enough to allow for replication, etc., but far enough so that if one of them explodes or if a volcano erupts in Dublin, the others should keep running. We have a lot of edge locations, and this number is probably already false. These are the locations for the CloudFront service, our content delivery network, which spans the globe. A region is a number of data centers, and when I say "number," it's more than one. Some competitors call regions one data center, but for us, it's much more because we think redundancy and high availability (HA) are really important. They are fully connected with a very low-latency network, usually less than one millisecond, which allows us to replicate data, storage, databases, etc., even synchronously. We're going to open a region in France this year, between now and December 31st, 11:59 PM. People keep asking me for the exact day on Twitter, but it's this year. If you want to know more about this, there's a brilliant presentation from re:Invent by James Hamilton, one of our infrastructure gurus. I cannot recommend it highly enough. It's legendary too. Inside an availability zone (AZ), we have multiple data centers. Let's talk about the French AZs that are coming. Each AZ will have at least one data center, but it's always more. The largest AZs could have six or seven in the US, which are older and at a higher scale. Each AZ will have multiple data centers. They are close enough for synchronous operations but distant enough so that if one is broken or has a power failure, fire, or disaster, the others should keep running. Network latency is very low, much less than a millisecond. This shows that we have multiple levels of redundancy. We could lose a data center and not lose an AZ, lose an AZ and not lose a region, and, God forbid, lose a region. If you have a multi-region design, you would probably be okay. I guess we'll find out when we lose a region, right? Not soon, I hope. Inside the data center, we have racks and servers, which is not really original, except it's all custom. If you want to know more about this, please look at James Hamilton's keynote. It goes into some detail about the servers, routing, and network equipment. It's all custom, 100%. We decided not to have very large data centers. We could have more than 50,000 servers in a data center, but if they get too big, the impact of a failure is terrible. So, we stick to mid-sized data centers and build many of them next to one another. Okay? Makes sense? All right, Mr. David, your turn. The mic and the remote. Let's talk about instances, those virtual machines in the cloud, and operating systems. I'll start by saying we call them EC2. There will never be any EC3 or EC1. EC2 stands for Elastic Cloud Compute. These machines are virtual machines based on the Xen hypervisor for now. In the future, some things will evolve. We signed a partnership with VMware last year, and things have grown really fast. This hypervisor is now available in one of the regions in the US, US East 1, around Washington. To run these machines, you need hardware and software. This is where it gets very interesting. You have multiple places to grab an AMI, an Amazon Machine Image, which is an OS template containing your OS plus more stuff. By default, AWS gives you access to AMIs for Windows, Linux, and some BSDs, like FreeBSD. We're hoping to have OpenBSD soon, very soon. These AMIs are pretty basic, with the latest kernel, latest patches, and all that stuff. But you might want something more custom, like adding your own layer of security tools, a custom kernel, etc. So, you might want to create your own AMI. This is one of the things we're attempting to do on OpenBSD using some of the stuff Antoine Jacuto has been doing. You can create your own AMI, put all your stuff in there, and eventually share it with the world, which becomes really interesting when you want to distribute your software. In the past, it used to be floppies, CDs, DVDs, downloads, and now you can run your own server with the application already pre-installed, so no one can tell you, "Well, you've done it wrong." The software editor or the person granting access to this AMI ensures that the AMI has the right configuration and tools, so when you boot your instance, everything is running fine. You might want to make some money with this AMI at some point. For that, we have the Marketplace, which allows you to sell, by the hour, the license to your AMI or the software installed on the AMI. You can pay a little per hour or per agreement, with the one-year term being the most common. There are four places to get your AMI. Inside the AMI, you have the software and the OS. You want to instantiate that on some hardware. We have a combination of hardware options, which you'll see in the next slides. Until now, you had to pay by the hour for these resources: compute, storage, and network. Starting October 2nd, you will pay on a per-second basis, which is very interesting because most of the stuff you run doesn't require an entire hour. Maybe you need five minutes and don't want to pay for an entire hour, so you'll pay for one minute of boot time and the rest for the running time of your instance. This is coming up on October 2nd and is only for Linux. The Windows license might have some things to consider. You might think, "This VM is cheaper elsewhere." It's not really the case. You might want to compare apples to oranges. We have a broad geographical coverage and a bunch of services, an ecosystem of services running with those EC2 instances. There are about 100 services now, from basic stuff like compute, storage, and network to IoT, machine learning, elastic search, and more. Inside the region, as Julien pointed out, we also have multiple availability zones allowing you to have high availability for your workloads and synchronize your data between these different availability zones. Let's look at the instance types. The naming scheme is pretty easy: the family, the generation, and the size, just like t-shirts. We made it simple, except sometimes the size goes really big. Like this one. You might want to buy sheets instead of a shirt that size. Toga party. We have GPUs in some instance families, like the G3 and P2 families. The P2 is quite exceptional, with 16 GPUs, NVIDIA Quadros, and about 12 gigs of RAM on each GPU. This gives you a lot of possibilities for computation. If you're looking at memory-optimized instances, the R family or the X family, which is the biggest we have so far, has 128 cores and 4 terabytes of RAM with a 25 gig network interface. We're extending it soon to 16 terabytes of RAM. If you're running in-memory databases or caches, this is probably one of the sweetest instances to run your stuff on. The smallest instance has one core and half a gig of RAM, and the biggest has 128 cores and 4 terabytes of RAM. You will have to choose something that fits your needs. One of the things we'll highlight with Julien is that the size doesn't matter most of the time. The biggest instance size all the way to the most broad family, the T2 family, has Intel Xeon CPUs. On these CPUs, you can tap into a lot of things like instruction sets, P-states, and C-states control to tweak your application. Today, Julien will be performing a few things on the i3 family, the C4 family, and the X1 family. The i3 family uses Intel E7 Haswell processors with a 25 gig network interface. The C4 instance uses Haswell processors at 2.9 gigahertz instead of the usual 2.3 maximum. This means we have custom CPUs. Intel is one of our partners, and we buy a lot of CPUs from them. At some point, we wanted to make a market differentiator. We wanted the same CPU that everyone else can have but at a higher frequency. This is about 30% more performance than regular CPUs and is only available on AWS. The X1 is a multiprocessor architecture with a NUMA architecture. Each socket has dedicated memory, which is how you get to 4TB and such large amounts. If the OS supports it, we can migrate frequently accessed pages to the closest CPU. If CPU1 is accessing memory from CPU0 quite a lot, we can migrate that stuff to the closest CPU, but that requires OS support. Thank you, Julien. The last family I want to talk about is the i3 family. i3 stands for I/O, and we mean a ton of I/O with this family of instances. This is the newest generation of instances using NVMe storage, which is quite fast, and you can get up to 3.3 million IOPS. This is incredibly fast. The same 25 gig elastic network interface (ENI) is available on this instance, with half a terabyte of RAM and 64 cores. Again, custom CPU for this one as well. This machine promises a lot of performance. We have storage, compute, and memory, and we need to know about the storage. This is where we'll store or instantiate our AMI. We have two main families of storage: classical standard magnetic hard drives and SSD drives. For EBS volumes, we call them Elastic Block Store (EBS). We have an advantage, for example, with magnetic hard drives, where we focus on throughput. 250 megs per second on cold drives and 500 megs per second on throughput-optimized drives. For SSD drives, we have two types: general purpose and provisioned IOPS. General purpose can burst up to 10,000 IOPS, and you can merge or RAID one, two, three, or four of them to go up to 40,000 IOPS. For databases requiring a lot of IOPS, we have provisioned IOPS, which deliver a constant 20,000 IOPS. You can RAID two or more to get up to 40,000 IOPS. This is probably going to be RAID 0 at a minimum. Behind each block we present for these types of storage are multiple physical blocks, so if we lose a hard drive, you don't lose your data. One question I often get is how we destroy hard drives. Do we have a special process? We do have an actual process from the Department of Defense in the US. It's a three-step process. First, demagnetize. Second, drill holes at regular intervals. Third, shred them. This way, you can grow new hard drives or separate metal parts from the rest. EBS storage is something you have to pay for, but there's another option that is free. We talked about the 3.3 million IOPS for the i3 family. These are the hard drives attached to the hypervisor. For the i3 family, you get 3.3 million IOPS and up to 15.2 terabytes of storage, and these hard drives are free. They are really performant. However, the storage is not persistent. When you start your instance, it starts on a hypervisor. If you stop and start your instance again, you have more chances of winning the lottery than running the instance on the same hypervisor. For security and privacy, we won't copy data from one hypervisor to another. So, you will lose this data. However, if you can work with that, knowing this storage can be used for temporary files, transformations, videos, etc., it's really fast. One of our customers, Netflix, uses more than 100,000 instances on AWS and no longer uses EBS drives because of the costs and performance compared to instance storage. Instance storage is really fast and free, but you have to work with that. This is Julien's part. Thanks for all this information on instances. They have very different specs, and we want to know how fast they are on real-life workloads. Benchmarking is awesome, and synthetic benchmarks can be useful up to a point, but at the end of the day, you want to run a real workload and see what happens. So, I've picked the largest C4, X1, and i3 instances. Here are the specs and the setup. We're going to build a world on FreeBSD and see what happens. I'm using 11.1 release, which is the AMI available in the marketplace. I think it's faster with the latest versions, but the AMI is still 11.1. I wanted to run a test that anyone can replay in five minutes. The C4 has a bit of memory, a few cores, network storage, and I'm using provisioned IOPS for a reliable IO level. I'm using UFS as the file system. The X1 has a ton of memory, a ton of cores, and instant store, so I have about 4 terabytes of local SSD to build on. For the i3, I have quite a bit of memory, a few cores, and the new generation of SSD called NVMe. Since I have eight volumes, I figured I might give ZFS a try. I'm madly in love with ZFS, so that's my excuse for trying it. We're going to build and see what happens. I also ran these numbers with a RAM disk, and we'll see what happens there. We'll talk about the hourly price of each instance. Just a few more details before we do this. I'm building on these three instance types. For the X1, I have two local SSDs, so I'm using one for USRSRC and one for usr.obj. I'm going to use all available cores to build. On the C4, I'm putting both directories on the same network volume with 10k IOPS. I'm using all cores. For the i3, I'm creating two ZFS pools for src and obj and using all cores. Let's do the setup quickly, launch the benchmarks, and then Nicolas will explain how to build AMIs. Hold on one second, Julien. I want to see your terminal. It's all black. Probably so. No, not the region. Ah, there we go. This is my EC2 console. As you can see, I'm running my three instances. I'm going to SSH to each of them, which should already be the case. Here's my C4, here's my X1, and here's my i3. In the interest of time, I have all the commands ready. For the C4, I'm just extracting sources and building world on 36 cores. I'm just making sure this is the right instance. Yeah, that should be quite fast. On the X1, I can see my two instant store volumes here, XBD1 and XBD2. I'm just going to format them, mount them, and start building. That should be fast. That's the X1, right? Okay. And yeah, go C4. Alright. And now I can do pretty much the same thing. Extract sources and build. Maybe you want to see that actually happening. Well, of course. So, C4 is starting to build. On the i3, I can see my eight NVMe volumes. I'm going to quickly build my pools. Okay. Alright, so X1 is building. Okay, and same thing here. Okay, so I can see my two pools. We're ready to go. Extract the sources and build. This is going to run for a few minutes. I just want to make sure this one is actually starting before handing the mic back. Yeah, so we're on our way. And this one, I guess... Oh yeah. Okay, this one is starting too, right? Okay, so all three instances are building. Let's show you these specs once again so you can decide which one you think is going to be fastest. Don't lie to yourself. Pick one and don't change your mind. Who's going for the C4? Just by raising hands. Yeah, come on. It's going to be closer than you think anyway. So, okay. Who goes for X1? All right. Who goes for the i3? Okay. So, you guys don't believe too much in the C4, right? Okay. And the rest is pretty much split. Most people are split between X1 and i3. Okay. But the brave soul, I said the C4 would win. Okay. There's always a brave soul. Okay. So that's fine. We need brave souls in this silly world. All right. So it's going, right? Okay. So let's switch back to slides. We can check the results at the end, and Nicolas will go into the process of building OpenBSD AMIs. I'm a FreeBSD guy, but come on. We're brothers, right? So, your turn. All right. Thank you, Julien. While Julien's build world is going on for all those instances, we'll talk about building BSD AMIs. Laurent in the back of the room, raise your hand. He had a presentation yesterday on how to build stuff with Consul and OpenBSD. Part of what you use could probably be taken care of from the marketplace. There's a lot available there. Today, I want to talk about using a few tools other than just the console, maybe in complement, like Packer, which is from the same company. Maybe some other tools like the CLI, which I really like about AWS, or eventually Aminator for other OSs to build your own AMI. There's one template we've shared to build your own CI-CD pipeline. The idea behind what I want to show you is how we can bring some of the stuff we do manually on OpenBSD up to CI-CD and speed up some processes, check more stuff, and maybe use managed services. For those who don't know, managed services are the same as what you're doing with your hands but with no hands. They are usually cheaper, as secure if not more secure, and have a pay-as-you-go model. If you consume it, you pay for it, like turning on the light in a room. If you turn off the light, you don't pay for it anymore. This is the idea behind most managed services on AWS. We're going to build an OpenBSD AMI factory. We'll have a host running OpenBSD with about 12 gigs available for the AMI we're going to create and about 4 gigs of temporary files. We'll also use the Create AMI script from Antoine, which brings everything together on a local file system and then pushes it to a storage service on AWS called S3. S3 stands for Simple Storage Service, one of the oldest AWS services, about 12 or 13 years old. In the Ireland region, it's one of the coolest services I use, costing about 2.2 cents per gig. It can also trigger notifications upon the arrival of a file. If something happens, I can use this notification to run some code, maybe modify my infrastructure, just as Laurent showed in his previous presentation. Here's what I want to do. I want to commit my code and then trigger a service called Lambda. Lambda is a container-managed service that runs your code in Java, JavaScript, Python, or C-sharp upon notification. It runs between 100 milliseconds and five minutes, and the first million executions of Lambda are free forever. The second million costs 20 cents. So, I just need to run this for maybe a split second to notify my OpenBSD host to create a new AMI that includes the code I committed and then notify me that the AMI is ready. Maybe you can use it with something called CodePipeline. CodePipeline, similar to Jenkins, is a managed way and very API, AWS, and developer services oriented. Once we have this notification, CodePipeline can trigger other services, like CloudFormation. CloudFormation is one of my favorite services, just like Lambda and S3. CloudFormation allows you to describe your infrastructure using either JSON or YAML. It then takes all this information, scans it, identifies the resources needed to be created first, like network and security, and then the resources that can be created in parallel. You can create your entire infrastructure really fast. This can be used for many things, like deploying your application in UAT or your DR with all that stuff. The cool thing about CloudFormation is that every time you create something, it creates a stack that can be updated with different behaviors or, the main advantage, creates idempotent stuff. The very same thing all the time, which humans are not as good at as services. Once this is done, CloudFormation will deploy the application in UAT. You can then run security and compliance tests with a service called Inspector, a managed version of Nessus. We have a different set of tests or books of tests, some of which are quite interesting, like PCI DSS compliant. You might want to do load tests with a tool called Bees with Machine Guns from News Corp. You might want to test load and security at the same time to see if your application behaves the same way under full load or more than expected load. You might want to test features to see if they still work and if manual intervention is needed or can be automated. With this, you have some results and can feed that back to the developer, saying, "Hey, you missed out on something here" or "The percentage of comments versus the percentage of code is not good enough." Things will then move from UAT to production. The goal is to use blue-green deployment methods, with the blue being the existing environment and the green being the new environment, to switch from one to another without interrupting the customer's experience. I want my stuff to work. So, let's do some demos. Did I do enough chickens today so that my demos will run well? Yeah, it's going to work. Okay, so let's take this one. Move on to here and SSH to my OpenBSD host. As you can see, this is 6.1. A little bit of stuff going on here. I haven't cleaned my stuff since this morning. Actually, since an hour ago. It's on time. Just in time delivery. I could do this automatically, but I want you guys to see how things are working. Do you have this open in Atom already? No, this is yours. There we go. So, there's a lot of information we want you to see. This is the stuff I will be loading. I'll be exporting some stuff and adding some stuff. I'm showing my keys. I don't like that. There we go. I'm going to set a mirror for Ireland as my machine is running in the Ireland region. Then I'm going to add some cool stuff, clone my repository, make some modifications, and then generate the AMI. Let's cut and paste. There we go. I'm cloning some stuff, getting the script, and creating the AMI. Creating the storage and all the stuff I want. Once this is done, I will notify the rest of my applications in the pipeline via a service called SNS for Simple Notification Service. It can send a lot of types of notifications. The one I'm going to use signifies the end of a task to Lambda so that Lambda can run with the rest of it. As we're building, this is something none of you have seen before, except my keys. As I'm building, this is one of the things I want to draw your attention to. This process is working, but it takes a little bit of time. Some of the tools we're using are not up to date, and some of the drivers might not be giving their best. This is one of the things we might want you to help with. Help us make it better on AWS. There's a lot that can be done. We have a slide later on for FreeBSD as well as some of the stuff we need help with. There are a lot of people running NetBSD in Australia on AWS, and in the US and Europe, in Canada, we're starting to see a lot of OpenBSD. Thanks to Antoine and Rake. Yes, that helped us. All right. So this is being built. Let me go back to the slides real quick. There we go. So, we're at this stage right here. We're pushing the notification here and then building this. Once this is done, the notification will go here, CodePipeline will trigger, launch CloudFormation, and deploy the application. CloudFormation is quite easy once you get used to it, but it's only for AWS. One of the services is dedicated to AWS. However, if you want something more platform-agnostic, you've probably heard of a tool called Terraform. It's pretty cool because you use one DSN and then plug whatever you want behind it and start running about the same thing. Again, apples and oranges in different environments. I've seen a lot of customers doing this with VMware because it was already there, and you bought those very costly licenses. So, you need to use them at some point. By API interactions, by CloudFormation, you can do stuff inside AWS or outside AWS. I could talk about Chef, Puppet, Ansible, or Salt, where once your application is deployed, you have different configurations between UAT and production. You might put everything that is most stable into one place and then add the moving parts later. Once your AMI has been baked and used to deploy and create new instances in your environments, you can modify the configuration. These configuration management tools are really good for doing this at a large scale. Let's face it, I'm doing it with a few instances here, but you could do it for hundreds, 100,000 instances, maybe at the scale of some of our largest customers. You can do that. Absolutely. Most of what I'm doing is a one-shot thing just to show you how it would run the first time. Later, you might need to maintain more AMIs, maybe more versions of your application. Think about it. One AMI per version of your application. The boot time of a machine is one to three minutes for most UNIXs, quite fast. So, you can boot that and deploy it with the template that was the version of your application. You have a template and an AMI. You can deploy that very easily. Each AMI has a unique ID, like AMI-dash-something. You'd have to build some tools, maybe with automation, with the CLI, with some scripts, to manage those AMIs. You pay for the storage, so as you create more AMIs, you might want to have an automated way to recreate those AMIs quickly and make cost decisions. Storage or image generation time will take more or less money, so you have to tailor it to your needs. Does that make sense? So, while this image is building, it's taking a little bit of time, a little bit too much time actually. This is why I was asking for help. Right now, it's about 20 to 25 minutes to build an OpenBSD AMI. It's still reasonable by automation, pretty good, but we can make it faster. We can make it a lot faster. Drivers and tools are your best friends. Once you've committed, backed the AMI, notified your teams and code pipeline, deployed a new UAT environment using this new AMI, tested your application, and everything is satisfactory, you move on to the next stage. Can you create your AMI locally and then upload it to storage? This is what we're doing. Maybe Laurent's technique was a little bit faster than mine. From your experience, how long does it take to build an AMI from Vagrant? With Packer, it takes maybe 10 minutes, maximum? So, with different tools, we can split the building time in half. I'm sure we can go a lot faster. I'm really sure. Different techniques, maybe different results. You don't have to pick one or the other. Most of the time, you want to be on a stable OS and maybe rebuild the AMI and add extra stuff. When there's a new version coming out, you might want to rebuild completely from scratch. It's a combination. Most of the time, you're just going to be deploying your app anyway on the latest AMI you have. So, it's going to be really fast. It could be just deploying the app, doing what you were doing, starting from a stable OS, and building or rebuilding your AMI. All three make sense at some point in your development process. The takeaways from this are that DevOps is for AMIs but also maybe for containers. You've probably seen the process resembling other stuff like Docker. Try to use services instead of servers to make more time to experiment with more services. This is clearly going towards DevOps. Security is very important. When granting those services access to different parts, we have a service called IAM (Identity and Access Management) that handles users, groups, policies, and roles, which can be assumed by different services or resources to talk to each other. Along the way, when building this CI/CD pipeline, you'll have to use roles and make sure you use the least amount of privileges for the right amount of actions. One of the advantages is clearly to pay by the usage of what you need. One of the services I didn't show you is called CodeBuild, which can build your code, and you pay by the time of execution and the number of builds. So, quite interesting as well. Instead of maintaining everything together, Managed Services can help you with that. With that, I'll give you the remote. Before looking at the benchmark results, this is how you can help. If you're involved in the OpenBSD community, helping us improve the speed of those scripts is definitely top of the list. Please get in touch. If you're using FreeBSD, Colleen is our hero. Thank you so much for the hard work on building those AMIs and being part of the FreeBSD core team. He needs your help on testing FreeBSD on AWS, writing documentation, which is always important and sometimes the most difficult part of open source. Any help you can provide on having one-click feedback and instant everything for FreeBSD would be very nice. Today, we have the AMIs, but we would love to have proper packages, proper AMIs, ready to go with WordPress and whatever people like to run on FreeBSD. So, there are plenty of ways to help. Talk to him, flood him, he needs your help. We want to see more FreeBSD on AWS. Let's look at the benchmark results. Okay, let's just look at the numbers. I ran these tests again yesterday. Hopefully, they're complete now. C4 is 11 minutes and 42 seconds. X1 is 11 minutes and 38 seconds. And now all the guys say, "Yeah, I won. I knew the X1 was faster." And i3 is under 11 minutes. So, and it's fun because it's pretty much exactly the numbers I did yesterday. So, this is what we get. i3 wins. This goes to show a few things. NVMe storage is just insane. I knew it was going to be fast, but it's blazingly fast. You would think the X1, with its massive CPU, would destroy everything. But no. My guess is, when I spent a few hours looking at this build process again, the FreeBSD build process is just not parallel enough to leverage those 128 CPUs. There are lots of sequential steps running on a single core, and you waste a lot of time doing that. I don't think it can be helped, but you can see most of the time, you just don't see parallelism on a lot of steps. That's where you would win. That's where you would make a lot of speed. I actually ran these benchmarks this morning because I knew there were going to be tests here. The X1, if you only run with 64 parallelism, takes 10 minutes and 39 seconds. We actually have issues with kernel locking. We're just spending too much time with contention. There's work in head which improves that, and with a head kernel compiling 11.1, we can do it in 8 minutes and 31 seconds. So, there's definitely progress happening there, but it's a scalability issue in the kernel, not just with the build process. I'm using 11.1 release because it's the official AMI right now, but it's going to get faster. It's pretty interesting to see that not the biggest instance wins actually. I ran those same tests, exact same parameters, on a RAM disk. Here are the results. I have a minor improvement on C4, a small improvement on X1, and I get slower with the RAM disk on i3. My only conclusion here is that ZFS is just brilliant. But that's probably my troll for today. The last thing is how much this costs. We pay per hour and soon per second. Here are the prices. Performance is very nice, but at the end of the day, even if you're going to use i3, would you be willing to pay four to five times the hourly price just to gain a few seconds? I don't think so. That's the advice we give to customers all the time. When they ask, "I've got this workload, what instance size, what instance family, what instance size should I pick?" The only reasonable answer is, "Please run your benchmarks. Please run the actual application and figure it out." And then you get that performance level and decide how much you want to pay for that. If absolute speed matters, maybe for another application, you could say, "Every second counts." So, okay, I'm paying that premium here. But the right price point for this use case would be C4. Run your benchmarks. Synthetic benchmarks are nice, but the real testing should happen with the real workload. And then you can see what happens. As a conclusion, we talked about BSD today, but it's just the OS. All our customers on top of that run a crazy amount of open source, from databases to NoSQL to Hadoop, Puppet, Chef, Jenkins, and all the CI/CD tools. All of these, one way or another, work really well on AWS. Some of our services are even based on these pieces of technology, like Amazon RDS for relational databases, where you can pick from MySQL, Postgres, etc. All of these, one way or another, we help you run on your OS. Let's face it, sometimes it's running Linux, but we help BSD run better on AWS, and we don't just stop there. We really want to have as many open source projects running very well on AWS. Keep that in mind and feel free to ask questions later or get us on Twitter. That's my conclusion. Thank you very much for listening. We're really looking forward to having more FreeBSD and maybe a little bit of OpenBSD and NetBSD run on AWS. Thanks again. If you have questions, we'll hang around. Please show up. Thank you.

Running BSD on AWS

Transcript

Tags

About the Author