Okay, my load balancer is ready. I can see my three instances are in service. Now, if I copy that DNS name and connect, I should be load balancing across all my instances. The problem is that the page served is identical on every instance, so there's nothing to tell us whether we're actually being load balanced or hitting the same instance every time. Let's fix that. Let's go back to my compose file and deploy a fantastic version 2 of this application. I just change the image tag in the compose file and redeploy the service; that's all I need to do. As you will see, ECS picks up the change and registers a new task definition revision, revision 7, and all the containers will be stopped and replaced with new ones.

Let's look at this in the console. That was fast; it killed all the old containers and is starting new ones. This is a dangerous situation because my service is down: all the revision 6 containers are dead, and the revision 7 containers haven't started yet. Why is that? It's because I set the `minimum healthy percent` parameter to zero. This is the lower bound, as a percentage of the desired task count, on how many tasks must keep running during a deployment. Setting it to zero means ECS is allowed to kill them all to make room for the new ones. For a dev environment, that's fine; it's the fastest way. For a production environment, you definitely do not want to do this because, as you can see, my service is down for a few seconds until the new containers are running. Do not set this to zero in production unless you're ready to tolerate downtime. It will be short, but it's still downtime, and especially in a microservice architecture you probably don't want that.

Depending on how many instances you're running and how fast you want the upgrade to be, you can set this anywhere between zero and 100. If you want a blue-green deployment, you could keep the minimum at 100% and set the maximum at 200%, so that you'd have, say, three containers of revision 6 and three containers of revision 7 running at the same time, allowing you to do testing. However, in this case you would need more instances, because each container binds to the fixed port 80, and there is only one port 80 per instance. If you want static ports plus 100% of the old containers and 100% of the new containers, you need twice the number of instances. It's a balance between the size of the cluster, the speed of the deployment, and whether you want a blue-green or a rolling upgrade. Once again, if you set this to zero, your service will be down, even if only for a few seconds.
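For reference, here's a hedged sketch of that redeploy with the ECS CLI; the image URI and file name are placeholders, and the deployment flags are where you'd pick something safer than my zero for production:

```sh
# Hypothetical compose file excerpt (docker-compose.yml) -- bump the image tag:
#   web:
#     image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/demo-app:2
#     ports:
#       - "80:80"

# Redeploy: ECS registers a new task definition revision and swaps the tasks.
# min-healthy 0 / max 100 is the fast, dev-only replace shown in this demo;
# min-healthy 100 / max 200 would give the blue-green behavior instead.
ecs-cli compose --file docker-compose.yml service up \
  --deployment-min-healthy-percent 0 \
  --deployment-max-percent 100
```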
My newer containers are running now. If I go to `index2.php`, which is the new page in version 2, I see the container hostname, and now I can see it's balanced. Safari shows the same behavior. You get the point; it's load balanced. That pretty much sums up the demo, where we looked extensively at creating a cluster, deploying a service, scaling a service, and explored a lot of ECS features. The only thing left to do is shut the service down. I can do `stop`, which takes the number of tasks for that service to zero, and ECS kills all those containers. Killing is always faster than launching, isn't it? I could also do `delete` if I wanted to, which would delete the service itself, the task definition, and so on. If I look at the cluster now, the service is gone. Finally, I need to delete the load balancer myself; otherwise the teardown will complain, since creating it was a manual operation. It should disappear from the auto-scaling group as well. There's a tiny console bug here, but that's fine: no more load balancer. Now I can just use the `down` command, which deletes the instances and the VPC by running a delete-stack operation on CloudFormation. That takes a few minutes. That concludes demo 1.
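Sketched with the ECS and AWS CLIs, and assuming a placeholder load balancer name, the teardown looks roughly like this:

```sh
# Scale the service's task count to zero (containers are killed quickly)
ecs-cli compose --file docker-compose.yml service stop

# Optionally remove the service definition itself
ecs-cli compose --file docker-compose.yml service rm

# The load balancer was created by hand, so it must be deleted by hand
# (name is a placeholder); otherwise the stack teardown will complain
aws elb delete-load-balancer --load-balancer-name demo-elb

# Tear down the cluster: instances, VPC, etc. via a CloudFormation delete-stack
ecs-cli down --force
```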
To summarize, we saw how to use ECR, build an image, push it, deploy it, scale it, and load balance it, while looking at a full roster of ECS features. We created a cluster and launched a service, though we could have launched multiple services. Everything ran on a fixed port, which works well with the ELB, since it needs fixed ports for its health checks. The downside is that there is only one port 80 per instance, so with fixed ports you can only run one container of a given service on a given instance. We can have three copies of service one on different instances, and the same goes for service two and service three, but this becomes a limitation if you need 20 copies of service one, because that requires 20 instances. In a microservice architecture, this gets worse: you have a platform built from 20, 30, or 50 microservices, each needing to be deployed and updated independently, which creates a very dynamic environment.
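To make the port constraint concrete, here's a sketch of the two compose-file port mappings (the service name is hypothetical); a fixed host port allows only one copy per instance, while omitting the host port lets Docker pick a random free one:

```sh
# Fixed mapping: host port 80 -> container port 80.
# Only one container on the instance can claim host port 80.
#   web:
#     ports:
#       - "80:80"
#
# Dynamic mapping: only the container port is given, so Docker assigns
# a random free host port -- many copies can share one instance.
#   web:
#     ports:
#       - "80"

# Inspect the randomly assigned host ports:
docker ps --format '{{.Names}} {{.Ports}}'
```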
In a microservice environment, you need to answer several questions. Can you deploy and scale each microservice independently? Can you run multiple copies of a service on the same server? How can a microservice register its name and port automatically? How do microservices discover each other? And how do you load balance them efficiently?

We saw that a microservice can be deployed and scaled independently on ECS as long as it has its own Docker image, task definition, and service definition. Running multiple copies of a microservice on the same EC2 instance is a challenge with fixed ports, so you let Docker assign a random port instead. For service registration, we use Registrator, which detects starting containers, grabs their IP address, name, and port, and sends that information to Consul, a distributed service-discovery and key-value store. Consul acts as the central authority for service discovery, answering with IP addresses and ports. For DNS resolution, we deploy a Consul agent on each instance, listening on port 53; it forwards ordinary DNS requests to an external DNS server but resolves service names itself.
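As a minimal sketch of that per-instance plumbing, assuming the official `consul` and `gliderlabs/registrator` images and placeholder addresses for the Consul server and the upstream resolver (binding port 53 may also need extra privileges, depending on the image):

```sh
# Consul agent on each instance: serve DNS on port 53 and forward
# anything outside .consul to an upstream resolver (addresses are placeholders)
docker run -d --name consul-agent --net=host consul agent \
  -retry-join 10.0.0.10 -client 0.0.0.0 \
  -dns-port 53 -recursor 10.0.0.2

# Registrator: watch the Docker socket and register each container's
# name, IP, and (dynamic) port in Consul as containers start and stop
docker run -d --name registrator --net=host \
  -v /var/run/docker.sock:/tmp/docker.sock \
  gliderlabs/registrator:latest consul://localhost:8500
```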
For load balancing, we distinguish internal services from user-facing services. Internal services are reached through DNS lookups that return SRV records with a TTL of 0, carrying both the IP address and the port number. For user-facing services, we use Fabio, a zero-configuration HTTP/HTTPS router that runs on a fixed port. Fabio watches Consul and builds its routing table dynamically, so the ELB can send traffic to Fabio's fixed port and have it routed to the right user-facing service.
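To illustrate the internal lookup (the service name comes from the demo's Stock service), an SRV query against the local Consul agent returns both pieces of information:

```sh
# SRV query against the local Consul DNS agent; the answer carries the
# instance's IP and the dynamically assigned host port, with TTL 0
dig @localhost stock.service.consul SRV

# Fabio itself needs no configuration: it routes any Consul service
# registered with a urlprefix- tag, e.g. urlprefix-/portal
```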
To summarize the setup, we have three instances, each running Fabio, the ECS agent, Registrator, and a Consul agent on port 53. When a user sends a request, it goes through the Internet Gateway to the ELB, which picks one of the Fabio instances. Fabio, using the information from Consul, routes to one of the Portal containers, which in turn does DNS lookups to reach internal services like Stock and Weather. All this information is updated dynamically as containers start and stop, making the system fully dynamic. The only weak point is that the Consul server is a standalone node; in a production environment, you would want to make it highly available. That's it for now, and we'll jump into the second demo to see all this in action.