08:02:38 <jzb> #startmeeting Introduction to Amazon EC2 Container Service
08:02:38 <zodbot> Meeting started Tue Aug  2 08:02:38 2016 UTC.  The chair is jzb. Information about MeetBot at http://wiki.debian.org/MeetBot.
08:02:38 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
08:02:38 <zodbot> The meeting name has been set to 'introduction_to_amazon_ec2_container_service'
08:03:58 <jzb> #info speaker is David Duncan
08:04:02 <jzb> Agenda:
08:04:06 <jzb> Cluster Management
08:04:10 <jzb> Benefits
08:04:15 <jzb> Running Services
08:04:23 <jzb> Adding the ECS agent to Fedora Atomic
08:04:31 <jzb> Atomic with ECS in action
08:04:46 <jzb> why containers? They're a natural, easy platform for microservices
08:05:08 <jzb> what happens in public cloud: standard applications and simpler processes are abstracted away to make things simpler for people working with public cloud processes
08:05:27 <jzb> building containers is straightforward, what is hard is getting the scheduler right
08:05:35 <jzb> businesses come to Amazon and want a way to handle that
08:05:44 <jzb> figure out what instances you have available, etc.
08:06:08 <jzb> Start with an EC2 instance, the base for what you're running Docker on; inside Docker, separate containers, groups of containers, and storage go into a task definition.
08:06:23 <jzb> when you run a task definition, you get a solid grouping for container management.
08:06:31 <jzb> track cpu and resources available for containers.
08:06:59 <jzb> scheduler is responsible for the tasks and their execution, once we have the tasks defined, we use a scheduler, a couple of schedulers to place them where you want.
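The idea of tracking free CPU and memory per container instance and letting a scheduler place tasks can be sketched in a few lines of Python. This is only a first-fit toy under stated assumptions: the field names (`cpu_free`, `mem_free`) and the instance IDs are made up for illustration, and it is not how the ECS schedulers actually decide placement.

```python
def place_task(task, instances):
    """Reserve resources on the first instance with room; return its id or None."""
    for instance in instances:
        fits_cpu = instance["cpu_free"] >= task["cpu"]
        fits_mem = instance["mem_free"] >= task["memory"]
        if fits_cpu and fits_mem:
            # Reserve the resources so later placements see the updated state.
            instance["cpu_free"] -= task["cpu"]
            instance["mem_free"] -= task["memory"]
            return instance["id"]
    return None  # nothing in the cluster can hold this task right now


# Hypothetical container instances and tasks (CPU units, memory in MiB).
instances = [
    {"id": "i-a", "cpu_free": 512, "mem_free": 512},
    {"id": "i-b", "cpu_free": 1024, "mem_free": 2048},
]
tasks = [
    {"name": "web", "cpu": 256, "memory": 256},
    {"name": "batch", "cpu": 512, "memory": 1024},
    {"name": "web2", "cpu": 256, "memory": 256},
    {"name": "big", "cpu": 1024, "memory": 512},
]

placements = {task["name"]: place_task(task, instances) for task in tasks}
print(placements)
```

A long-running service scheduler and a batch scheduler could both call something like `place_task`; what keeps them from double-booking an instance is the shared cluster state discussed later in the talk.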
08:07:20 <jzb> cluster management engine that underlies the service, no requirement to touch that; you just leverage it for utilization
08:07:44 <jzb> what we're going to talk about is the ecs agent, running on instances. What you refer to as a standard instance, becomes the container instance.
08:08:09 <jzb> there is an agent communication service, and an API available there. It wraps a lot of docker commands and gives native access to the EC2 Container Service API.
08:08:45 <jzb> so, basically what we do to coordinate this, provide a key/value store, the heart of the process that keeps the cluster state available across all the container instances. Maintaining that at scale is a big deal.
08:08:50 <jzb> Take a look at how we maintain that.
08:08:59 <jzb> Amazon ECS under the hood:
08:09:16 <jzb> look at a simple transactional algorithm: the key/value store only stores writes that are handled after the last read, right?
08:09:36 <jzb> if your system reads a key/value pair, that becomes the snapshot we base the next write on. Writes never get out of order.
08:10:11 <jzb> if you have multiple clients doing writes/reads, you end up where a system that has read at n+2 tries to write at n+3; another scheduler that reads at n+5 is only able to write at the n+6 step
08:10:31 <jzb> allows to have a combination of events occurring at different locations, ... goes back to cluster manager to maintain that everything stays the same.
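The read-then-write rule described above is a form of optimistic concurrency control. A minimal Python sketch follows, assuming a simple per-key version counter; the class, method names, and key are hypothetical and say nothing about how the ECS key/value store is actually implemented.

```python
class VersionedStore:
    """Toy key/value store that accepts a write only if it is based on the latest read."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        """Return (version, value); a key that was never written is at version 0."""
        return self._data.get(key, (0, None))

    def write(self, key, value, read_version):
        """Store the value only if nobody has written since read_version."""
        current_version, _ = self.read(key)
        if read_version != current_version:
            return False  # stale snapshot: the caller must re-read and retry
        self._data[key] = (current_version + 1, value)
        return True


store = VersionedStore()
key = "instance-1/cpu-free"  # hypothetical key name
store.write(key, 1024, read_version=0)

# Two schedulers read the same snapshot.
ver_a, free_a = store.read(key)
ver_b, free_b = store.read(key)

ok_a = store.write(key, free_a - 256, ver_a)  # first writer wins
ok_b = store.write(key, free_b - 512, ver_b)  # based on a stale read, rejected

# Scheduler B re-reads the new snapshot and retries.
ver_b, free_b = store.read(key)
ok_b_retry = store.write(key, free_b - 512, ver_b)
```

The rejected write is the point of the design: a scheduler that loses the race has to look at the current state before it can claim resources, so two schedulers never both allocate the same capacity.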
08:10:43 <jzb> in action, we have the API, agent / communication service all talking to cluster management engine.
08:10:49 <jzb> all sanity checking done by key/value store.
08:11:07 <jzb> can actually maintain an entire cluster across a span of high latency communication areas.
08:11:13 <jzb> clusters are always in a specific region
08:11:36 <jzb> one of the principles of aws configuration, every region is independent.
08:12:07 <jzb> multi-region: suggest systems that are each single-region, with latency-based routing so clients hit the systems most local to them. Create restrictions based on geo.
08:12:46 <jzb> API, choose whatever scheduler you want to use. Can leverage api (all open source on github) to make the requests for resources. Then run those tasks accordingly.
08:12:57 <jzb> tasks = combination of containers + storage.
08:13:27 <jzb> multiple schedulers, multiple resources, can schedule task by any one of these, could be long-running, could fit for batch, each one could be working at vastly different rates.
08:13:47 <jzb> cluster management, always look back to key/value to make sure resources are freed before deployment
08:14:20 <jzb> if scheduler yellow is trying to access resources ... then there's always going to be [] for that occurring.
08:14:53 <jzb> we get a full-scale shared-state system; provide your own scheduler for it, allocate resources as you see necessary. Container instances ... can then be autoscaled; you don't have a single group associated with cluster resources.
08:15:06 <jzb> all of central control and monitoring is happening through cluster management system.
08:15:42 <jzb> here are some of the scale-out numbers... we do scale up to a number of nodes fairly quickly.
08:16:02 <jzb> Flexible container placement - two specific schedulers by default, one is for long-running, another is for batch jobs.
08:16:33 <jzb> designed for use with the other AWS services that it touches, and easy to integrate.
08:17:07 <jzb> elastic load balancing, each of your containers present TCP or UDP from container instance, each port can be attached to single load balancer, scale out service, whatever task is ... scale behind single load balancer.
08:18:04 <jzb> agents are open source
08:18:11 <jzb> they're extensible
08:18:23 <jzb> http://github.com/aws/amazon-ecs-agent
08:18:42 <jzb> http://github.com/aws/amazon-ecs-cli
08:18:58 <jzb> about open source: easy to pull through CI/CD, can roll right into the process of creating containers.
08:19:15 <jzb> can actually provision on process... if you had test vs. production cluster.
08:19:52 <jzb> look closer at what a task definition is. A task is something handled by a container. If you're looking at how that .. is going to run, it's going to be found in the task definition. Also includes storage.
08:20:20 <jzb> you can actually identify how storage is associated with a particular container and the run command in that definition; altogether that defines the resources the scheduler uses for that task
08:20:30 <jzb> task storage can be EBS or Elastic File System; not S3
08:20:48 <jzb> can you bring your own storage layer?
08:21:00 <jzb> Yes - can bring gluster or your own
08:21:04 <jzb> "Gluster is a great fit"
08:21:09 <jflory7> jzb: Please use #meetingname flock2016 before ending the meeting, it will sort all Fedora Flock talks together in Meetbot. :)
08:21:32 <jzb> looking at graphic interface, defining container name, image, max memory, max cpu, port mappings
08:21:36 <jzb> jflory7: ack
08:21:52 <jzb> jflory7: what if I used a name when starting?
08:22:03 <jzb> can also be json file
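As a rough illustration of the JSON form, the fields named in the talk (container name, image, max memory, max CPU, port mappings, storage) map onto a task definition along these lines. The family name, image, ports, and paths below are placeholders chosen for the example, not values from the talk.

```python
import json

# A minimal task definition: one container plus one host-backed volume.
task_definition = {
    "family": "web-example",  # placeholder family name
    "containerDefinitions": [
        {
            "name": "web",
            "image": "nginx:latest",  # placeholder image
            "cpu": 256,               # CPU units reserved for the container
            "memory": 128,            # hard memory limit in MiB
            "essential": True,
            "portMappings": [
                {"containerPort": 80, "hostPort": 8080, "protocol": "tcp"}
            ],
            "mountPoints": [
                {"sourceVolume": "web-data",
                 "containerPath": "/usr/share/nginx/html"}
            ],
        }
    ],
    "volumes": [
        # Placeholder host path; the volume name ties it to the mount point above.
        {"name": "web-data", "host": {"sourcePath": "/srv/web-data"}}
    ],
}

task_json = json.dumps(task_definition, indent=2)
print(task_json)
```

Saved to a file, JSON of this shape is what a task definition registration call would take as input; the console form shown in the talk fills in the same fields.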
08:22:22 <jflory7> jzb: #meetingname always overrides for how it's saved in the logs - the name in the meeting is sort of just like an HTML header if you end up using #meetingname later on.
08:22:29 <jzb> ok
08:22:43 <jzb> #meetingname flock2016
08:22:43 <zodbot> The meeting name has been set to 'flock2016'
08:22:57 <jzb> how many units are associated with any given task, and then autoscale accordingly
08:23:16 <jzb> task defines a unit of work, associated with one or more specific containers.
08:23:27 <jzb> actual resources ... with particular target
08:23:43 <jzb> create a service - good for long-running applications and services.
08:23:51 <jzb> create a service for something you want to keep running at all times.
08:24:02 <jzb> associate with a specific load balancer, that will scale out the back end.
08:24:41 <jzb> another thing is, you have the container instance you've defined
08:24:55 <jzb> in context of atomic, going in with ostree and literally moving to a different version of the environment.
08:25:26 <jzb> we can take a machine image for new environment and add that to our stack. Now we actually have a standard red light/green light, multiple auto-scaling groups.
08:25:42 <jzb> can drain connections out of old systems, and once they're completely drained, go to full new deployment.
08:26:07 <jzb> now, all of this is the new infra. All scaled down, none of the connections were dropped, we just removed dns entries associated with older instances.
08:26:13 <jzb> once there's no service load, they can just go away.
08:26:30 <jzb> Now we're just going to look @ code
08:26:37 <jzb> grabbing most recent atomic instance
08:26:52 <jzb> grabbing -1 is the easiest way if you sort by creation date.
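The "sort by creation date and grab -1" step can be sketched as below. The list mimics the shape of an EC2 DescribeImages response (`ImageId`, `Name`, `CreationDate`), but the IDs, names, and dates are invented for the example, and a real run would fetch the list from the API instead.

```python
# Sample data shaped like a DescribeImages result; every value here is made up.
images = [
    {"ImageId": "ami-00000002", "Name": "atomic-example-b",
     "CreationDate": "2016-07-19T14:02:11.000Z"},
    {"ImageId": "ami-00000003", "Name": "atomic-example-c",
     "CreationDate": "2016-07-28T09:45:30.000Z"},
    {"ImageId": "ami-00000001", "Name": "atomic-example-a",
     "CreationDate": "2016-06-30T22:17:05.000Z"},
]


def newest_image(images):
    """Sort by creation date (ISO 8601 strings sort chronologically) and take the last."""
    ordered = sorted(images, key=lambda image: image["CreationDate"])
    return ordered[-1]


latest = newest_image(images)
print(latest["ImageId"], latest["CreationDate"])
```

Sorting on the raw string works only because the timestamps share one fixed-width UTC format; mixed formats would need to be parsed first.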
08:27:04 <jzb> once we have that machine image, use the run-instances command.
08:27:09 <jzb> whatever size is appropriate
08:27:38 <jzb> generally choose a small, because of sustained bandwidth - something measurable, rather than something subject to ?
08:28:53 <jzb> can schedule so all instances are turned off at specific time per day.
08:29:44 <jzb> grab ecs agent from docker registry, easy to do. Once I have that, now I can create an ECS-optimized AMI
08:29:58 <jzb> used a couple of directions with CLI
08:30:05 <jzb> there's a create-image call I don't think I recorded on the slide.
08:30:40 <jzb> current version of atomic, ecs image... is going to maintain that. A couple of commands, associated on github repo, couple of directories to be created for logging.
08:30:56 <jzb> Once you have those created you want to create the image.
08:31:05 <jzb> then run create image, for particular instance.
08:31:36 <jzb> changed shutdown behavior from terminate to stop, so in case anything went wrong you still have the image to troubleshoot
08:32:05 <jzb> have generated small cli skeleton, structure for ... parameters, can place that in template, that template can be used to run command consistently.
08:32:26 <jzb> once you have skeleton, can manage with boto
08:32:34 <jzb> can manage it programmatically
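One way to picture the skeleton-as-template workflow: the AWS CLI can emit a parameter skeleton with `--generate-cli-skeleton` and read it back with `--cli-input-json`. The sketch below fills a trimmed stand-in for such a skeleton; the key names are real `run-instances` parameters, but the AMI ID, key pair name, and helper function are hypothetical.

```python
import json

# Trimmed stand-in for `aws ec2 run-instances --generate-cli-skeleton` output.
skeleton = {
    "ImageId": "",
    "MinCount": 0,
    "MaxCount": 0,
    "KeyName": "",
    "InstanceType": "",
    "InstanceInitiatedShutdownBehavior": "",
}


def fill_template(skeleton, **params):
    """Return only the supplied parameters, rejecting names the skeleton lacks."""
    unknown = set(params) - set(skeleton)
    if unknown:
        raise KeyError("not in skeleton: " + ", ".join(sorted(unknown)))
    # Keep skeleton ordering and leave unset parameters out of the request.
    return {key: params[key] for key in skeleton if key in params}


request = fill_template(
    skeleton,
    ImageId="ami-00000003",       # placeholder AMI ID
    MinCount=1,
    MaxCount=1,
    InstanceType="t2.small",
    KeyName="example-key",        # placeholder key pair name
    InstanceInitiatedShutdownBehavior="stop",  # stop, not terminate, on shutdown
)
print(json.dumps(request, indent=2))
```

The resulting JSON could be passed to the CLI as `--cli-input-json file://...`, and the same keys are the keyword arguments that boto3's EC2 `run_instances` call accepts, so one template can serve both paths.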
08:33:15 <jzb> taking that instance we updated, image ID ... associated with the volume and the instance definition.
08:33:30 <jzb> if you allocate more storage, those devices will be recreated.
08:34:24 <jzb> creating instance, could have done through autoscaling, but chose to do as single instance because it's more clear
08:34:39 <jzb> one instance with default config, for this particular region.
08:34:50 <jzb> shell in, start container, docker run on configuration.
08:35:02 <jzb> this is the important part, there's a connection to the unix socket
08:35:06 <jzb> requires to run privileged
08:35:13 <jzb> "not a solution, it's a workaround."
08:35:22 <jzb> --privileged --net=host required
08:35:30 <jzb> or agent will fail + fail repeatedly
08:36:08 <jzb> ECS_CLUSTER associates the agent with a specific cluster
08:36:16 <jzb> if you don't set that, it will register with the default cluster
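Putting the pieces from this part of the talk together (the Docker unix socket, `--privileged`, `--net=host`, the logging directories, and `ECS_CLUSTER`), the agent launch might look roughly like the command assembled below. The host paths `/var/log/ecs` and `/var/lib/ecs/data`, the container name, and the `latest` tag are assumptions for the sketch, not details confirmed in the talk.

```python
import shlex


def ecs_agent_command(cluster="default"):
    """Build a `docker run` argument list for the ECS agent container."""
    return [
        "docker", "run", "--name", "ecs-agent", "--detach",
        "--privileged",   # required on Atomic per the talk (a workaround)
        "--net=host",     # also required, or the agent fails repeatedly
        # The agent talks to Docker over the unix socket.
        "--volume=/var/run/docker.sock:/var/run/docker.sock",
        # Assumed host directories for agent logs and state.
        "--volume=/var/log/ecs:/log",
        "--volume=/var/lib/ecs/data:/data",
        "--env=ECS_LOGFILE=/log/ecs-agent.log",
        "--env=ECS_DATADIR=/data",
        # Cluster the instance registers with.
        "--env=ECS_CLUSTER=" + cluster,
        "amazon/amazon-ecs-agent:latest",
    ]


command = ecs_agent_command("flock-demo")  # hypothetical cluster name
print(" ".join(shlex.quote(arg) for arg in command))
```

Building the argument list rather than a single string keeps it safe to hand to `subprocess.run` later without shell quoting problems.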
08:37:01 <jzb> specific container instance, assign specific instances, define metrics to scale.
08:47:32 <jzb> #endmeeting