flock2016
LOGS
08:59:56 <jzb> #startmeeting flock2016
08:59:56 <zodbot> Meeting started Tue Aug  2 08:59:56 2016 UTC.  The chair is jzb. Information about MeetBot at http://wiki.debian.org/MeetBot.
08:59:56 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
08:59:56 <zodbot> The meeting name has been set to 'flock2016'
09:00:17 <jzb> welcome everybody, my name is Thomas Cameron, today we're talking about container security
09:00:32 <jzb> Thomas -- global cloud strategy evangelist
09:00:52 <jzb> what we'll be talking about today, a bit about me, a bit about rht, what containers are, how they work, what they are not
09:01:01 <jzb> confusion about when to use containers
09:01:19 <jzb> talk about components that make up container security, linux control groups, docker daemon, and security it provides and linux kernel capabilities
09:01:25 <jzb> security enhanced linux (SELinux)
09:01:34 <jzb> and tips and tricks for security and then draw some conclusions
09:01:47 <jzb> been working in info technology since 1993, started my career out of school as a police officer
09:01:58 <jzb> enjoy forensics, security, weird hybrid, security hybrid,
09:02:19 <jzb> been with rht since 2005, rht security specialist, been in IT long enough, background in MSFT security, Novell, remember IPX SPX
09:02:45 <jzb> spent a lot of time focusing on security in environments like retail, manufacturing, been at rht long enough to know there's always somebody who knows more than I do.
09:02:56 <jzb> rht has been working on containerization since 2010
09:03:21 <jzb> acquired Makara, became RHT OpenShift, they had technology analogous to containers today
09:03:24 <jzb> called cartridges
09:03:31 <jzb> as tends to happen in open source community
09:03:50 <jzb> even though we liked technology from Makara, community and industry showed what docker was doing made a lot of sense.
09:04:03 <jzb> started participating in docker community; last time I checked we were #2 contributor upstream
09:04:20 <jzb> not "ooh, we're red hat" but we have a responsibility to contribute as good stewards in open source
09:04:36 <jzb> docker is doing some amazing things, many companies are all on board with container standardization
09:04:41 <jzb> with docker as standard container format
09:04:54 <jzb> even MSFT has announced they will support container formats
09:05:17 <jzb> docker is technology that allows you to have applications abstracted from and in some isolation from the underlying operating systems.
09:05:45 <jzb> containers can enable incredible application density, since you don't have the overhead of a full os image for each app.
09:05:55 <jzb> Linux control groups also enable maximum utilization of the system.
09:06:03 <jzb> same container can run on different versions of linux
09:06:07 <jzb> - ubuntu on Fedora
09:06:12 <jzb> - CentOS containers on RHEL
09:06:18 <jzb> cats and dogs, living together...
09:06:32 <jzb> can be very precise in resource utilization
09:06:54 <jzb> what are containers not?
09:06:57 <jzb> not the cure to all that ails you
09:07:51 <jzb> containers are not virtualization
09:08:01 <jzb> a lot of people making comparisons, but it's not the same thing.
09:08:10 <jzb> kernel namespaces
09:08:21 <jzb> just a way to make a global resource appear to be unique to a process.
09:08:36 <jzb> can include mount, pid, uts, ipc, network and user namespaces.
09:08:44 <jzb> all processes belong to one of these namespaces
09:09:02 <jzb> let's drill down into what each means
09:09:15 <jzb> mount - makes up the process's view of the filesystem hierarchy
09:09:23 <jzb> other mounts can be added for security and convenience.
09:10:13 <jzb> some locations such as /proc/sys are read only as it's not completely virtualized yet
09:10:47 <jzb> inside container only see filesystems that are abstracted through namespace
09:11:06 <jzb> don't want to see the host unless you specifically mount a space on the host operating system
09:11:14 <jzb> the mount namespace is how we do abstraction
09:11:21 <jzb> isolate pid numbers inside the house
09:11:23 <jzb> er, host
09:11:37 <jzb> container thinks bash is pid 1
09:11:52 <jzb> process id namespaces tell processes inside container "yes, bash is #1"
09:12:03 <jzb> point is we want the container to not have awareness of what's going on in the host
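On Linux, each process's namespace memberships are exposed as symlinks under /proc/<pid>/ns, with targets such as "pid:[4026531836]". A minimal sketch of comparing them, assuming that link format; the helper names parse_ns_link, same_namespace and read_namespaces are made up for illustration and are not from the talk:

```python
import os
import re

# A namespace symlink target looks like "pid:[4026531836]":
# the namespace kind, then an inode number identifying the instance.
_NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target):
    """Split a namespace symlink target into (kind, inode)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"not a namespace link: {target!r}")
    return match.group("kind"), int(match.group("inode"))


def same_namespace(target_a, target_b):
    """Two processes share a namespace when kind and inode both match."""
    return parse_ns_link(target_a) == parse_ns_link(target_b)


def read_namespaces(pid="self"):
    """Read all namespace links for a process (Linux only)."""
    ns_dir = f"/proc/{pid}/ns"
    return {
        name: os.readlink(os.path.join(ns_dir, name))
        for name in sorted(os.listdir(ns_dir))
    }
```

Calling read_namespaces() on the host and again from a shell inside a container would be expected to show different inode numbers for namespaces such as pid and mnt, which is what keeps the container from seeing the host's processes and mounts.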
09:12:30 <jzb> user namespaces map UID and GID so that inside container I can have a container that appears to have root inside container but not outside / in the host.
09:12:43 <jzb> can actually add ranges .. the challenge with user namespaces.
09:13:01 <jzb> if I spin 50 containers, uid 0 or root access .. map to user that spawned container
09:13:12 <jzb> it's isolated, ... layer sharing is hard and needs work in Linux VFS
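The UID mapping described above is visible in /proc/<pid>/uid_map, where each line holds three numbers: the first ID inside the namespace, the first ID outside it, and the length of the range. A rough sketch of the translation, using a hypothetical sample mapping (the 100000 offset and the function names are illustrative assumptions, not values from the talk):

```python
def parse_uid_map(text):
    """Parse uid_map text into (inside_start, outside_start, length) tuples."""
    ranges = []
    for line in text.splitlines():
        if line.strip():
            inside, outside, length = (int(field) for field in line.split())
            ranges.append((inside, outside, length))
    return ranges


def to_host_uid(ranges, container_uid):
    """Translate a UID seen inside the container to the host UID.

    Returns None when the UID falls outside every mapped range.
    """
    for inside, outside, length in ranges:
        if inside <= container_uid < inside + length:
            return outside + (container_uid - inside)
    return None


# Hypothetical mapping: container UIDs 0-65535 map to host UIDs 100000-165535.
SAMPLE_UID_MAP = "         0     100000      65536\n"

if __name__ == "__main__":
    ranges = parse_uid_map(SAMPLE_UID_MAP)
    print("container root is host uid", to_host_uid(ranges, 0))
```

With a mapping like this, a process that is UID 0 inside the container is an unprivileged UID on the host, which is the point made above about root inside but not outside.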
09:14:18 <jzb> IPC namespaces - what are they? Masking IPCs so table of IPCs appears to be global but is not.
09:14:44 <jzb> in this case run bash, according to what's going on inside container... got zillions of them. IPC - make sure isolated what's going on inside container
09:14:53 <jzb> changing gears, linux control groups.
09:14:56 <jzb> sets of tasks
09:15:14 <jzb> children into hierarchical .. into a control group with limits
09:15:25 <jzb> if something happens to process in control group, even if somebody does something fancy
09:15:47 <jzb> forkbomb in a control group, won't take rest of system
09:16:02 <jzb> even if container is compromised, poorly-written code, misbehaved container should not impact host or other containers.
09:16:19 <jzb> systemctl status docker
09:16:24 <jzb> shows control groups
09:16:52 <jzb> can navigate through /sys/fs/cgroup pseudo-directory to see processes
09:17:19 <jzb> limit to 100mb of memory, and look at /proc/1/cgroup - see slices allocated to that container, all the same scope
09:17:32 <jzb> if I look at memory.limit_in_bytes
09:17:51 <jzb> did fork bomb, what happens is that in a limited amount of time the docker container dies and exits back to command prompt
09:18:00 <jzb> if you look in log file, OOM gets invoked for one container.
09:18:09 <jzb> kill container, rest of OS is unaffected.
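The /proc/1/cgroup check mentioned above can be sketched as follows. Each line of that file has the form "hierarchy-id:controller-list:path". This assumes the cgroup v1 layout that was current at the time of the talk; the scope name docker-abc123.scope is a hypothetical placeholder, and the function names are illustrative:

```python
def parse_proc_cgroup(text):
    """Map each controller name to its cgroup path.

    Input lines look like "10:memory:/system.slice/docker-<id>.scope".
    """
    paths = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        _hierarchy_id, controllers, path = line.split(":", 2)
        for controller in controllers.split(","):
            paths[controller] = path
    return paths


def memory_limit_file(cgroup_path, root="/sys/fs/cgroup/memory"):
    """Build the cgroup v1 path of the memory limit file for a cgroup."""
    return root + cgroup_path + "/memory.limit_in_bytes"


# Hypothetical contents of /proc/1/cgroup as seen from inside a container.
SAMPLE_CGROUP = """\
10:memory:/system.slice/docker-abc123.scope
4:cpu,cpuacct:/system.slice/docker-abc123.scope
1:name=systemd:/system.slice/docker-abc123.scope
"""

# A 100 MB limit expressed in bytes, as memory.limit_in_bytes reports it.
LIMIT_100_MB = 100 * 1024 * 1024

if __name__ == "__main__":
    paths = parse_proc_cgroup(SAMPLE_CGROUP)
    print(memory_limit_file(paths["memory"]), "should contain", LIMIT_100_MB)
```

Every controller pointing at the same scope is what "all the same scope" refers to: the container's processes are grouped together, so the OOM killer acts on that group rather than on the rest of the system.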
09:18:32 <jzb> the docker daemon itself
09:18:50 <jzb> is responsible for managing the control groups, orchestrating namespaces, etc.
09:19:00 <jzb> docker does run with root privs, be aware of that
09:19:11 <jzb> only allow trusted users to run docker
09:19:50 <jzb> if using REST API, make sure using latest versions, don't have any vulns exposed, keep systems up to date. make sure have strong auth.
09:20:14 <jzb> linux kernel capabilities
09:20:24 <jzb> historically root user had ability to do anything, complete access to system.
09:20:30 <jzb> breaks root privs into 32 distinct controls
09:20:38 <jzb> can grant only those privs required to do a job
09:20:55 <jzb> can use to take away privs that root user has or grant privs to non-root user.
09:21:09 <jzb> regular user can bind to privileged port, or lot of neat things you can do with linux caps.
09:21:16 <jzb> catchall capability called CAP_SYS_ADMIN
09:25:22 <jzb> #info [connection dropped, missed several minutes of transcription]
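A process's effective capabilities appear as a hexadecimal bitmask on the CapEff line of /proc/<pid>/status, one bit per capability. A small sketch of decoding such a mask; the table lists only a handful of capabilities for brevity, and decode_caps is an illustrative name rather than an existing tool:

```python
# Bit positions as defined in the kernel's linux/capability.h.
# Only a few are listed here; unknown bits are reported by number.
CAPABILITY_BITS = {
    0: "CAP_CHOWN",
    5: "CAP_KILL",
    10: "CAP_NET_BIND_SERVICE",
    13: "CAP_NET_RAW",
    21: "CAP_SYS_ADMIN",
}


def decode_caps(hex_mask):
    """Return the names of the capabilities set in a CapEff-style hex mask."""
    mask = int(hex_mask, 16)
    names = []
    bit = 0
    while mask >> bit:
        if mask & (1 << bit):
            names.append(CAPABILITY_BITS.get(bit, f"bit_{bit}"))
        bit += 1
    return names


if __name__ == "__main__":
    # A mask with only bits 0 and 10 set.
    print(decode_caps("0000000000000401"))
```

A non-root process holding only CAP_NET_BIND_SERVICE can bind to a privileged port without any of root's other powers. A mask that includes CAP_SYS_ADMIN deserves attention because that one bit covers a wide range of operations.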
09:25:43 <jzb> SELinux ... stored on filesystem as extended attributes
09:25:46 <jzb> or in memory by kernel
09:26:13 <jzb> selinux_user:selinux_role:selinux_type:mcs:mcs
09:26:22 <jzb> don't use selinux user or role
09:26:32 <jzb> really only care about type
09:26:42 <jzb> mcs extra identifiers
09:26:49 <jzb> even though identical
09:27:12 <jzb> foo type on both - but difference in mcs type, SELinux says they're completely different
09:27:31 <jzb> type enforcement.. policy prevent from interacting
09:27:54 <jzb> neither process would be able to access /etc/shadow
09:28:07 <jzb> all processes run in same context on system running docker
09:28:16 <jzb> but on openshift, all processes run in their own context
09:28:26 <jzb> selinux prevents from attacking other containers
09:29:29 <jzb> even if you're running as root, selinux can block from doing malicious things - even trying to do runcon change selinux context
09:30:00 <jzb> even though you're running as root, in openshift context, can't open home, can't run setenforce 0
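The label layout and the "same type, different MCS" point can be illustrated with a small parser. This is a deliberately simplified sketch: real SELinux compares MCS levels by dominance and supports category ranges such as c0.c1023, neither of which is handled here. The type foo_t and the category values are hypothetical:

```python
from collections import namedtuple

Context = namedtuple("Context", "user role type sensitivity categories")


def parse_context(label):
    """Split "user:role:type:sensitivity:categories" into its fields."""
    user, role, type_, level = label.split(":", 3)
    sensitivity, _, cats = level.partition(":")
    categories = frozenset(cats.split(",")) if cats else frozenset()
    return Context(user, role, type_, sensitivity, categories)


def separated_by_mcs(label_a, label_b):
    """True when two labels share a type but differ in MCS categories.

    Simplification: treats any difference in the category set as
    separation, instead of applying SELinux dominance rules.
    """
    ctx_a, ctx_b = parse_context(label_a), parse_context(label_b)
    return ctx_a.type == ctx_b.type and ctx_a.categories != ctx_b.categories


if __name__ == "__main__":
    a = "system_u:system_r:foo_t:s0:c1,c2"
    b = "system_u:system_r:foo_t:s0:c3,c4"
    print(parse_context(a).type, separated_by_mcs(a, b))
```

Two containers labeled like a and b run with the identical type, yet the differing categories are what lets policy keep one from touching the other's files and processes.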
09:30:18 <jzb> let's talk about seccomp
09:30:23 <jzb> syscall filtering
09:30:31 <jzb> kill, errno, allow, trap, trace
09:30:47 <jzb> can disable system calls like kexec_load, init_module
09:30:49 <jzb> etc.
09:31:03 <jzb> json file that blocks getcwd system call
09:31:30 <jzb> run docker run -it and read seccomp - when runs busybox... can't see directory
09:31:53 <jzb> can take core system calls + blacklist and if you make part of cli to launch container you can block those system calls
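A JSON profile of the kind described can be generated programmatically. The sketch below assumes the "names" list form used by current Docker seccomp profiles; the schema has changed across Docker releases, so check it against the version in use. build_profile is an illustrative helper, not part of Docker:

```python
import json

# The five filter actions mentioned in the talk, as seccomp action constants.
ACTIONS = {
    "kill": "SCMP_ACT_KILL",
    "errno": "SCMP_ACT_ERRNO",
    "allow": "SCMP_ACT_ALLOW",
    "trap": "SCMP_ACT_TRAP",
    "trace": "SCMP_ACT_TRACE",
}


def build_profile(blocked_syscalls, action="errno"):
    """Build a blacklist-style profile: allow everything except the listed calls."""
    return {
        "defaultAction": ACTIONS["allow"],
        "syscalls": [
            {"names": sorted(blocked_syscalls), "action": ACTIONS[action]},
        ],
    }


if __name__ == "__main__":
    # Block getcwd, as in the demo, and print the JSON to save as a file.
    print(json.dumps(build_profile(["getcwd"]), indent=2))
```

The resulting file is passed on the command line when launching the container, for example docker run -it --security-opt seccomp=<profile file> busybox, after which calls to the blocked syscall fail inside that container.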
09:31:56 <jzb> tips and tricks
09:32:12 <jzb> remember that containers at the end of the day just processes running on host.
09:32:26 <jzb> process in place to update containers + update containers.
09:32:37 <jzb> always run at lowest possible privilege
09:32:45 <jzb> mount host as read-only
09:33:49 <jzb> DON'T just download random containers from the internet
09:33:56 <jzb> don't run SSH inside container
09:34:05 <jzb> don't run with root privs
09:34:13 <jzb> don't disable selinux
09:36:07 <jzb> use selinux aware containers
09:36:18 <jzb> don't roll your own containers once and never maintain them
09:36:31 <jzb> need lifecycle management
09:36:40 <jzb> don't run unsupported in Fedora
09:36:44 <jzb> sorry
09:36:50 <jzb> "don't run unsupported in production"
09:37:23 <jzb> containers do leverage cool technology in kernel and are _relatively_ secure
09:37:34 <jzb> if done right can be very, very safe
09:37:37 <jzb> questions
09:37:51 <jzb> ask about selinux, actually confining processes in containers?
09:38:18 <jzb> yes, so - right now selinux only cares about process running on host, process on host is confined
09:38:32 <jzb> run a bunch of containers ... not sure what status of selinux in containers is.
09:38:43 <jzb> "not going to happen"
09:38:47 <jzb> not going to namespace selinux
09:40:12 <jzb> is it possible in container to modify system calls?
09:40:23 <jzb> Like syscall mapping?
09:40:30 <jzb> "Sure it can be done, but don't know how to do it."
09:44:15 <jzb> #endmeeting