Evolution of LXC (LinuX Containers)

Sameer Kandarkar

AGENDA

  • Lxc
  • Why Lxc
  • History
  • Lxc Technology Stack
  • Namespaces (Foundation of Lxc)
  • Resource Management with cgroups (why containers need cgroups)
  • Subsystems (resource controllers)
  • Demo
  • Q&A

Lxc

LXC, short for “Linux Containers”, is a solution for virtualizing software at the operating system level within the Linux kernel.

Unlike traditional hypervisors (think VMware, KVM and Hyper-V), LXC lets you run single applications in virtual environments, although you can also virtualize an entire operating system inside an LXC container if you’d like.

 

LXC’s main advantages include making it easy to control a virtual environment using userspace tools from the host OS, requiring less overhead than a traditional hypervisor and increasing the portability of individual apps by making it possible to distribute them inside containers.

 

 

Sounds similar?

Why LXC?

Does LXC sound a lot like Docker or CoreOS containers?

That is because LXC used to be the underlying technology that made Docker and CoreOS tick.

Docker has since replaced LXC with its own library (libcontainer). Still, LXC was at the origin of the container revolution several years ago, and LXC principles remain central to the way containers are developing.

How It all started


  • 1979: Unix V7: The chroot system call was introduced, changing the root directory of a process and its children to a new location in the filesystem. This advance was the beginning of process isolation.
  • 2000: FreeBSD Jails: Allow administrators to partition a FreeBSD computer system into several independent, smaller systems (called “jails”), with the ability to assign an IP address and configuration to each of them.
  • 2001: Linux VServer: A jail mechanism that can partition resources (file systems, network addresses, memory) on a computer system.
  • 2004: Solaris Containers: Combine system resource controls with the boundary separation provided by zones, which were able to leverage features like snapshots and cloning from ZFS.
  • 2005: OpenVZ (Open Virtuozzo): An operating system-level virtualization technology for Linux that uses a patched Linux kernel for virtualization, isolation, resource management and checkpointing. The code was not released as part of the official Linux kernel.
  • 2006: Process Containers: Launched by Google in 2006, Process Containers was designed for limiting, accounting and isolating the resource usage (CPU, memory, disk I/O, network) of a collection of processes. It was renamed “Control Groups (cgroups)” a year later and eventually merged into Linux kernel 2.6.24.

  • 2008: LXC: LXC (LinuX Containers) was the first and most complete implementation of a Linux container manager. It was implemented in 2008 using cgroups and Linux namespaces, and it works on a single Linux kernel without requiring any patches.

  • 2011: Warden: CloudFoundry started Warden in 2011, using LXC in the early stages and later replacing it with its own implementation. Warden can isolate environments on any operating system, running as a daemon and providing an API for container management.

  • 2013: LMCTFY: Let Me Contain That For You (LMCTFY) kicked off in 2013 as an open-source version of Google's container stack, providing Linux application containers.

  • 2013: Docker: And we know ...

Lxc Technology stack

LXCs are built on modern kernel features:

  • chroot: apparent root FS directory
  • namespaces: process-based resource isolation
  • cgroups: limits, prioritization, accounting & control
  • Linux Security Modules (LSM): Mandatory Access Control (MAC)

On top of the kernel features:

  • User space interfaces for kernel functions
  • LXC tools: tools to isolate process(es) by virtualizing kernel resources
  • LXC commoditization: LXC virtualization
  • Orchestration & management

 

Chroot :

A chroot on Unix operating systems is an operation that changes the apparent root directory for the current running process and its children.

A program that is run in such a modified environment cannot name (and therefore normally cannot access) files outside the designated directory tree.
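A small Python sketch can make the “cannot name files outside the tree” point concrete. It only models the path mapping; it does not call chroot itself. The function name `host_path` and the directory `/srv/rootfs` are hypothetical, the working directory inside the chroot is assumed to be `/`, and symbolic links are ignored.

```python
import posixpath

def host_path(new_root, path_in_chroot):
    """Return the host path that path_in_chroot refers to after chroot(new_root)."""
    # Inside the chroot, paths are resolved from the apparent root "/",
    # and ".." at the root stays at the root.
    inside = posixpath.normpath("/" + path_in_chroot.lstrip("/"))
    # The apparent root "/" is really new_root on the host.
    return posixpath.normpath(new_root.rstrip("/") + inside)

print(host_path("/srv/rootfs", "/etc/hostname"))      # /srv/rootfs/etc/hostname
print(host_path("/srv/rootfs", "../../etc/passwd"))   # /srv/rootfs/etc/passwd
```

However many `..` components are used, the resulting name stays below the new root directory.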

LSM:
Linux Security Modules (LSM) is a framework that allows the Linux kernel to support a variety of computer security models while avoiding favoritism toward any single security implementation. Examples include SELinux and Smack.

 

linux Namespaces

(The Foundation of Lxc)

Namespaces are the foundation of lightweight process virtualization. They enable a process and its children to have different views of the underlying system.

The kernel supports this through the unshare() and setns() system calls, and through six flag constants that can be passed to the clone(), unshare(), and setns() system calls.

 

clone():

Creates a new process and attaches it to the new namespaces specified by the flags.

unshare():

Attaches the current process to a new namespace of the specified type.

setns():

Attaches the calling process to an already existing namespace.
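On Linux, the namespaces a process belongs to are visible as symbolic links under /proc/<pid>/ns, and setns() takes a file descriptor opened on one of these links. The Python sketch below only reads the links, so it needs no privileges. The helper name `list_namespaces` is made up for this example, and it returns an empty result on systems without /proc.

```python
import os

def list_namespaces(pid="self"):
    """Map namespace type to its identifier, e.g. 'pid' -> 'pid:[4026531836]'."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    if not os.path.isdir(ns_dir):      # not Linux, or no such process
        return result
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        return result                  # directory not readable
    for name in names:
        try:
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue                   # entry not readable; skip it
    return result

for name, ident in list_namespaces().items():
    print(f"{name:20} {ident}")
```

Two processes share a namespace when the corresponding links show the same identifier.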

 

 


There are six namespaces currently in use by LXC:

MNT (mount) namespaces: specified by the CLONE_NEWNS flag

UTS (Unix timesharing) namespaces: specified by the CLONE_NEWUTS flag

IPC (interprocess communication) namespaces: specified by the CLONE_NEWIPC flag

PID namespaces: specified by the CLONE_NEWPID flag

USER namespaces: specified by the CLONE_NEWUSER flag

NETWORK namespaces: specified by the CLONE_NEWNET flag
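Each flag is a single bit, so several namespaces can be requested in one clone() or unshare() call by OR-ing the flags together. The Python sketch below only computes the bit mask; the numeric values are the ones defined in the Linux header <linux/sched.h>, while the dictionary and the `combine` helper are names invented for this illustration.

```python
# Flag values as defined in <linux/sched.h>.
CLONE_FLAGS = {
    "mnt":  0x00020000,  # CLONE_NEWNS
    "uts":  0x04000000,  # CLONE_NEWUTS
    "ipc":  0x08000000,  # CLONE_NEWIPC
    "user": 0x10000000,  # CLONE_NEWUSER
    "pid":  0x20000000,  # CLONE_NEWPID
    "net":  0x40000000,  # CLONE_NEWNET
}

def combine(names):
    """OR together the flags for the given namespace names."""
    mask = 0
    for name in names:
        mask |= CLONE_FLAGS[name]
    return mask

print(hex(combine(["mnt", "pid"])))   # 0x20020000
print(hex(combine(CLONE_FLAGS)))      # 0x7c020000 (all six namespaces)
```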

 


1. Mount Namespaces:

Mount namespaces control mount points: they provide a separate view of the filesystem mount points for a process and its children.

Without a separate mount namespace, mounting or unmounting a filesystem is noticed by all processes, because they all share the same default namespace.

When the CLONE_NEWNS flag is passed to the clone() system call, the new process gets a copy of the calling process's mount tree, which it can then change without affecting the parent process.

From then on, all mounts and unmounts in the default namespace will be visible in the new namespace, but changes in the per-process mount namespace will not be noticed outside of it. The exact behaviour depends on the mount propagation settings.


2. UTS Namespaces :

Unix Timesharing (UTS) namespaces provide isolation for the hostname and domain name, so that each LXC container can maintain its own identifier as returned by the hostname -f command. This is needed for most applications that rely on a properly set hostname.

 

3. IPC Namespaces :

The Interprocess Communication (IPC) namespaces provide isolation for a set of IPC and synchronization facilities. These facilities provide a way of exchanging data and synchronizing the actions between threads and processes. The isolated objects are the System V IPC objects (message queues, semaphore sets and shared memory segments) and POSIX message queues, which must be kept apart to have true process separation in a container.

 

4. PID Namespaces :

The PID namespace provides processes with a set of process IDs (PIDs) that is independent from other namespaces. PID namespaces are nested, meaning that when a new process is created it will have a PID in each namespace from its current namespace up to the initial PID namespace. Hence the initial PID namespace is able to see all processes, although with different PIDs than the ones seen inside the nested namespaces.
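The nesting can be observed in /proc/<pid>/status: on Linux 4.1 and later, the NSpid line lists the PID of a process in every PID namespace it belongs to, outermost first. The Python sketch below parses that line from a status text. The function name `parse_nspid` and the sample values are made up for this example.

```python
def parse_nspid(status_text):
    """Return the PIDs from the NSpid line, outermost namespace first."""
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []   # older kernels do not report NSpid

# Made-up sample: PID 12345 on the host, PID 1 inside the container.
sample = "Name:\tbash\nPid:\t12345\nNSpid:\t12345\t1\n"
print(parse_nspid(sample))   # [12345, 1]
```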

 

 

 

 


5. USER Namespaces :

The user namespaces allow a process inside a namespace to have a different user and group ID than it has in the default namespace. In the context of LXC, this allows a process to run as root inside the container while having a non-privileged ID outside. This adds a thin layer of security, because breaking out of the container will result in a non-privileged user.
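The mapping between inside and outside IDs is exposed in /proc/<pid>/uid_map (and gid_map), where each line holds three numbers: the first ID inside the namespace, the first ID outside, and the length of the range. The Python sketch below translates an inside ID to the outside ID. The helper names and the sample range starting at 100000 are assumptions for illustration.

```python
def parse_id_map(map_text):
    """Parse uid_map/gid_map text into (inside, outside, length) tuples."""
    entries = []
    for line in map_text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            inside, outside, length = (int(f) for f in fields)
            entries.append((inside, outside, length))
    return entries

def outside_id(entries, inside_id):
    """Translate an ID seen inside the namespace to the ID seen outside."""
    for inside, outside, length in entries:
        if inside <= inside_id < inside + length:
            return outside + (inside_id - inside)
    return None   # the ID is not mapped

uid_map = parse_id_map("         0     100000      65536\n")
print(outside_id(uid_map, 0))      # 100000: root in the container, unprivileged outside
print(outside_id(uid_map, 1000))   # 101000
```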

6. NET Namespaces :

Network namespaces provide isolation of the networking resources, such as network devices, addresses, routes, and firewall rules. This effectively creates a logical copy of the network stack, allowing multiple processes to listen on the same port from multiple namespaces. This is the foundation of networking in LXC and there are quite a lot of other use cases where this can come in handy.

 

 

 

 

Control groups

cgroups is a Linux kernel feature that limits, accounts for, and isolates the resource usage of a collection of processes.

 

The main difference between cgroups and the normal process tree is that many different hierarchies of control groups may exist simultaneously, while there is always a single process tree.

This is not accidental: each control group hierarchy is attached to a set of control group subsystems.

In the context of LXC, cgroups are quite important because they make it possible to assign limits to how much memory, CPU time, or I/O any given container can use.
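The hierarchies a process belongs to are listed in /proc/<pid>/cgroup, one line per hierarchy in the form hierarchy-ID:controller-list:cgroup-path. The Python sketch below parses that format. The function name `parse_proc_cgroup` and the sample paths such as /lxc/web1 are hypothetical.

```python
def parse_proc_cgroup(text):
    """Parse /proc/<pid>/cgroup text into one dict per hierarchy."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        hierarchy_id, controllers, path = line.split(":", 2)
        entries.append({
            "hierarchy": int(hierarchy_id),
            # An empty controller list marks the unified (cgroup v2) hierarchy.
            "controllers": controllers.split(",") if controllers else [],
            "path": path,
        })
    return entries

# Made-up sample with two v1 hierarchies and the v2 hierarchy.
sample = "4:memory:/lxc/web1\n3:cpu,cpuacct:/lxc/web1\n0::/user.slice\n"
for entry in parse_proc_cgroup(sample):
    print(entry)
```

The sample shows one process sitting in several hierarchies at once, each attached to its own set of subsystems.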


The Linux kernel provides support for the following twelve control group subsystems:

  • cpuset - assigns individual processor(s) and memory nodes to the task(s) in a group;
  • cpu - uses the scheduler to give the tasks in a group access to processor resources;
  • cpuacct - generates reports about processor usage by a group;
  • io - sets limits on reads from and writes to block devices;
  • memory - sets limits on memory usage by the task(s) in a group;
  • devices - allows or denies access to devices by the task(s) in a group;
  • freezer - allows suspending and resuming the task(s) in a group;
  • net_cls - allows marking network packets from the task(s) in a group;
  • net_prio - provides a way to dynamically set the priority of network traffic per network interface for a group;
  • perf_event - provides access to perf events for a group;
  • hugetlb - activates support for huge pages for a group;
  • pids - sets a limit on the number of processes in a group.
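Which of these subsystems a running kernel actually offers can be read from /proc/cgroups, a table with the columns subsys_name, hierarchy, num_cgroups and enabled. The Python sketch below extracts the enabled ones from such a table. The function name `enabled_controllers` and the sample rows are made up for this example.

```python
def enabled_controllers(proc_cgroups_text):
    """Return the names of the subsystems marked as enabled in /proc/cgroups text."""
    names = []
    for line in proc_cgroups_text.splitlines():
        if line.startswith("#") or not line.strip():
            continue                      # skip the header and blank lines
        fields = line.split()
        if len(fields) != 4:
            continue                      # unexpected row; ignore it
        name, _hierarchy, _num_cgroups, enabled = fields
        if enabled == "1":
            names.append(name)
    return names

sample = (
    "#subsys_name\thierarchy\tnum_cgroups\tenabled\n"
    "cpuset\t2\t4\t1\n"
    "memory\t5\t120\t1\n"
    "hugetlb\t0\t1\t0\n"
)
print(enabled_controllers(sample))   # ['cpuset', 'memory']
```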

Persistent Storage for Containers

Why Containers Need Persistent Storage?

 

1. Containers are ephemeral

2. Local storage isn’t enough

3. Storage adapters to the rescue

4. Don’t lose track of metadata

 


Type of Container Storage

 

1. Storage For Containers

2. Storage In Containers

Demo

1.  Linux Containers

 

2.  Container Native Storage

 

 


1.  Lxc

 

lxc-ls           # List existing containers

# Note: all commands take -n <name> as a parameter to specify the container
lxc-start        # Start and attach
lxc-start -d     # Start in background
lxc-console      # Attach to running container
lxc-stop         # Stop a running container

lxc-clone <source> <target>                 # Copy an existing container
lxc-create -t <template> -f <config file>   # Create a container from a template and a config file
lxc-destroy                                 # Destroy a container

lxc-execute -n <name> -- <command>  # Run command in new container
lxc-attach  -n <name> -- <command>  # Run command in running container

lxc-monitor    # Monitor containers for state changes
lxc-info       # Give details on a container


 


2.  Lxd

 

lxc image list images: | less            # List all images

lxc init [repository:][imagename]        # Create a container but do not start it

lxc start [remote:][name]                # Start a stopped container

lxc stop [remote:][name]                 # Stop a container

lxc restart [remote:][name]              # Restart a container

lxc launch images:centos/7/amd64 centos-new        # Create and start a container from an image

lxc exec <name> bash                               # Run bash inside a running container

lxc snapshot <cont_name> snap1                     # Take a snapshot named snap1

lxc restore centos-new snap1                       # Restore the container from snapshot snap1

lxc publish <cont_name> --alias myimage --force    # Publish the container as an image

 

 

   

 

 


3.  rootfs

Creating namespaces with unshare

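# Start a shell in a new PID namespace with its own /proc, chrooted into rootfs
$ sudo unshare -p -f --mount-proc=$PWD/rootfs/proc chroot rootfs /bin/bash
# On the host: make /sys and /dev available inside rootfs
$ sudo mount -t sysfs sys rootfs/sys
$ sudo mount -o bind /dev rootfs/dev
# As above, but also with new IPC, mount, network, UTS and user namespaces
$ sudo unshare -i -m -n -u -U -r -p -f --mount-proc=$PWD/rootfs/proc chroot rootfs /bin/bash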

 

 

Entering namespaces with nsenter

$ sudo nsenter --pid=/proc/<pid>/ns/pid \
    unshare -f --mount-proc=$PWD/rootfs/proc \
    chroot rootfs /bin/bash

 

 

 

Thank you !