How to create a container host overlay network with Open vSwitch [Linux]

In the past few weeks I’ve evaluated a container-based approach to hosting all my web and internet services, such as my metalhead.club Mastodon instance and the WordPress blogs of relatives and friends. This container-based setup brings not only more abstraction but also new challenges. My LXD-based setup uses many Debian LXC containers, which need to be maintained and monitored. Every container should get a unique IP address in a virtual overlay network that spans all container host machines. Management tools like Ansible can then communicate directly with each container, no matter which host it resides on. The aim of this blog post is to present my networking solution based on Open vSwitch.

The architecture

To get a better overview of the target network architecture, I’ve created a figure which shows my idea of a management network:

Container overlay network figure

The networking bridge “cont-mgmnt0” with its own subnet should behave as if there were just a single host. The hosts and their physical network connection should be transparent to the container management bridge (and the containers attached to it). The key term for an architecture like this is “software-defined networking (SDN)”. Open vSwitch provides a virtual, host-transparent network switch (ovsbr0), to which the cont-mgmnt0 bridge and the host-mgmnt0 interfaces are attached. In addition, Open vSwitch makes sure that …

  • The connection between both “switch parts” is tunneled via GRE
  • The connection between both hosts is encrypted using IPsec
  • The cont-mgmnt0 network and the host-mgmnt0 network are virtually separated via VLAN

(Note: In OVS terminology a “virtual ethernet switch” is a “bridge”, thus our virtual switch is named “ovsbr0”)

In this guide I’m using LXD as a container management engine. Of course, this setup can also be used with other technologies like Docker.

1) Create a container networking bridge on each host

The first step is to make sure that containers on the same host can see each other via a dedicated management network. So on each host a new network bridge “cont-mgmnt0” is created:

apt install bridge-utils
brctl addbr cont-mgmnt0
ip link set dev cont-mgmnt0 up

You can make the bridge permanent like this:

/etc/network/interfaces:

# Container bridge for OVS
auto cont-mgmnt0
iface cont-mgmnt0 inet6 manual
    bridge_ports none
    bridge_stp on

The new bridge can now be attached to each of the containers on the corresponding host (not relevant if LXD is not used). For example, let’s attach the container named “wallabag” to the bridge and give it the static IP address “10.8.2.2”:

lxc config device add wallabag eth1 nic nictype=bridged parent=cont-mgmnt0 name=eth1
lxc exec wallabag -- ip addr add 10.8.2.2/24 dev eth1
lxc exec wallabag -- ip link set dev eth1 up

(The cont-mgmnt0 interface will be “eth1” inside the container.)

Repeat these steps for every host and every container. Containers on the same host should now be able to ping each other. Pings across hosts will not work yet.
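
Repeating the three lxc commands by hand gets tedious with many containers. The following is a minimal sketch of how they could be generated in a loop. The helper name `attach_cmds`, the `CONTAINERS` list and the second container (“mastodon” with 10.8.2.4) are assumptions made up for this example, not part of my actual setup. The script only prints the commands so they can be reviewed first:

```shell
#!/bin/sh
# Hypothetical name=address pairs; replace them with your own containers.
CONTAINERS="wallabag=10.8.2.2 mastodon=10.8.2.4"

# Print the three lxc commands from step 1 for one container.
#   $1 = container name, $2 = IPv4 address in the 10.8.2.0/24 network
attach_cmds() {
    name=$1
    addr=$2
    echo "lxc config device add $name eth1 nic nictype=bridged parent=cont-mgmnt0 name=eth1"
    echo "lxc exec $name -- ip addr add $addr/24 dev eth1"
    echo "lxc exec $name -- ip link set dev eth1 up"
}

# Split each pair at the "=" and emit the commands for it.
for pair in $CONTAINERS; do
    attach_cmds "${pair%%=*}" "${pair#*=}"
done
```

Once the printed commands look right, the output can be piped to `sh` on the host that runs the listed containers.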

2) Create virtual ethernet switches on hosts

To connect the hosts and provide a host-independent Ethernet switch, let’s utilize Open vSwitch. A new virtual switch (“ovsbr0”) is created:

apt-get install openvswitch-switch openvswitch-ipsec
ovs-vsctl add-br ovsbr0

The Ethernet MTU of the switch must be reduced to leave room for the tunneling, IPsec and VLAN overhead (see next steps):

ovs-vsctl set int ovsbr0 mtu_request=1416
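
To get a feeling for where such a value comes from, the per-packet overhead can be added up in the shell. This is only a rough sketch: `tunnel_mtu` is a hypothetical helper, the fixed header sizes are the usual minimums (IPv4 without options, GRE without key or checksum), and the ESP overhead is an assumed input, because it depends on the cipher that IPsec negotiates.

```shell
#!/bin/sh
# Estimate the largest inner MTU that still fits into one outer packet.
#   $1 = MTU of the physical link between the hosts
#   $2 = assumed ESP overhead in bytes (cipher dependent, NOT measured)
tunnel_mtu() {
    phys_mtu=$1
    esp_overhead=$2
    outer_ip=20    # outer IPv4 header without options
    gre=4          # basic GRE header without key or checksum
    inner_eth=14   # Ethernet header of the tunneled frame
    vlan=4         # 802.1Q tag carried inside the tunnel
    echo $((phys_mtu - outer_ip - esp_overhead - gre - inner_eth - vlan))
}

tunnel_mtu 1500 37   # prints 1421
tunnel_mtu 1500 42   # prints 1416
```

In other words, 1416 leaves 84 bytes of headroom below a 1500-byte physical MTU. If your IPsec setup adds more ESP overhead than assumed here, a lower value may be needed.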

3) Attach management bridge to virtual switch and connect switches

The cont-mgmnt0 bridge on each host gets connected to the virtual switch. A VLAN tag “200” is applied to separate the container network from other networks (e.g. the host-mgmnt0 network). On both hosts:

ovs-vsctl add-port ovsbr0 cont-mgmnt0 tag=200

In the next step both host switches are connected to each other:

# Host 1:
ovs-vsctl add-port ovsbr0 gre0 -- set interface gre0 type=gre options:remote_ip=1.1.1.1 options:psk=mykey

# Host 2:
ovs-vsctl add-port ovsbr0 gre0 -- set interface gre0 type=gre options:remote_ip=2.2.2.2 options:psk=mykey

“mykey” is the passphrase used to establish a secured IPsec connection. The IP addresses correspond to those of the other host.

4) Create a host management network [optional]

If you would like to connect your hosts on a separate network, create another network interface “host-mgmnt0” via the following command (on both hosts):

ovs-vsctl add-port ovsbr0 host-mgmnt0 -- set interface host-mgmnt0 type=internal -- set port host-mgmnt0 tag=100

A new network interface “host-mgmnt0” will be available on both of your hosts. This network uses the same GRE/IPsec tunnel and switch provided by Open vSwitch, but it is separated from the container network via VLAN (see tag “100”). To be able to use the network interface, assign IP addresses to it on both hosts:

# Host 1:
ip addr add 10.8.3.1/24 dev host-mgmnt0
ip link set dev host-mgmnt0 up

# Host 2:
ip addr add 10.8.3.2/24 dev host-mgmnt0
ip link set dev host-mgmnt0 up
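
Since both hosts run the same commands and differ only in the peer address and their own management address, steps 2 to 4 can be collected in one place. The sketch below is one possible way to do that. `host_setup_cmds` and its three arguments are hypothetical names chosen for this example, and the function only prints the commands shown in this guide instead of executing them:

```shell
#!/bin/sh
# Print the commands from steps 2 to 4 for one host.
#   $1 = public IP address of the *other* host (GRE remote_ip)
#   $2 = this host's address on host-mgmnt0, e.g. 10.8.3.1
#   $3 = passphrase for the IPsec connection
host_setup_cmds() {
    peer_ip=$1
    mgmt_addr=$2
    psk=$3
    cat <<EOF
ovs-vsctl add-br ovsbr0
ovs-vsctl set int ovsbr0 mtu_request=1416
ovs-vsctl add-port ovsbr0 cont-mgmnt0 tag=200
ovs-vsctl add-port ovsbr0 gre0 -- set interface gre0 type=gre options:remote_ip=$peer_ip options:psk=$psk
ovs-vsctl add-port ovsbr0 host-mgmnt0 -- set interface host-mgmnt0 type=internal -- set port host-mgmnt0 tag=100
ip addr add $mgmt_addr/24 dev host-mgmnt0
ip link set dev host-mgmnt0 up
EOF
}

# Host 1 from the examples above: its peer is reachable at 1.1.1.1.
host_setup_cmds 1.1.1.1 10.8.3.1 mykey
```

For Host 2 the call would be `host_setup_cmds 2.2.2.2 10.8.3.2 mykey`. The cont-mgmnt0 bridge from step 1 must already exist before the printed commands are run.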

The result

Every container is now able to connect to any other container on the same cont-mgmnt0 network - no matter on which host the containers run:

# From 10.8.2.2 on Host 1 to 10.8.2.3 on Host 2
ping 10.8.2.3

If set up, hosts can ping each other on their own network host-mgmnt0:

# Host 1: 
ping 10.8.3.2

# Host 2:
ping 10.8.3.1
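
A plain ping only sends small packets, so it does not show whether the reduced MTU from step 2 really fits through the tunnel. As a hedged sketch, the following helpers (both names are made up for this example) build a ping command with fragmentation forbidden and a payload sized so that the resulting IPv4 packet matches a given MTU:

```shell
#!/bin/sh
# ICMP payload size that yields an IPv4 packet of exactly $1 bytes:
# subtract the 20-byte IPv4 header and the 8-byte ICMP header.
icmp_payload() {
    echo $(($1 - 20 - 8))
}

# Print a ping command that forbids fragmentation (-M do), so an
# oversized packet fails visibly instead of being split silently.
#   $1 = target address, $2 = packet size to probe
mtu_probe_cmd() {
    target=$1
    mtu=$2
    echo "ping -c 3 -M do -s $(icmp_payload "$mtu") $target"
}

# Probe the container on Host 2 with the MTU chosen in step 2.
mtu_probe_cmd 10.8.2.3 1416
```

Run the printed command inside a container on Host 1. If it gets replies, packets of that size pass the tunnel in one piece; if it fails while a normal ping works, the MTU is probably still too high for your setup.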