***Resources*** In-depth view of Docker Networking

Network Namespaces

The following commands create 4 network namespaces and veth pairs

for i in {1..4}; do
    sudo ip netns add ns$i
    sudo ip link add veth$i type veth peer name pveth$i
    sudo ip link set veth$i netns ns$i
    sudo ip netns exec ns$i ip link set dev veth$i up
    sudo ip link set dev pveth$i up
done
flag explanation
for i in {1..4}: iterates over the numbers 1 to 4
ip netns add ns$i: creates a network namespace for each iteration (ns1 to ns4)
ip link add veth$i type veth peer name pveth$i: creates a veth pair named veth$i and pveth$i
ip link set veth$i netns ns$i: moves the veth$i interface into the network namespace ns$i
ip netns exec ns$i ip link set dev veth$i up: brings the veth$i interface up inside the network namespace ns$i
ip link set dev pveth$i up: brings the pveth$i interface up in the default network namespace

The following commands enable IP forwarding for IPv4 and IPv6

sysctl -w net.ipv4.ip_forward=1
sysctl -w net.ipv6.conf.all.forwarding=1
flag explanation
1 enables IP forwarding
0 disables IP forwarding

The following command creates a virtual switch (a Linux bridge) called br1.

sudo ip link add dev br1 type bridge
flag explanation
dev br1 specifies the name of the interface
type bridge specifies the type of the interface

The following command enables the virtual switch.

sudo ip link set dev br1 up
flag explanation
dev br1 specifies the name of the interface

The following command connects the pveth1 interface to the br1 virtual switch.

sudo ip link set dev pveth1 master br1
flag explanation
dev pveth1 specifies the name of the interface
master br1 specifies the name of the master interface

Iptables

The following command sets up masquerading for the 192.168.1.0/24 network.

iptables -t nat -A POSTROUTING -s 192.168.1.0/24 ! -d 192.168.1.0/24 -j MASQUERADE
flag explanation
-t nat specifies the NAT table
-A POSTROUTING appends the rule to the POSTROUTING chain in the NAT table
-s 192.168.1.0/24 matches source addresses in the 192.168.1.0/24 network
! -d 192.168.1.0/24 matches destination addresses that are not in the 192.168.1.0/24 network
-j MASQUERADE specifies that matching packets should be masqueraded
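
To make the match condition concrete, here is a minimal Python sketch of the two address tests in the rule above. It only models the -s and ! -d checks with the standard ipaddress module and does not touch iptables. The sample packet addresses are hypothetical.

```python
import ipaddress

# The network from the rule above: -s 192.168.1.0/24 ! -d 192.168.1.0/24
LOCAL_NET = ipaddress.ip_network("192.168.1.0/24")


def matches_masquerade_rule(src: str, dst: str) -> bool:
    """Return True when a packet would pass the rule's -s and ! -d tests."""
    src_in_net = ipaddress.ip_address(src) in LOCAL_NET
    dst_in_net = ipaddress.ip_address(dst) in LOCAL_NET
    return src_in_net and not dst_in_net


# Hypothetical sample packets (source, destination)
print(matches_masquerade_rule("192.168.1.10", "8.8.8.8"))       # leaves the subnet
print(matches_masquerade_rule("192.168.1.10", "192.168.1.20"))  # stays inside the subnet
print(matches_masquerade_rule("10.0.0.5", "8.8.8.8"))           # source outside the subnet
```

Only the first packet is a candidate for masquerading: traffic that stays inside 192.168.1.0/24 keeps its original source address.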

The following command allows incoming SSH connections to the host

iptables -A INPUT -i br1 -p tcp --dport 22 -j ACCEPT
flag explanation
-A INPUT appends the rule to the INPUT chain
-i br1 matches packets arriving at the br1 interface
-p tcp matches TCP packets
--dport 22 matches destination port 22
-j ACCEPT specifies that matching packets should be accepted

The following command allows incoming ICMP requests (ping) to the host

iptables -A INPUT -i br1 -p icmp --icmp-type echo-request -j ACCEPT
flag explanation
-A INPUT appends the rule to the INPUT chain
-i br1 matches packets arriving at the br1 interface
-p icmp matches ICMP packets
--icmp-type echo-request matches ICMP echo requests (ping)
-j ACCEPT specifies that matching packets should be accepted

The following command drops all other incoming packets to the host

iptables -A INPUT -i br1 -j DROP
flag explanation
-A INPUT appends the rule to the INPUT chain
-i br1 matches packets arriving at the br1 interface
-j DROP specifies that matching packets should be dropped
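
The order of the three INPUT rules matters, because a chain is evaluated top to bottom and the first matching rule decides the packet's fate. The following Python sketch models that first-match behaviour for the rules in this section. The Packet and Rule classes are hypothetical helpers, not an iptables API, and the fallback policy of ACCEPT is an assumption (it is the usual default for the INPUT chain unless it has been changed).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    in_iface: str
    proto: str
    dport: Optional[int] = None      # TCP/UDP destination port
    icmp_type: Optional[str] = None  # e.g. "echo-request"


@dataclass
class Rule:
    target: str
    in_iface: Optional[str] = None
    proto: Optional[str] = None
    dport: Optional[int] = None
    icmp_type: Optional[str] = None

    def matches(self, pkt: Packet) -> bool:
        # A criterion left as None is treated as "match anything".
        return all(
            want is None or want == got
            for want, got in [
                (self.in_iface, pkt.in_iface),
                (self.proto, pkt.proto),
                (self.dport, pkt.dport),
                (self.icmp_type, pkt.icmp_type),
            ]
        )


# The three INPUT rules from this section, in the order they were appended.
INPUT_CHAIN = [
    Rule("ACCEPT", in_iface="br1", proto="tcp", dport=22),
    Rule("ACCEPT", in_iface="br1", proto="icmp", icmp_type="echo-request"),
    Rule("DROP", in_iface="br1"),
]


def verdict(pkt: Packet, chain=INPUT_CHAIN, policy: str = "ACCEPT") -> str:
    """Return the target of the first matching rule, else the chain policy."""
    for rule in chain:
        if rule.matches(pkt):
            return rule.target
    return policy


print(verdict(Packet("br1", "tcp", dport=22)))  # SSH arriving on br1
print(verdict(Packet("br1", "tcp", dport=80)))  # any other traffic on br1
```

If the DROP rule were appended first, SSH and ping on br1 would be dropped as well, which is why the catch-all rule goes last.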

Docker Networking

For Docker containers to communicate with each other and the outside world via the host machine, there has to be a layer of networking involved. Docker supports different types of networks, each fit for certain use cases.

What are the different types of networking in Docker

Docker comes with network drivers geared towards different use cases. Docker’s networking subsystem is pluggable, using drivers.

What is docker0 in terms of Docker Networking

When Docker is installed, a default bridge network is created. The network itself is named bridge and is backed by a Linux bridge interface called docker0. Each new Docker container is automatically attached to this network unless a custom network is specified.

Besides docker0, Docker automatically creates two other networks: host (no isolation between the host and the containers on this network; to the outside world they are on the same network) and none (attached containers run on a container-specific network stack).

  1. Host networks

With the host network driver, a container’s network stack is not isolated from the Docker host, and the container uses the host’s networking directly.
For swarm services, host networking is available only on Docker 17.06 and higher.
The host networking driver only works on Linux hosts, and is not supported on Docker for Mac, Docker for Windows, or Docker EE for Windows Server.

  2. Bridge networks

The default network driver. If you don’t specify a driver, this is the type of network you are creating. Bridge networks are usually used when your applications run in standalone containers that need to communicate. A bridge network uses a software bridge which allows containers connected to the same bridge network to communicate, while providing isolation from containers which are not connected to that bridge network.

  3. Macvlan networks

Legacy applications expect to be directly connected to the physical network, rather than routed through the Docker host’s network stack. Macvlan networks assign a MAC address to a container, making it appear as a physical device on your network. The Docker daemon routes traffic to containers by their MAC addresses. We need to designate a physical interface on our Docker host to use for the Macvlan, as well as the subnet and gateway of the Macvlan.

  4. None networks

    This mode does not configure any IP address for the container, which has no access to the external network or to other containers. The container still has a loopback address, and the mode can be used for running batch jobs.

  5. Overlay networks

    Suppose you have multiple Docker hosts running containers. Each host has its own internal private bridge network that lets its containers communicate with each other, but containers on different hosts have no way to communicate unless you publish ports on those containers and set up some kind of routing yourself. This is where the overlay network comes into play. With Docker Swarm you can create an overlay network, an internal private network that spans all the nodes participating in the swarm. You can attach a container or service to it with the network option when creating the service, so containers on different nodes can communicate over the overlay network.

Introduction to MacVLAN

The macvlan driver is the newest built-in network driver and offers several unique characteristics. It’s a very lightweight driver, because rather than using any Linux bridging or port mapping, it connects container interfaces directly to host interfaces. Containers are addressed with routable IP addresses that are on the subnet of the external network.

As a result of routable IP addresses, containers communicate directly with resources that exist outside a Swarm cluster without the use of NAT and port mapping. This can aid in network visibility and troubleshooting. Additionally, the direct traffic path between containers and the host interface helps reduce latency. macvlan is a local scope network driver which is configured per-host. As a result, there are stricter dependencies between MACVLAN and external networks, which is both a constraint and an advantage that is different from overlay or bridge.

The macvlan driver uses the concept of a parent interface. This interface can be a host interface such as eth0, a sub-interface, or even a bonded host adaptor which bundles Ethernet interfaces into a single logical interface. A gateway address from the external network is required during MACVLAN network configuration, as a MACVLAN network is a L2 segment from the container to the network gateway. Like all Docker networks, MACVLAN networks are segmented from each other – providing access within a network, but not between networks.

The macvlan driver can be configured in different ways to achieve different results. In the below example we create two MACVLAN networks joined to different subinterfaces. This type of configuration can be used to extend multiple L2 VLANs through the host interface directly to containers. The VLAN default gateway exists in the external network.

The db and web containers are connected to different MACVLAN networks in this example. Each container resides on its respective external network with an external IP provided from that network. Using this design an operator can control network policy outside of the host and segment containers at L2. The containers could have also been placed in the same VLAN by configuring them on the same MACVLAN network. This just shows the amount of flexibility offered by each network driver.

Bridge networks

A bridge network connects two networks, creating a single aggregate network from multiple communication networks or network segments, hence the name bridge.

The bridge driver creates a private network internal to the host so containers on this network can communicate. External access is granted by exposing ports to containers. Docker secures the network by managing rules that block connectivity between different Docker networks.

Behind the scenes, the Docker Engine creates the necessary Linux bridges, internal interfaces, iptables rules, and host routes to make this connectivity possible. In the example highlighted below, a Docker bridge network is created and two containers are attached to it. With no extra configuration the Docker Engine does the necessary wiring, provides service discovery for the containers, and configures security rules to prevent communication to other networks. A built-in IPAM driver provides the container interfaces with private IP addresses from the subnet of the bridge network.

The above application is now being served on our host at port 8000. The Docker bridge is allowing web to communicate with db by its container name. The bridge driver does the service discovery for us automatically because they are on the same network. All of the port mappings, security rules, and pipework between Linux bridges are handled for us by the networking driver as containers are scheduled and rescheduled across a cluster.

The bridge driver is a local scope driver, which means it only provides service discovery, IPAM, and connectivity on a single host. Multi-host service discovery requires an external solution that can map containers to their host location. This is exactly what makes the overlay driver so great.

Overlay networks

Overlay networks are usually used to create a virtual network between two separate hosts. It is virtual because the network is built over an existing network.

The built-in Docker overlay network driver radically simplifies many of the complexities in multi-host networking. It is a swarm scope driver, which means that it operates across an entire Swarm or UCP cluster rather than individual hosts. With the overlay driver, multi-host networks are first-class citizens inside Docker without external provisioning or components. IPAM, service discovery, multi-host connectivity, encryption, and load balancing are built right in. For control, the overlay driver uses the encrypted Swarm control plane to manage large scale clusters at low convergence times.

The overlay driver utilizes an industry-standard VXLAN data plane that decouples the container network from the underlying physical network (the underlay). This has the advantage of providing maximum portability across various cloud and on-premises networks. Network policy, visibility, and security are controlled centrally through the Docker Universal Control Plane (UCP).

In the above example we are still serving our web app on port 8000 but now we have deployed our application across different hosts. If we wanted to scale our web containers, Swarm & UCP networking would load balance the traffic for us automatically.

The overlay driver is a feature-rich driver that handles much of the complexity and integration that organizations struggle with when crafting piecemeal solutions. It provides an out-of-the-box solution for many networking challenges and does so at scale.

The Container Networking Model

The Docker networking architecture is built on a set of interfaces called the Container Networking Model (CNM). The philosophy of CNM is to provide application portability across diverse infrastructures. This model strikes a balance to achieve application portability and also takes advantage of special features and capabilities of the infrastructure.

CNM Constructs

There are several high-level constructs in the CNM. They are all OS and infrastructure agnostic so that applications can have a uniform experience no matter the infrastructure stack.

  • Sandbox — A Sandbox contains the configuration of a container’s network stack. This includes management of the container’s interfaces, routing table, and DNS settings. An implementation of a Sandbox could be a Linux Network Namespace, a FreeBSD Jail, or other similar concept. A Sandbox may contain many endpoints from multiple networks.
  • Endpoint — An Endpoint joins a Sandbox to a Network. The Endpoint construct exists so the actual connection to the network can be abstracted away from the application. This helps maintain portability so that a service can use different types of network drivers without being concerned with how it’s connected to that network.
  • Network — The CNM does not specify a Network in terms of the OSI model. An implementation of a Network could be a Linux bridge, a VLAN, etc. A Network is a collection of endpoints that have connectivity between them. Endpoints that are not connected to a network will not have connectivity on a Network.
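
The relationship between the three constructs can be sketched as a small data model. This is only an illustration of the definitions above, not Docker's implementation: the class names mirror the CNM terms, while the connect and networks_of helpers and the sample names web, frontend and backend are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Network:
    name: str
    endpoints: List["Endpoint"] = field(default_factory=list)


@dataclass(eq=False)
class Sandbox:
    container: str
    endpoints: List["Endpoint"] = field(default_factory=list)


@dataclass(eq=False)
class Endpoint:
    sandbox: Sandbox
    network: Network


def connect(sandbox: Sandbox, network: Network) -> Endpoint:
    """Create an Endpoint that joins a Sandbox to a Network."""
    endpoint = Endpoint(sandbox, network)
    sandbox.endpoints.append(endpoint)
    network.endpoints.append(endpoint)
    return endpoint


def networks_of(sandbox: Sandbox) -> List[str]:
    """Names of the networks a sandbox is joined to, in attachment order."""
    return [ep.network.name for ep in sandbox.endpoints]


# Hypothetical example: one sandbox with endpoints on two networks.
frontend = Network("frontend")
backend = Network("backend")
web = Sandbox("web")
connect(web, frontend)
connect(web, backend)
print(networks_of(web))
```

The sandbox never refers to a driver directly. It only holds endpoints, which is what lets the same container be attached to networks of different types.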

Tutorial Application: The Pets App

In the following example, we will use a fictional app called Pets to illustrate the Network Deployment Models. It serves up images of pets on a web page while counting the number of hits to the page in a backend database. It is configurable via two environment variables, DB and ROLE.

  • DB specifies the hostname:port or IP:port of the db container for the web front end to use.
  • ROLE specifies the “tenant” of the application and whether it serves pictures of dogs or cats.

It consists of web, a Python Flask container, and db, a redis container. Its architecture and required network policy are described below.

We will run this application on different network deployment models to show how we can instantiate connectivity and network policy. Each deployment model exhibits different characteristics that may be advantageous to your application and environment.

We will explore the following network deployment models in this section:

  • Bridge Driver
  • Overlay Driver
  • MACVLAN Bridge Mode Driver

Tutorial App: Bridge Driver

This model is the default behavior of the built-in Docker bridge network driver. The bridge driver creates a private network internal to the host and provides an external port mapping on a host interface for external connectivity.

#Create a user-defined bridge network for our application
$ docker network create -d bridge catnet

#Instantiate the backend DB on the catnet network
$ docker run -d --net catnet --name cat-db redis

#Instantiate the web frontend on the catnet network and configure it to connect to a container named `cat-db`
$ docker run -d --net catnet -p 8000:5000 -e 'DB=cat-db' -e 'ROLE=cat' chrch/web

When an IP address is not specified, port mapping will be exposed on all interfaces of a host. In this case the container’s application is exposed on 0.0.0.0:8000. We can specify a specific IP address to advertise on only a single IP interface with the flag -p IP:host_port:container_port. More options to expose ports can be found in the Docker docs.
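
As a rough illustration of how the two forms of the -p value differ, the sketch below parses a publish string into its parts. It is a simplified, hypothetical parser written for this explanation, not Docker's own: it assumes IPv4 and plain port numbers, and ignores port ranges and protocol suffixes such as /udp.

```python
from typing import NamedTuple


class PortMapping(NamedTuple):
    host_ip: str
    host_port: int
    container_port: int


def parse_publish(spec: str) -> PortMapping:
    """Parse 'host_port:container_port' or 'IP:host_port:container_port'."""
    parts = spec.split(":")
    if len(parts) == 2:
        host_ip = "0.0.0.0"  # no IP given: listen on all host interfaces
        host_port, container_port = parts
    elif len(parts) == 3:
        host_ip, host_port, container_port = parts
    else:
        raise ValueError(f"unsupported publish spec: {spec!r}")
    return PortMapping(host_ip, int(host_port), int(container_port))


print(parse_publish("8000:5000"))            # the mapping used above
print(parse_publish("127.0.0.1:8000:5000"))  # restricted to one host address
```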

The web container takes some environment variables to determine which backend it needs to connect to. Above we supply it with cat-db which is the name of our redis service. The Docker Engine’s built-in DNS will resolve a container’s name to its location in any user-defined network. Thus, on a network, a container or service can always be referenced by its name.

With the above commands we have deployed our application on a single host. The Docker bridge network provides connectivity and name resolution amongst the containers on the same bridge while exposing our frontend container externally.

$ docker network inspect catnet
[
  {
    "Name": "catnet",
    "Id": "81e45d3e3bf4f989abe87c42c8db63273f9bf30c1f5a593bae4667d5f0e33145",
    "Scope": "local",
    "Driver": "bridge",
    "EnableIPv6": false,
    "IPAM": {
      "Driver": "default",
      "Options": {},
      "Config": [
        {
          "Subnet": "172.19.0.0/16",
          "Gateway": "172.19.0.1/16"
        }
      ]
    },
    "Internal": false,
    "Attachable": false,
    "Containers": {
      "2a23faa18fb33b5d07eb4e0affb5da36449a78eeb196c944a5af3aaffe1ada37": {
        "Name": "backstabbing_pike",
        "EndpointID": "9039dae3218c47739ae327a30c9d9b219159deb1c0a6274c6d994d37baf2f7e3",
        "MacAddress": "02:42:ac:13:00:03",
        "IPv4Address": "172.19.0.3/16",
        "IPv6Address": ""
      },
      "dbf7f59187801e1bcd2b0a7d4731ca5f0a95236dbc4b4157af01697f295d4528": {
        "Name": "cat-db",
        "EndpointID": "7f7c51a0468acd849fd575adeadbcb5310c5987195555620d60ee3ffad37c680",
        "MacAddress": "02:42:ac:13:00:02",
        "IPv4Address": "172.19.0.2/16",
        "IPv6Address": ""
      }
    },
    "Options": {},
    "Labels": {}
  }
]

In this output, we can see that our two containers have automatically been given IP addresses from the 172.19.0.0/16 subnet. This is the subnet of the local catnet bridge, and it will provide all connected containers an IP address from this range unless they are statically configured.
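
The same check can be done programmatically. The sketch below parses a trimmed copy of the inspect output (container IDs shortened and unused fields dropped for brevity) and confirms that each container address falls inside the network's subnet. The container_addresses function is a hypothetical helper built on Python's standard json and ipaddress modules.

```python
import ipaddress
import json

# Trimmed copy of the `docker network inspect catnet` output shown above.
INSPECT_OUTPUT = """
[
  {
    "Name": "catnet",
    "IPAM": {"Config": [{"Subnet": "172.19.0.0/16", "Gateway": "172.19.0.1/16"}]},
    "Containers": {
      "2a23faa18fb3": {"Name": "backstabbing_pike", "IPv4Address": "172.19.0.3/16"},
      "dbf7f5918780": {"Name": "cat-db", "IPv4Address": "172.19.0.2/16"}
    }
  }
]
"""


def container_addresses(inspect_json: str) -> dict:
    """Map each container name to (IP address, whether it is inside the subnet)."""
    network = json.loads(inspect_json)[0]
    subnet = ipaddress.ip_network(network["IPAM"]["Config"][0]["Subnet"])
    result = {}
    for info in network["Containers"].values():
        ip = ipaddress.ip_interface(info["IPv4Address"]).ip
        result[info["Name"]] = (str(ip), ip in subnet)
    return result


for name, (ip, inside) in container_addresses(INSPECT_OUTPUT).items():
    print(name, ip, "in subnet" if inside else "outside subnet")
```

Against a live daemon, the same function could be fed the standard output of docker network inspect catnet instead of the embedded string.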

Tutorial App: Multi-Host Bridge Driver Deployment

Deploying a multi-host application requires some additional configuration so that distributed components can connect with each other. In the following example we explicitly tell the web container the location of redis with the environment variable DB=host-B:8001. Another change is that we map port 6379 inside the redis container to port 8001 on host-B. Without the port mapping, redis would only be accessible on its connected networks (the default bridge in this case).

host-A $ docker run -d -p 8000:5000 -e 'DB=host-B:8001' -e 'ROLE=cat' --name cat-web chrch/web
host-B $ docker run -d -p 8001:6379 redis

In this example we don’t specify a network to use, so each container is attached to the default Docker bridge network, which exists on every host.

When we configure the location of redis at host-B:8001, we are creating a form of service discovery. We are configuring one service to be able to discover another service. In the single host example, this was done automatically because Docker Engine provided built-in DNS resolution for the container names. In this multi-host example we are doing this manually.

  • cat-web makes a request to the redis service at host-B:8001
  • On host-A the host-B hostname is resolved to host-B’s IP address by the infrastructure’s DNS
  • The request from cat-web is masqueraded to use the host-A IP address.
  • Traffic is routed or bridged by the external network to host-B where port 8001 is exposed.
  • Traffic to port 8001 is NATed and routed on host-B to port 6379 on the redis container.
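
The address rewriting in the steps above can be traced with a small model. This is a sketch under stated assumptions: every IP address below is hypothetical, source ports are ignored, and the masquerade and publish_dnat functions are illustrative stand-ins for the NAT rules Docker installs, not real APIs.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    dst_port: int


# Hypothetical addresses, chosen only for illustration.
CAT_WEB_IP = "172.17.0.2"   # cat-web on host-A's default bridge
HOST_A_IP = "203.0.113.10"
HOST_B_IP = "203.0.113.20"
REDIS_IP = "172.17.0.2"     # redis on host-B's own, separate default bridge


def masquerade(pkt: Packet, host_ip: str) -> Packet:
    """Source NAT on the sending host: replace the container source address."""
    return replace(pkt, src_ip=host_ip)


def publish_dnat(pkt: Packet, host_port: int, container_ip: str, container_port: int) -> Packet:
    """Destination NAT on the receiving host for a published port."""
    if pkt.dst_port != host_port:
        return pkt
    return replace(pkt, dst_ip=container_ip, dst_port=container_port)


sent = Packet(CAT_WEB_IP, HOST_B_IP, 8001)               # after DNS resolves host-B
on_wire = masquerade(sent, HOST_A_IP)                    # as it leaves host-A
delivered = publish_dnat(on_wire, 8001, REDIS_IP, 6379)  # as redis receives it
print(on_wire)
print(delivered)
```

Note that both containers can hold the same private address, because each host's default bridge is an independent network. Only the host addresses are visible on the external network.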

The hardcoding of application location is not typically recommended. Service discovery tools exist that provide these mappings dynamically as containers are created and destroyed in a cluster. The overlay driver provides global service discovery across a cluster. External tools such as Consul and etcd also provide service discovery as an external service.

In the overlay driver example we will see that multi-host service discovery is provided out of the box, which is a major advantage of the overlay deployment model.

Bridge Driver Benefits and Use-Cases

  • Very simple architecture promotes easy understanding and troubleshooting
  • Widely deployed in current production environments
  • Simple to deploy in any environment from developer laptops to production data center

Tutorial App: Overlay Driver

This model utilizes the built-in overlay driver to provide multi-host connectivity out of the box. The default settings of the overlay driver will provide external connectivity to the outside world as well as internal connectivity and service discovery within a container application. The Overlay Driver Architecture section reviews the internals of the Overlay driver which you should review before reading this section.

In this example we are re-using the previous Pets application. Prior to this example we already set up a Docker Swarm. For instructions on how to set up a Swarm read the Docker docs. When the Swarm is set up, we can use the docker service create command to create containers and networks that will be managed by the Swarm.

The following shows how to inspect your Swarm, create an overlay network, and then provision some services on that overlay network. All of these commands are run on a UCP/swarm controller node.

#Display the nodes participating in this swarm cluster
$ docker node ls
ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
a8dwuh6gy5898z3yeuvxaetjo    host-B    Ready   Active
elgt0bfuikjrntv3c33hr0752 *  host-A    Ready   Active        Leader

#Create the dognet overlay network
$ docker network create -d overlay --subnet 10.1.0.0/24 --gateway 10.1.0.1 dognet

#Create the backend service and place it on the dognet network
$ docker service create --network dognet --name dog-db redis

#Create the frontend service and expose it on port 8000 externally
$ docker service create --network dognet -p 8000:5000 -e 'DB=dog-db' -e 'ROLE=dog' --name dog-web chrch/web

We pass in DB=dog-db as an environment variable to the web container. The overlay driver will resolve the service name dog-db and load balance it to containers in that service. It is not required to expose the redis container on an external port because the overlay network will resolve and provide connectivity within the network.

Inside overlay and bridge networks, all TCP and UDP ports to containers are open and accessible to all other containers attached to the same network.

The dog-web service is exposed on port 8000, but in this case the routing mesh will expose port 8000 on every host in the Swarm. We can test to see if the application is working by going to <host-A>:8000 or <host-B>:8000 in the browser.

Complex network policies can easily be achieved with overlay networks. In the following configuration, we add the cat tenant to our existing application. This represents two applications that use the same cluster but require network micro-segmentation. We add a second overlay network with a second pair of web and redis containers. We also add an admin container that needs to have access to both tenants.

To accomplish this policy we create a second overlay network, catnet, and attach the new containers to it. We also create the admin service and attach it to both networks.

$ docker network create -d overlay --subnet 10.2.0.0/24 --gateway 10.2.0.1 catnet
$ docker service create --network catnet --name cat-db redis
$ docker service create --network catnet -p 9000:5000 -e 'DB=cat-db' -e 'ROLE=cat' --name cat-web chrch/web
$ docker service create --network dognet --network catnet -p 7000:5000 -e 'DB1=dog-db' -e 'DB2=cat-db' --name admin chrch/admin

This example uses the following logical topology:

  • dog-web and dog-db can communicate with each other, but not with the cat service.
  • cat-web and cat-db can communicate with each other, but not with the dog service.
  • admin is connected to both networks and has reachability to all containers.
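
The topology above follows from one rule: two services can reach each other only if they are attached to a common network. A minimal Python sketch of that rule, using the network attachments from the docker service create commands in this section (the can_communicate and reachable_from helpers are hypothetical names):

```python
# Network attachments taken from the `docker service create` commands above.
ATTACHMENTS = {
    "dog-web": {"dognet"},
    "dog-db": {"dognet"},
    "cat-web": {"catnet"},
    "cat-db": {"catnet"},
    "admin": {"dognet", "catnet"},
}


def can_communicate(a: str, b: str) -> bool:
    """Services can talk directly only if they share at least one network."""
    return bool(ATTACHMENTS[a] & ATTACHMENTS[b])


def reachable_from(service: str) -> list:
    """List every other service that shares a network with `service`."""
    return sorted(
        other
        for other in ATTACHMENTS
        if other != service and can_communicate(service, other)
    )


print(reachable_from("dog-web"))
print(reachable_from("admin"))
```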

Overlay Benefits and Use Cases

  • Very simple multi-host connectivity for small and large deployments
  • Provides service discovery and load balancing with no extra configuration or components
  • Useful for east-west micro-segmentation via encrypted overlays
  • Routing mesh can be used to advertise a service across an entire cluster

Tutorial App: MACVLAN Bridge Mode

There may be cases where the application or network environment requires containers to have routable IP addresses that are a part of the underlay subnets. The MACVLAN driver provides an implementation that makes this possible. As described in the MACVLAN Architecture section, a MACVLAN network binds itself to a host interface. This can be a physical interface, a logical sub-interface, or a bonded logical interface. It acts as a virtual switch and provides communication between containers on the same MACVLAN network. Each container receives a unique MAC address and an IP address of the physical network that the node is attached to.

In this example, the Pets application is deployed on to host-A and host-B.

#Creation of local macvlan network on both hosts
host-A $ docker network create -d macvlan --subnet 192.168.0.0/24 --gateway 192.168.0.1 -o parent=eth0 macvlan
host-B $ docker network create -d macvlan --subnet 192.168.0.0/24 --gateway 192.168.0.1 -o parent=eth0 macvlan

#Creation of web container on host-A
host-A $ docker run -it --net macvlan --ip 192.168.0.4 -e 'DB=dog-db' -e 'ROLE=dog' --name dog-web chrch/web

#Creation of db container on host-B
host-B $ docker run -it --net macvlan --ip 192.168.0.5 --name dog-db redis

When dog-web communicates with dog-db, the physical network will route or switch the packet using the source and destination addresses of the containers. This can simplify network visibility as the packet headers can be linked directly to specific containers. At the same time application portability is decreased as container IPAM is tied to the physical network. Container addressing must adhere to the physical location of container placement in addition to preventing overlapping address assignment. Because of this, care must be taken to manage IPAM externally to a MACVLAN network. Overlapping IP addressing or incorrect subnets can lead to loss of container connectivity.
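
Because address management sits outside Docker in this model, it can help to validate a planned set of container addresses before deploying them. The following sketch checks a plan against the subnet and gateway used in the commands above. The ipam_problems function and the second sample plan are hypothetical.

```python
import ipaddress
from collections import Counter

# Subnet and gateway from the `docker network create -d macvlan` commands above.
SUBNET = ipaddress.ip_network("192.168.0.0/24")
GATEWAY = ipaddress.ip_address("192.168.0.1")


def ipam_problems(assignments: dict) -> list:
    """Return readable problems for a {container: ip} plan across all hosts."""
    problems = []
    counts = Counter(assignments.values())
    for name, ip_text in sorted(assignments.items()):
        ip = ipaddress.ip_address(ip_text)
        if ip not in SUBNET:
            problems.append(f"{name}: {ip} is outside {SUBNET}")
        elif ip == GATEWAY:
            problems.append(f"{name}: {ip} collides with the gateway")
        if counts[ip_text] > 1:
            problems.append(f"{name}: {ip} is assigned more than once")
    return problems


# The plan used in this section, followed by a deliberately broken one.
print(ipam_problems({"dog-web": "192.168.0.4", "dog-db": "192.168.0.5"}))
print(ipam_problems({"a": "192.168.0.4", "b": "192.168.0.4", "c": "10.0.0.1"}))
```

The check has to cover the assignments of every host sharing the subnet, since each host's macvlan network is configured independently and is unaware of the others.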

MACVLAN Benefits and Use Cases

  • Very low latency applications can benefit from the macvlan driver because it does not utilize NAT.
  • MACVLAN can provide an IP per container, which may be a requirement in some environments.
  • More careful consideration for IPAM must be taken into account.