
Docker Networking: macvlans with VLANs

If you have read my introduction to macvlans and tried the basic macvlan bridge mode network configuration, you are aware that a single Docker host network interface can serve as a parent interface to one macvlan or ipvlan network only.

One macvlan, one Layer 2 domain, and one subnet per physical interface is, however, a rather serious limitation in a modern virtualization solution. Fortunately, a Docker host sub-interface can serve as a parent interface for the macvlan network. This aligns perfectly with the Linux implementation of VLANs, where each VLAN on an 802.1Q trunk connection is terminated on a sub-interface of the physical interface. You can map each Docker host sub-interface to a macvlan network, thus extending the Layer 2 domain from the VLAN into the macvlan network.

Multiple macvlans with VLANs configuration

Docker Macvlan Bridge on VLAN 802.1Q trunk

You have a Docker host with a single eth0 interface connected to a router. The connection between the router and the Docker host is configured on the router as an 802.1Q trunk carrying VLAN 10 and VLAN 20.

Configure VLAN 10 and VLAN 20 on your router. Add the following IP addresses to the Layer 3 interface of each VLAN: 10.0.10.1/24 and 2001:db8:babe:10::1/64 for VLAN 10, 10.0.20.1/24 and 2001:db8:babe:20::1/64 for VLAN 20.

Here’s the configuration if you happen to have a Cisco IOS router…

router(config)# interface fastEthernet 0/0
router(config-if)# no shutdown

router(config)# interface fastEthernet 0/0.10
router(config-subif)# encapsulation dot1Q 10
router(config-subif)# ip address 10.0.10.1 255.255.255.0
router(config-subif)# ipv6 address 2001:db8:babe:10::1/64

router(config)# interface fastEthernet 0/0.20
router(config-subif)# encapsulation dot1Q 20
router(config-subif)# ip address 10.0.20.1 255.255.255.0
router(config-subif)# ipv6 address 2001:db8:babe:20::1/64

…or Cisco Layer 3 Switch…

switch# configure terminal
switch(config)# vlan 10
switch(config)# vlan 20

switch(config)# interface fastEthernet0/0
switch(config-if)# switchport mode trunk
switch(config-if)# switchport trunk native vlan 1

switch(config)# interface vlan 10
switch(config-if)# ip address 10.0.10.1 255.255.255.0
switch(config-if)# ipv6 address 2001:db8:babe:10::1/64

switch(config)# interface vlan 20
switch(config-if)# ip address 10.0.20.1 255.255.255.0
switch(config-if)# ipv6 address 2001:db8:babe:20::1/64

… else I’m sure you know how to configure it on your router.

You will spin up three containers. You need to connect container0 to VLAN 10 and container2 to VLAN 20. container1 will have two interfaces, one in each VLAN. All containers are dual-stack, running both IPv4 and IPv6.

You might already have .1Q sub interfaces configured on your Docker host. That is OK. Docker will use your existing VLAN configuration; just make sure your sub interface numbers match the VLAN tags. If you are not familiar with Linux .1Q configuration, there is no need to learn about it now; Docker’s macvlan driver takes care of it.
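For the curious, here is a minimal sketch of the naming convention, assuming the usual `<interface>.<VLAN ID>` pattern such as eth0.10. The helper names (`parent_iface`, `vlan_id`, `print_vlan_commands`) are made up for this illustration. The script only prints the iproute2 commands you would roughly type to create such a sub interface by hand; it does not run them, and it is not a claim about what Docker does internally.

```shell
#!/bin/sh
# Hypothetical helpers: split a name such as "eth0.10" into its two parts.
# Assumes the name contains a dot followed by the VLAN ID.
parent_iface() { printf '%s\n' "${1%.*}"; }   # text before the last dot
vlan_id()      { printf '%s\n' "${1##*.}"; }  # text after the last dot

# Print (do not execute) the iproute2 commands that would create and
# bring up the 802.1Q sub interface manually.
print_vlan_commands() {
    printf 'ip link add link %s name %s type vlan id %s\n' \
        "$(parent_iface "$1")" "$1" "$(vlan_id "$1")"
    printf 'ip link set %s up\n' "$1"
}

print_vlan_commands eth0.10
print_vlan_commands eth0.20
```

Running the printed commands would require root privileges on the Docker host; in this tutorial you can skip them entirely.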

Verify which interfaces are available on the Docker host:

# ip addr | grep mtu
[...]
2: eth0: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
[...]

Create two macvlan networks, one for each VLAN:

docker network create -d macvlan \
    --subnet=10.0.10.0/24 --gateway=10.0.10.1 \
    --subnet=2001:db8:babe:10::/64 --gateway=2001:db8:babe:10::1 \
    -o parent=eth0.10 \
    --ipv6 \
    macvlan10
docker network create -d macvlan \
    --subnet=10.0.20.0/24 --gateway=10.0.20.1 \
    --subnet=2001:db8:babe:20::/64 --gateway=2001:db8:babe:20::1 \
    -o parent=eth0.20 \
    --ipv6 \
    macvlan20

macvlan10 network uses sub interface eth0.10 as a parent. macvlan20 network uses sub interface eth0.20.
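If you have more than two VLANs, typing these commands by hand gets tedious. The following is a small dry-run sketch that builds the same command line for each VLAN ID. It assumes the addressing scheme used in this tutorial (10.0.<VLAN>.0/24 and 2001:db8:babe:<VLAN>::/64 on parent eth0.<VLAN>); the function name `macvlan_create_cmd` is invented for the example. It prints the commands instead of running them, so you can review them first.

```shell
#!/bin/sh
# Hypothetical helper: build the "docker network create" command line for
# one VLAN ID, following this tutorial's addressing scheme (an assumption).
macvlan_create_cmd() {
    vlan="$1"
    printf 'docker network create -d macvlan'
    printf ' --subnet=10.0.%s.0/24 --gateway=10.0.%s.1' "$vlan" "$vlan"
    printf ' --subnet=2001:db8:babe:%s::/64 --gateway=2001:db8:babe:%s::1' "$vlan" "$vlan"
    printf ' -o parent=eth0.%s --ipv6 macvlan%s\n' "$vlan" "$vlan"
}

# Dry run: print one command per VLAN.
for vlan in 10 20; do
    macvlan_create_cmd "$vlan"
done
```

Once the printed commands look right, you could pipe the output to `sh` on the Docker host to actually create the networks.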

The Docker macvlan driver automagically creates host sub interfaces when you create a new macvlan network with a sub interface as a parent. Notice that the sub interface number matches the VLAN tag/ID:

# ip -d addr show | grep 'mtu\|vlan'
[...]
29: eth0.10@eth0: mtu 1500 qdisc noqueue state UP group default
    vlan protocol 802.1Q id 10
33: eth0.20@eth0: mtu 1500 qdisc noqueue state UP group default
    vlan protocol 802.1Q id 20
[...]

Verify that the macvlan networks were created:

# docker network ls
NETWORK ID          NAME                DRIVER
7fca4eb8c647        bridge              bridge
9f904ee27bf5        none                null
cf03ee007fb4        host                host
0a8dff61d189        macvlan10           macvlan
06c603d82b7d        macvlan20           macvlan

Inspect one of the newly created macvlan networks. Notice the parent interface and IP configurations:

# docker network inspect macvlan10
[
    {
        "Name": "macvlan10",
        "Id": "0a8dff61d18965158f4ef7fd519b06494d551b592479dec00501336d91b64af4",
        "Scope": "local",
        "Driver": "macvlan",
        "EnableIPv6": true,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "10.0.10.0/24",
                    "Gateway": "10.0.10.1"
                },
                {
                    "Subnet": "2001:db8:babe:10::/64",
                    "Gateway": "2001:db8:babe:10::1"
                }
            ]
        },
        "Internal": false,
        "Containers": {},
        "Options": {
            "parent": "eth0.10"
        },
        "Labels": {}
    }
]
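If you only care about the parent interface, you do not have to read the whole JSON document. The sketch below wraps `docker network inspect` with a `--format` Go template that picks the "parent" key out of the Options map. The wrapper name `network_parent` is an invention for this example, and it only defines the function; you would call it yourself on the Docker host once the networks exist.

```shell
#!/bin/sh
# Hypothetical helper: print only the parent interface of a macvlan network.
# "docker network inspect --format" takes a Go template; the driver options
# seen in the JSON output are stored in the .Options map.
network_parent() {
    docker network inspect --format '{{index .Options "parent"}}' "$1"
}

# Usage on the Docker host (requires the networks created earlier):
#   network_parent macvlan10
#   network_parent macvlan20
```

The value printed should be the same one shown under "Options" in the full inspect output.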

Spin up container0 and container2, each connected to its own macvlan network. Select the image of your choice or just use phusion/baseimage for the purpose of this tutorial:

docker run \
  --name='container0' \
  --hostname='container0' \
  --net=macvlan10 \
  --ip=10.0.10.2 \
  --ip6=2001:db8:babe:10::2 \
  --detach=true \
  phusion/baseimage:latest
docker run \
  --name='container2' \
  --hostname='container2' \
  --net=macvlan20 \
  --ip=10.0.20.4 \
  --ip6=2001:db8:babe:20::4 \
  --detach=true \
  phusion/baseimage:latest

You don’t need to configure the IPv4 and IPv6 addresses yourself; Docker’s IPAM driver can assign addresses from the network’s subnets automatically. They are set explicitly here only so that your containers get the same IP addresses as the ones described in this tutorial.

Remember: Docker controls the IP address assignment for network and endpoint interfaces via the IPAM driver(s). Libnetwork has a default, built-in IPAM driver and allows third-party IPAM drivers to be plugged in dynamically. On network creation, the user can specify which IPAM driver libnetwork should use for the network’s IP address management. For the time being, there is no IPAM driver that would communicate with an external DHCP server, so you need to rely on Docker’s default IPAM driver for container IP address and settings configuration.

Next, container1 needs to have two interfaces, one in each VLAN. As of Docker 1.12 it is not possible to specify two --net= parameters in the docker run command, so let’s start by connecting the container to the macvlan10 network:

docker run \
  --name='container1' \
  --hostname='container1' \
  --net=macvlan10 \
  --ip=10.0.10.3 \
  --ip6=2001:db8:babe:10::3 \
  --detach=true \
  phusion/baseimage:latest

Next, add the second interface to container1, connected to the macvlan20 network:

docker network connect \
  --ip=10.0.20.3 \
  --ip6=2001:db8:babe:20::3 \
  macvlan20 \
  container1

Verify that the three containers are running:

# docker ps
CONTAINER ID   IMAGE                      COMMAND           STATUS              NAMES
db1151ada129   phusion/baseimage:latest   "/sbin/my_init"   Up About a minute   container1
d4ac80b7f3f7   phusion/baseimage:latest   "/sbin/my_init"   Up 3 minutes        container2
8ead470a825b   phusion/baseimage:latest   "/sbin/my_init"   Up 3 minutes        container0

Verify the network configuration of container1:

# docker inspect container1
[...]
"Networks": {
    "macvlan10": {
        "IPAMConfig": null,
        "Links": null,
        "Aliases": [
            "db1151ada129"
        ],
        "NetworkID": "0a8dff61d18965158f4ef7fd519b06494d551b592479dec00501336d91b64af4",
        "EndpointID": "d43ab3307f73d9377288b3b9bc6ebe208bb87bd1b30cfae7937b7be12e8ddc54",
        "Gateway": "10.0.10.1",
        "IPAddress": "10.0.10.3",
        "IPPrefixLen": 24,
        "IPv6Gateway": "2001:db8:babe:10::1",
        "GlobalIPv6Address": "2001:db8:babe:10::3",
        "GlobalIPv6PrefixLen": 64,
        "MacAddress": "02:42:0a:0a:3c:04"
    },
    "macvlan20": {
        "IPAMConfig": {},
        "Links": null,
        "Aliases": [
            "db1151ada129"
        ],
        "NetworkID": "06c603d82b7d821b3aaa240e2c00d309ee3fb420d644591a483b067270fbb7ef",
        "EndpointID": "c341ceec2fa5c2087f784811c8e75b186d847cb2b9ce4616c70b78e22024e46f",
        "Gateway": "10.0.20.1",
        "IPAddress": "10.0.20.3",
        "IPPrefixLen": 24,
        "IPv6Gateway": "2001:db8:babe:20::1",
        "GlobalIPv6Address": "2001:db8:babe:20::3",
        "GlobalIPv6PrefixLen": 64,
        "MacAddress": "02:42:0a:0a:46:03"
    }
}
[...]

Verify the interface configuration in container1:

# docker exec -ti container1 ip addr | grep 'mtu\|inet'
[...]
40: eth0@if29: mtu 1500 qdisc noqueue state UNKNOWN group default
    inet 10.0.10.3/24 scope global eth0
    inet6 2001:db8:babe:10::3/64 scope global nodad
    inet6 fe80::42:aff:fe0a:3c04/64 scope link
41: eth1@if33: mtu 1500 qdisc noqueue state UNKNOWN group default
    inet 10.0.20.3/24 scope global eth1
    inet6 2001:db8:babe:20::3/64 scope global nodad
    inet6 fe80::42:aff:fe0a:4603/64 scope link

Note the two interfaces, placed in two different VLANs with their respective IP addresses. The @if29 and @if33 parts of the interface names refer to the indexes of the parent interfaces on the Docker host (29 and 33 here; they will likely differ on your end). You can confirm this by listing the interfaces on the Docker host itself:

# ip addr | grep mtu
[...]
29: eth0.10@eth0: mtu 1500 qdisc noqueue state UP group default
33: eth0.20@eth0: mtu 1500 qdisc noqueue state UP group default
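To make the link between the container interface name and the host interface index explicit, here is a small sketch. Both function names are hypothetical. `peer_ifindex` pulls the number out of a name like eth0@if29, and `ifname_by_index` reads one-line-per-interface output, as produced by `ip -o link`, on standard input and prints the name that belongs to a given index. The sample lines fed to it below are shortened stand-ins for real `ip -o link` output, not captured data.

```shell
#!/bin/sh
# Hypothetical helper: extract the peer interface index from a name
# such as "eth0@if29". Returns non-zero if the name has no "@if" part.
peer_ifindex() {
    case "$1" in
        *@if*) printf '%s\n' "${1##*@if}" ;;
        *)     return 1 ;;
    esac
}

# Hypothetical helper: read "ip -o link" style lines ("29: eth0.10@eth0: ...")
# on stdin and print the interface name whose index equals $1.
ifname_by_index() {
    awk -F': ' -v idx="$1" '$1 == idx { print $2 }'
}

peer_ifindex 'eth0@if29'   # prints 29
peer_ifindex 'eth1@if33'   # prints 33

# Shortened sample input standing in for real "ip -o link" output:
printf '29: eth0.10@eth0: mtu 1500\n33: eth0.20@eth0: mtu 1500\n' |
    ifname_by_index 29     # prints eth0.10@eth0
```

On a real Docker host you would feed it live data instead, for example `ip -o link | ifname_by_index 29`.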

Check the IP route inside container1:

# docker exec -ti container1 ip route
default via 10.0.10.1 dev eth0
10.0.10.0/24 dev eth0 proto kernel scope link src 10.0.10.3
10.0.20.0/24 dev eth1 proto kernel scope link src 10.0.20.3
# docker exec -ti container1 ip -6 route
2001:db8:babe:10::/64 dev eth0 proto kernel metric 256
2001:db8:babe:20::/64 dev eth1 proto kernel metric 256
fe80::/64 dev eth0 proto kernel metric 256
fe80::/64 dev eth1 proto kernel metric 256
default via 2001:db8:babe:10::1 dev eth0 metric 1024

Finally, check connectivity with the other 2 containers from container1:

# docker exec -ti container1 ping -c 4 10.0.10.2
PING 10.0.10.2 (10.0.10.2) 56(84) bytes of data.
64 bytes from 10.0.10.2: icmp_seq=1 ttl=64 time=0.060 ms
64 bytes from 10.0.10.2: icmp_seq=2 ttl=64 time=0.028 ms
64 bytes from 10.0.10.2: icmp_seq=3 ttl=64 time=0.048 ms
64 bytes from 10.0.10.2: icmp_seq=4 ttl=64 time=0.039 ms

--- 10.0.10.2 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3001ms
rtt min/avg/max/mdev = 0.028/0.043/0.060/0.014 ms
# docker exec -ti container1 ping -c 4 10.0.20.4
PING 10.0.20.4 (10.0.20.4) 56(84) bytes of data.
64 bytes from 10.0.20.4: icmp_seq=1 ttl=64 time=0.061 ms
64 bytes from 10.0.20.4: icmp_seq=2 ttl=64 time=0.027 ms
64 bytes from 10.0.20.4: icmp_seq=3 ttl=64 time=0.034 ms
64 bytes from 10.0.20.4: icmp_seq=4 ttl=64 time=0.036 ms

--- 10.0.20.4 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 2997ms
rtt min/avg/max/mdev = 0.027/0.039/0.061/0.014 ms

And with the router:

# docker exec -ti container1 ping -c 4 10.0.10.1
PING 10.0.10.1 (10.0.10.1) 56(84) bytes of data.
64 bytes from 10.0.10.1: icmp_seq=1 ttl=64 time=0.503 ms
64 bytes from 10.0.10.1: icmp_seq=2 ttl=64 time=0.209 ms
64 bytes from 10.0.10.1: icmp_seq=3 ttl=64 time=0.209 ms
64 bytes from 10.0.10.1: icmp_seq=4 ttl=64 time=0.218 ms

--- 10.0.10.1 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 2997ms
rtt min/avg/max/mdev = 0.209/0.284/0.503/0.127 ms
# docker exec -ti container1 ping -c 4 10.0.20.1
PING 10.0.20.1 (10.0.20.1) 56(84) bytes of data.
64 bytes from 10.0.20.1: icmp_seq=1 ttl=64 time=0.486 ms
64 bytes from 10.0.20.1: icmp_seq=2 ttl=64 time=0.226 ms
64 bytes from 10.0.20.1: icmp_seq=3 ttl=64 time=0.228 ms
64 bytes from 10.0.20.1: icmp_seq=4 ttl=64 time=0.227 ms

--- 10.0.20.1 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3000ms
rtt min/avg/max/mdev = 0.226/0.291/0.486/0.114 ms

Let’s do a quick traceroute from container0 to remaining containers to see the path packets follow. Ensure traceroute is installed:

# docker exec -ti container0 apt-get update
# docker exec -ti container0 apt-get install traceroute

Now perform the traceroutes:

# docker exec -ti container0 traceroute 10.0.10.3
traceroute to 10.0.10.3 (10.0.10.3), 30 hops max, 60 byte packets
 1  container1.macvlan10 (10.0.10.3)  0.045 ms  0.012 ms  0.008 ms
# docker exec -ti container0 traceroute 10.0.20.3
traceroute to 10.0.20.3 (10.0.20.3), 30 hops max, 60 byte packets
 1  10.0.10.1 (10.0.10.1)  0.229 ms  0.279 ms  0.310 ms
 2  10.0.20.3 (10.0.20.3)  0.725 ms  0.754 ms  0.743 ms
# docker exec -ti container0 traceroute 10.0.20.4
traceroute to 10.0.20.4 (10.0.20.4), 30 hops max, 60 byte packets
 1  10.0.10.1 (10.0.10.1)  0.265 ms  0.261 ms  0.323 ms
 2  10.0.20.4 (10.0.20.4)  0.725 ms  0.756 ms  0.745 ms

As expected, there is direct, one-hop connectivity between container0 and container1’s macvlan10 interface. Packets towards the other two interfaces in macvlan20 need to leave the Docker host through the trunk link to the router and are routed back through the trunk into macvlan20, hence the router’s 10.0.10.1 IP address in the traceroute. If the last two traceroutes fail, verify your router’s configuration.
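The rule behind these traceroutes can be sketched in a few lines: if source and destination share a subnet, the packet stays in the same Layer 2 domain, otherwise it goes to the default gateway. The example below assumes /24 masks as used in this tutorial, so comparing the first three octets is enough; the function names `net24` and `path_between` are made up for the illustration and this is not a general subnet calculator.

```shell
#!/bin/sh
# Hypothetical helper: network part of an IPv4 address under a /24 mask,
# i.e. everything before the last dot. Only valid for /24 networks.
net24() { printf '%s\n' "${1%.*}"; }

# Report whether traffic between two addresses stays inside one
# Layer 2 domain or has to be routed.
path_between() {
    if [ "$(net24 "$1")" = "$(net24 "$2")" ]; then
        echo "direct"
    else
        echo "via router"
    fi
}

path_between 10.0.10.2 10.0.10.3   # prints direct
path_between 10.0.10.2 10.0.20.3   # prints via router
path_between 10.0.10.2 10.0.20.4   # prints via router
```

Note that the second case is routed even though container1 also has an interface in VLAN 10: the decision depends on the destination address, not on which container owns it.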


Finally, check the ARP table on the router / L3 switch. After all the pings performed, it should have the entries for all 4 container IP addresses (mapped to containers’ virtual MAC addresses) each in its respective VLAN:

l3switch# show ip arp
Protocol  Address     Age (min)  Hardware Addr   Type  Interface
Internet  10.0.10.2           9  0242.0a0a.2602  ARPA  Vlan10
Internet  10.0.10.3           4  0242.0a0a.3c04  ARPA  Vlan10
Internet  10.0.20.3           3  0242.0a0a.4603  ARPA  Vlan20
Internet  10.0.20.4           1  0242.0a0a.5c04  ARPA  Vlan20
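Cisco prints MAC addresses in dotted groups of four hex digits, while Docker and Linux use colon-separated pairs, which makes it awkward to compare the ARP table with the docker inspect output. Here is a small conversion sketch; the function names are invented for this example and it assumes well-formed 12-digit MAC addresses as input.

```shell
#!/bin/sh
# Hypothetical helper: 0242.0a0a.3c04 -> 02:42:0a:0a:3c:04
cisco_to_linux_mac() {
    printf '%s\n' "$1" | tr -d '.' | sed 's/../&:/g; s/:$//'
}

# Hypothetical helper: 02:42:0a:0a:3c:04 -> 0242.0a0a.3c04
linux_to_cisco_mac() {
    printf '%s\n' "$1" | tr -d ':' | sed 's/..../&./g; s/\.$//'
}

cisco_to_linux_mac 0242.0a0a.3c04      # prints 02:42:0a:0a:3c:04
linux_to_cisco_mac 02:42:0a:0a:46:03   # prints 0242.0a0a.4603
```

Both results match the container1 entries above: the first is its MacAddress in macvlan10, the second its ARP entry in Vlan20.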

Congratulations! You have just connected your three containers to two separate VLANs using macvlan networks mapped to 802.1Q trunk sub interfaces.

2 comments for “Docker Networking: macvlans with VLANs”

  1. November 16, 2016 at 16:32

    you are genius. very useful for me. many thank

  2. asa666
    December 31, 2016 at 08:14

    This was very useful. Thanks so much. I got so close, but when I run “ip addr” inside the container I see

    47: eth1@if2: mtu 1500 qdisc noqueue state UNKNOWN group default

    I can’t seem to get eth1 (the macvlan NIC) up. I can’t ping the gateway from the container and it doesn’t appear in the router’s ARP table. 🙁
