A Look at Docker Networking

Posted by mozillazg on 2019-06-28 16:31 / 560 views

Based on what I have recently learned about Docker, this post organizes my notes on Docker networking.

Test environment: CentOS 7.4

Docker version:

Client:
 Version:      18.03.1-ce
 API version:  1.37
 Go version:   go1.9.5
 Git commit:   9ee9f40
 Built:        Thu Apr 26 07:20:16 2018
 OS/Arch:      linux/amd64
 Experimental: false
 Orchestrator: swarm

Server:
 Engine:
  Version:      18.03.1-ce
  API version:  1.37 (minimum version 1.12)
  Go version:   go1.9.5
  Git commit:   9ee9f40
  Built:        Thu Apr 26 07:23:58 2018
  OS/Arch:      linux/amd64
  Experimental: false

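1. Linux network namespaces and veth

1.1 Linux network namespaces

The following is quoted from "Linux Network Namespaces" (linux 網絡命名空間 Network namespaces, CSDN blog):

Linux namespaces are a relatively new kernel feature that is essential to implementing containers. A namespace wraps a global system resource in an abstraction that is bound only to the processes inside that namespace, thereby providing resource isolation. The Linux kernel provides six types of namespaces: pid, net, mnt, uts, ipc, and user. A network namespace gives all processes in the namespace a brand-new network stack, including network interfaces, routing tables, and iptables rules.

We can use:

ip netns list

to list the network namespaces that currently exist on the system,

or add and delete a namespace with the following commands:

# add
ip netns add ns1

# delete
ip netns delete ns1

With ns1 created, we use the following command to inspect the network state inside the ns1 network namespace: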

[root@localhost vagrant]# ip netns exec ns1 ip a
1: lo:  mtu 65536 qdisc noop state DOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

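As shown, the newly created ns1 contains only a loopback interface, and it is in the DOWN state.

If we now ping localhost inside ns1, the network is unreachable:

[root@localhost vagrant]# ip netns exec ns1 ping localhost -c 4
connect: Network is unreachable

We therefore need ip link to bring the lo interface up:

ip netns exec ns1 ip link set dev lo up

A ping to the loopback address now succeeds: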

[root@localhost vagrant]# ip netns exec ns1 ping localhost -c 4
PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.031 ms
64 bytes from localhost (127.0.0.1): icmp_seq=2 ttl=64 time=0.060 ms
64 bytes from localhost (127.0.0.1): icmp_seq=3 ttl=64 time=0.070 ms
64 bytes from localhost (127.0.0.1): icmp_seq=4 ttl=64 time=0.036 ms

--- localhost ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3004ms
rtt min/avg/max/mdev = 0.031/0.049/0.070/0.017 ms

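Note: the ip link command lists the network interfaces on the machine; it comes up again in the veth section.

1.2 Veth

On the characteristics of veth, the following is quoted from "The veth Linux Virtual Network Device" (Linux虛擬網絡設備之veth, Linux程序員, SegmentFault):

Like any other network device, a veth device has one end connected to the kernel protocol stack.
veth devices are created in pairs; at the other end, the two devices are connected to each other.
When one device receives a send request from the protocol stack, it delivers the data to the other device.

Here we create a second network namespace, ns2, and use a veth pair to connect ns1 and ns2. The steps are as follows:

1.2.1 Prepare the ns1 and ns2 network namespaces

First, we run:

ip netns add ns2

to create the ns2 network namespace.

Note: as with ns1 above, lo in ns2 can also be brought up.

1.2.2 Create the veth pair

Next, run the following command to create the veth pair:

ip link add veth-ns1 type veth peer name veth-ns2

Afterwards, ip link shows that the interface list now contains the veth pair we just added: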

[root@localhost vagrant]# ip link

...

4: veth-ns2@veth-ns1:  mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 1000
    link/ether 8e:3c:d3:98:29:9d brd ff:ff:ff:ff:ff:ff
5: veth-ns1@veth-ns2:  mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 1000
    link/ether 92:8a:2f:d6:e0:72 brd ff:ff:ff:ff:ff:ff

1.2.3 Move the two ends of the veth pair into ns1 and ns2

Run the following commands:

# move the veth-ns1 end of the pair into ns1
ip link set veth-ns1 netns ns1

# move the veth-ns2 end of the pair into ns2
ip link set veth-ns2 netns ns2

If we now run ip link inside each namespace, we find that the two ends of the veth pair created earlier live in ns1 and ns2 respectively:

ip netns exec ns1 ip link

ip netns exec ns2 ip link

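1.2.4 Assign an IP address to each end of the veth pair

Run the following commands:

ip netns exec ns1 ip addr add 192.168.1.1/24 dev veth-ns1

ip netns exec ns2 ip addr add 192.168.1.2/24 dev veth-ns2

Note that this step must come after the two ends have been moved into ns1 and ns2, since moving an interface into another namespace clears the addresses configured on it.

1.2.5 Bring both ends of the veth pair up

Run the following commands:

ip netns exec ns1 ip link set dev veth-ns1 up

ip netns exec ns2 ip link set dev veth-ns2 up

ip a now shows the current state of each network namespace: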

###### ns1 ######

[root@localhost vagrant]# ip netns exec ns1 ip a
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
5: veth-ns1@if4:  mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether 92:8a:2f:d6:e0:72 brd ff:ff:ff:ff:ff:ff link-netnsid 1
    inet 192.168.1.1/24 scope global veth-ns1
       valid_lft forever preferred_lft forever
    inet6 fe80::908a:2fff:fed6:e072/64 scope link
       valid_lft forever preferred_lft forever
       

###### ns2 ######

[root@localhost vagrant]# ip netns exec ns2 ip a
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
4: veth-ns2@if5:  mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether 8e:3c:d3:98:29:9d brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.1.2/24 scope global veth-ns2
       valid_lft forever preferred_lft forever
    inet6 fe80::8c3c:d3ff:fe98:299d/64 scope link
       valid_lft forever preferred_lft forever

1.2.6 Verify connectivity

Check connectivity by pinging from each end to the other:

ns1 -> ns2

[root@localhost vagrant]# ip netns exec ns1 ping 192.168.1.2 -c 4
PING 192.168.1.2 (192.168.1.2) 56(84) bytes of data.
64 bytes from 192.168.1.2: icmp_seq=1 ttl=64 time=0.037 ms
64 bytes from 192.168.1.2: icmp_seq=2 ttl=64 time=0.040 ms
64 bytes from 192.168.1.2: icmp_seq=3 ttl=64 time=0.046 ms
64 bytes from 192.168.1.2: icmp_seq=4 ttl=64 time=0.044 ms

--- 192.168.1.2 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3000ms
rtt min/avg/max/mdev = 0.037/0.041/0.046/0.008 ms

ns2 -> ns1

[root@localhost vagrant]# ip netns exec ns2 ping 192.168.1.1 -c 4
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.030 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.038 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=0.068 ms
64 bytes from 192.168.1.1: icmp_seq=4 ttl=64 time=0.067 ms

--- 192.168.1.1 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3006ms
rtt min/avg/max/mdev = 0.030/0.050/0.068/0.019 ms

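1.2.7 Summary

The core steps of the procedure above are illustrated below:

[Figure: core steps for connecting ns1 and ns2 with a veth pair]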
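The steps in 1.2.1 through 1.2.5 can also be collected into a single script. The following is a minimal sketch, not a definitive implementation: the `run` helper, the `APPLY` variable, and the `connect_ns` function are illustrative names that do not appear in the original. By default the script only prints the commands; with `APPLY=1` it executes them, which requires root.

```shell
#!/bin/sh
# Sketch: connect two network namespaces with a veth pair.
# `run` and APPLY are illustrative helpers, not part of iproute2.
# By default each command is only printed; set APPLY=1 (as root) to execute.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# connect_ns NS_A NS_B CIDR_A CIDR_B
connect_ns() {
    ns_a=$1 ns_b=$2 cidr_a=$3 cidr_b=$4

    # 1.2.1: create both namespaces
    run ip netns add "$ns_a"
    run ip netns add "$ns_b"
    # 1.2.2: create the veth pair
    run ip link add "veth-$ns_a" type veth peer name "veth-$ns_b"
    # 1.2.3: move each end into its namespace
    run ip link set "veth-$ns_a" netns "$ns_a"
    run ip link set "veth-$ns_b" netns "$ns_b"
    # 1.2.4: assign addresses (only after the move)
    run ip netns exec "$ns_a" ip addr add "$cidr_a" dev "veth-$ns_a"
    run ip netns exec "$ns_b" ip addr add "$cidr_b" dev "veth-$ns_b"
    # 1.2.5: bring both ends up
    run ip netns exec "$ns_a" ip link set dev "veth-$ns_a" up
    run ip netns exec "$ns_b" ip link set dev "veth-$ns_b" up
}

connect_ns ns1 ns2 192.168.1.1/24 192.168.1.2/24
```

Printing first makes it easy to review the nine commands before touching the host. If ns1 already exists from section 1.1, the first `ip netns add` will fail when applied, so delete it beforehand or drop that line.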

2. Docker bridge networks and docker0

2.1 Bridge connectivity experiment

Running ip link on the host, we see:

[root@localhost vagrant]# ip link

...

3: docker0:  mtu 1500 qdisc noqueue state UP mode DEFAULT
    link/ether 02:42:7f:68:8e:01 brd ff:ff:ff:ff:ff:ff

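There is a docker0 interface here, which plays a crucial role in Docker bridge networking.

Note: according to the official Docker documentation (Networking features in Docker for Mac | Docker Documentation), there is no docker0 bridge on macOS. CentOS is recommended for this experiment.

Next, create a container with the following command so that we can observe how bridge networking is implemented for containers:

docker run -d --name box1 busybox /bin/sh -c "while true; do sleep 3600; done;"

Notes:

The infinite loop keeps the container from exiting immediately;

A Docker container uses the bridge network by default.

Then run ip link again to observe the change. The list now contains the following new entry: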

[root@localhost vagrant]# ip link

...

11: veth4f19367@if10:  mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT
    link/ether 42:01:42:6c:52:5a brd ff:ff:ff:ff:ff:ff link-netnsid 0

Next, to inspect the bridging, we need to install a package:

yum install bridge-utils

Then run:

[root@localhost vagrant]# brctl show
bridge name    bridge id        STP enabled    interfaces
docker0        8000.02427f688e01    no        veth4f19367

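As shown, after the box1 container is created, the new veth interface is attached to the docker0 bridge, which lets the container and the host reach each other.

To test connectivity, first run: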

[root@localhost vagrant]# ip a

...

3: docker0:  mtu 1500 qdisc noqueue state UP
    link/ether 02:42:7f:68:8e:01 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:7fff:fe68:8e01/64 scope link
       valid_lft forever preferred_lft forever

...

This shows that the host's address on docker0 is 172.17.0.1/16.

Then run:

[root@localhost vagrant]# docker exec box1 ip a

...

10: eth0@if11:  mtu 1500 qdisc noqueue
    link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0
       valid_lft forever preferred_lft forever

This shows that the IP of the box1 container is 172.17.0.2/16.

We can then test connectivity in both directions:

Host (172.17.0.1) -> container (172.17.0.2)

[root@localhost vagrant]# ping 172.17.0.2 -c 4
PING 172.17.0.2 (172.17.0.2) 56(84) bytes of data.
64 bytes from 172.17.0.2: icmp_seq=1 ttl=64 time=0.049 ms
64 bytes from 172.17.0.2: icmp_seq=2 ttl=64 time=0.054 ms
64 bytes from 172.17.0.2: icmp_seq=3 ttl=64 time=0.057 ms
64 bytes from 172.17.0.2: icmp_seq=4 ttl=64 time=0.082 ms

--- 172.17.0.2 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3001ms
rtt min/avg/max/mdev = 0.049/0.060/0.082/0.014 ms

Container (172.17.0.2) -> host (172.17.0.1)

[root@localhost vagrant]# docker exec box1 ping 172.17.0.1 -c 4
PING 172.17.0.1 (172.17.0.1): 56 data bytes
64 bytes from 172.17.0.1: seq=0 ttl=64 time=0.068 ms
64 bytes from 172.17.0.1: seq=1 ttl=64 time=0.258 ms
64 bytes from 172.17.0.1: seq=2 ttl=64 time=0.084 ms
64 bytes from 172.17.0.1: seq=3 ttl=64 time=0.087 ms

--- 172.17.0.1 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.068/0.124/0.258 ms

Similarly, we can create another container, box2, and test connectivity between box1 and box2. Because the method is the same, only an outline is given:

# create the box2 container
docker run -d --name box2 busybox /bin/sh -c "while true; do sleep 3600; done;"

# (omitted) look up the IPs of box1 and box2 with ip a

# connectivity test box1 -> box2
docker exec box1 ping 172.17.0.3 -c 4

# connectivity test box2 -> box1
docker exec box2 ping 172.17.0.2 -c 4
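A side observation on the outputs in 2.1: the container's MAC address 02:42:ac:11:00:02 is the prefix 02:42 followed by the bytes of its IP 172.17.0.2 written in hex. The sketch below reproduces that mapping for the addresses that appear in this article. `ip_to_mac` is an illustrative helper, and the pattern is read off the outputs shown here rather than taken from a documented guarantee, so other Docker versions or drivers may behave differently.

```shell
#!/bin/sh
# ip_to_mac IPV4: print 02:42 followed by the four address bytes in hex.
# Illustrative helper; the 02:42 prefix is taken from the outputs above.
ip_to_mac() {
    old_ifs=$IFS
    IFS=.
    set -- $1        # split the dotted quad into $1..$4
    IFS=$old_ifs
    printf '02:42:%02x:%02x:%02x:%02x\n' "$1" "$2" "$3" "$4"
}

ip_to_mac 172.17.0.2
ip_to_mac 172.18.0.3
```

The second call uses the address that box2 receives on the user-defined network in section 3.2, whose inspect output lists the matching MAC.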

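2.2 Summary

As shown above, Docker implements bridge networking as illustrated below: each container's eth0 is one end of a veth pair whose other end is attached to the docker0 bridge on the host.

[Figure: containers connected to docker0 through veth pairs]

In addition, containers on a bridge network reach the public internet as illustrated below:

[Figure: container-to-internet path in a bridge network]

Note: bridge is the network type; docker0 is the interface name.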

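3. Connecting by container name

Connecting Docker containers to each other by IP address limits many use cases. We would rather use a domain name or an alias for network access, so that the modules of a system stay stable when IP addresses change (a container does not get the same IP address every time it is created).

We can achieve this in the following ways.

First, stop and remove the containers started earlier:

docker rm -f $(docker ps -aq)

Note: this command stops and removes all containers on the host.

3.1 Docker link

This method uses the --link option of docker run. The steps are as follows:

3.1.1 Prepare two containers, box1 and box2

docker run -d --name box1 busybox /bin/sh -c "while true; do sleep 3600; done;"

# note the added --link box1 option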

docker run -d --name box2 --link box1 busybox /bin/sh -c "while true; do sleep 3600; done;"

3.1.2 Connectivity test by IP

box1 -> box2

[root@localhost vagrant]# docker exec box1 ping 172.17.0.3 -c 4
PING 172.17.0.3 (172.17.0.3): 56 data bytes
64 bytes from 172.17.0.3: seq=0 ttl=64 time=0.071 ms
64 bytes from 172.17.0.3: seq=1 ttl=64 time=0.158 ms
64 bytes from 172.17.0.3: seq=2 ttl=64 time=0.110 ms
64 bytes from 172.17.0.3: seq=3 ttl=64 time=0.146 ms

--- 172.17.0.3 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.071/0.121/0.158 ms

box2 -> box1

[root@localhost vagrant]# docker exec box2 ping 172.17.0.2 -c 4
PING 172.17.0.2 (172.17.0.2): 56 data bytes
64 bytes from 172.17.0.2: seq=0 ttl=64 time=0.081 ms
64 bytes from 172.17.0.2: seq=1 ttl=64 time=0.082 ms
64 bytes from 172.17.0.2: seq=2 ttl=64 time=0.091 ms
64 bytes from 172.17.0.2: seq=3 ttl=64 time=0.109 ms

--- 172.17.0.2 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.081/0.090/0.109 ms

3.1.3 Connectivity test by container name

box1 -> box2

[root@localhost vagrant]# docker exec box1 ping box2 -c 4
ping: bad address "box2"

box2 -> box1

[root@localhost vagrant]# docker exec box2 ping box1 -c 4
PING box1 (172.17.0.2): 56 data bytes
64 bytes from 172.17.0.2: seq=0 ttl=64 time=0.073 ms
64 bytes from 172.17.0.2: seq=1 ttl=64 time=0.106 ms
64 bytes from 172.17.0.2: seq=2 ttl=64 time=0.173 ms
64 bytes from 172.17.0.2: seq=3 ttl=64 time=0.078 ms

--- box1 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.073/0.107/0.173 ms
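The asymmetric result in 3.1.3 (box2 resolves box1, but box1 cannot resolve box2) is consistent with how --link works on the default bridge: Docker adds a hosts entry for the linked container to the container started with --link, and adds nothing on the other side. The sketch below simulates that lookup on hosts-format text. The `lookup` helper and the sample entries, including the placeholder hostnames, are hypothetical and modeled on the addresses in this section; the real files can be read with `docker exec box2 cat /etc/hosts`.

```shell
#!/bin/sh
# lookup NAME HOSTS_TEXT: print the address of the first hosts-format line
# that lists NAME, or nothing if no line does. Illustrative helper.
lookup() {
    printf '%s\n' "$2" | awk -v name="$1" '
        /^#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }'
}

# Hypothetical hosts entries, modeled on the addresses in this section.
# box1 was started without --link, so it has no entry for box2.
box1_hosts='127.0.0.1 localhost
172.17.0.2 box1-hostname'
# box2 was started with --link box1, so it has an entry for box1.
box2_hosts='127.0.0.1 localhost
172.17.0.2 box1
172.17.0.3 box2-hostname'

echo "box2 resolves box1 to: $(lookup box1 "$box2_hosts")"
echo "box1 resolves box2 to: $(lookup box2 "$box1_hosts")"
```

The second line prints an empty result, which corresponds to the `bad address "box2"` error above.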

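3.2 User-defined bridge network

With Docker, we can create a new bridge network and attach to it the containers that need to reach each other by container name; that meets the requirement above. Note that a newly created bridge network is required; the default one does not provide name resolution.

For this experiment, first remove the containers created in 3.1, then proceed as follows:

3.2.1 Create a bridge network

Run the following command to create a bridge network:

docker network create my-bridge --driver bridge

3.2.2 Create the containers

Here we create three containers, box1, box2, and box3, and attach box1 and box2 to the newly created my-bridge network: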

docker run -d --name box1 --network my-bridge busybox /bin/sh -c "while true; do sleep 3600; done;"

docker run -d --name box2 --network my-bridge busybox /bin/sh -c "while true; do sleep 3600; done;"

docker run -d --name box3 busybox /bin/sh -c "while true; do sleep 3600; done;"

Then use the following commands to check which containers are in each network:

Containers in the my-bridge network:

# command
docker network inspect my-bridge

# output
"Containers": {
    "70b929cee25b665ff0d55cd4e979fcf8dd21190c6200d49af0f2bef07efb723e": {
        "Name": "box2",
        "EndpointID": "6c19d556751f3e745f947cc0b0e2995312b6712055fac4d79a52114b6f21ffffd2",
        "MacAddress": "02:42:ac:12:00:03",
        "IPv4Address": "172.18.0.3/16",
        "IPv6Address": ""
    },
    "92d63195660f1e666028294144ffffd65be06e38827debaad1ebd90ce9c79dbaa7": {
        "Name": "box1",
        "EndpointID": "9b28b2e724ff19e1cbce1fb05afe4a439f51735c0a2a7671c8002fa58c087199",
        "MacAddress": "02:42:ac:12:00:02",
        "IPv4Address": "172.18.0.2/16",
        "IPv6Address": ""
    }
}

Containers in the default bridge network:

# command
docker network inspect bridge

# output
"Containers": {
    "5526023acd4197c9abf6f17c9a59585322cd2154b70238872083e48b7012ab4e": {
        "Name": "box3",
        "EndpointID": "6ceb455b487e533439c715a48849986a62ecda68d51d141f228ebdb7d06cf560",
        "MacAddress": "02:42:ac:11:00:02",
        "IPv4Address": "172.17.0.2/16",
        "IPv6Address": ""
    }
}

3.2.3 Test

First, test connectivity by container name between box1 and box2, which are both in the user-defined my-bridge network:

[root@localhost vagrant]# docker exec box1 ping box2 -c 1
PING box2 (172.18.0.3): 56 data bytes
64 bytes from 172.18.0.3: seq=0 ttl=64 time=0.052 ms

--- box2 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.052/0.052/0.052 ms
[root@localhost vagrant]# docker exec box2 ping box1 -c 1
PING box1 (172.18.0.2): 56 data bytes
64 bytes from 172.18.0.2: seq=0 ttl=64 time=0.053 ms

--- box1 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.053/0.053/0.053 ms

Then test connectivity by container name between box3 and box1, box2:

[root@localhost vagrant]# docker exec box1 ping box3 -c 1
ping: bad address "box3"
[root@localhost vagrant]# docker exec box2 ping box3 -c 1
ping: bad address "box3"
[root@localhost vagrant]# docker exec box3 ping box1 -c 1
ping: bad address "box1"
[root@localhost vagrant]# docker exec box3 ping box2 -c 1
ping: bad address "box2"

Finally, test the connectivity of all three containers to the public internet:

[root@localhost vagrant]# docker exec box1 ping www.baidu.com -c 1
PING www.baidu.com (115.239.210.27): 56 data bytes
64 bytes from 115.239.210.27: seq=0 ttl=61 time=22.467 ms

--- www.baidu.com ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 22.467/22.467/22.467 ms
[root@localhost vagrant]# docker exec box2 ping www.baidu.com -c 1
PING www.baidu.com (115.239.210.27): 56 data bytes
64 bytes from 115.239.210.27: seq=0 ttl=61 time=19.068 ms

--- www.baidu.com ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 19.068/19.068/19.068 ms
[root@localhost vagrant]# docker exec box3 ping www.baidu.com -c 1
PING www.baidu.com (115.239.211.112): 56 data bytes
64 bytes from 115.239.211.112: seq=0 ttl=61 time=20.087 ms

--- www.baidu.com ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 20.087/20.087/20.087 ms

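We can see that:

box1 and box2 can reach each other by container name;

box3 cannot reach box1 or box2 by container name, and they cannot reach it;

all three containers can reach the public internet.

3.3 Docker network connect

A container that is already running can be attached to a network with docker network connect.

Continuing the experiment from 3.2, we attach box3 to the my-bridge network and then re-test its connectivity to box1 and box2:

3.3.1 Attach box3 to the my-bridge network

docker network connect my-bridge box3

3.3.2 Check which containers are now in the my-bridge network

# command
docker network inspect my-bridge

# output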
"Containers": {
    "5526023acd4197c9abf6f17c9a59585322cd2154b70238872083e48b7012ab4e": {
        "Name": "box3",
        "EndpointID": "b7fe8d2697ab20ec5a92886768e33345dd7d161488ac30488e8800e4b0fb0166",
        "MacAddress": "02:42:ac:12:00:04",
        "IPv4Address": "172.18.0.4/16",
        "IPv6Address": ""
    },
    "70b929cee25b665ff0d55cd4e979fcf8dd21190c6200d49af0f2bef07efb723e": {
        "Name": "box2",
        "EndpointID": "6c19d556751f3e745f947cc0b0e2995312b6712055fac4d79a52114b6f21ffffd2",
        "MacAddress": "02:42:ac:12:00:03",
        "IPv4Address": "172.18.0.3/16",
        "IPv6Address": ""
    },
    "92d63195660f1e666028294144ffffd65be06e38827debaad1ebd90ce9c79dbaa7": {
        "Name": "box1",
        "EndpointID": "9b28b2e724ff19e1cbce1fb05afe4a439f51735c0a2a7671c8002fa58c087199",
        "MacAddress": "02:42:ac:12:00:02",
        "IPv4Address": "172.18.0.2/16",
        "IPv6Address": ""
    }
}

box3 has now joined the my-bridge network.

3.3.3 Test connectivity by container name between box3 and box1, box2

[root@localhost vagrant]# docker exec box3 ping box1
PING box1 (172.18.0.2): 56 data bytes
64 bytes from 172.18.0.2: seq=0 ttl=64 time=0.075 ms
^C
[root@localhost vagrant]# docker exec box3 ping box1 -c 1
PING box1 (172.18.0.2): 56 data bytes
64 bytes from 172.18.0.2: seq=0 ttl=64 time=0.066 ms

--- box1 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.066/0.066/0.066 ms
[root@localhost vagrant]# docker exec box3 ping box2 -c 1
PING box2 (172.18.0.3): 56 data bytes
64 bytes from 172.18.0.3: seq=0 ttl=64 time=0.069 ms

--- box2 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.069/0.069/0.069 ms
[root@localhost vagrant]# docker exec box1 ping box3 -c 1
PING box3 (172.18.0.4): 56 data bytes
64 bytes from 172.18.0.4: seq=0 ttl=64 time=0.050 ms

--- box3 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.050/0.050/0.050 ms
[root@localhost vagrant]# docker exec box2 ping box3 -c 1
PING box3 (172.18.0.4): 56 data bytes
64 bytes from 172.18.0.4: seq=0 ttl=64 time=0.052 ms

--- box3 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.052/0.052/0.052 ms

box1, box2, and box3 can now all reach each other by container name.
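The pairwise pings in 3.2.3 and 3.3.3 can be generated with a loop instead of being typed one by one. This is a minimal sketch under stated assumptions: `run`, `APPLY`, and `check_all` are illustrative names not found in the original, and the containers are assumed to be running and attached to the same user-defined network. By default the commands are only printed; `APPLY=1` executes them.

```shell
#!/bin/sh
# Sketch: ping every container from every other container by name.
# `run` and APPLY are illustrative helpers; commands are printed unless
# APPLY=1 is set, in which case they are executed through docker.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# check_all NAME...: one ping per ordered pair of distinct containers
check_all() {
    for src in "$@"; do
        for dst in "$@"; do
            [ "$src" = "$dst" ] && continue
            run docker exec "$src" ping "$dst" -c 1
        done
    done
}

check_all box1 box2 box3
```

For three containers this yields six commands, the same ordered pairs that are tested by hand above.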

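4. Using an overlay network for container communication across Docker nodes

4.1 The problem of container communication across Docker nodes

In production, containers on different hosts often need to reach each other, as shown in the following figure:

[Figure: containers box_o_1 and box_o_2 on two separate Docker hosts]

Because box_o_1 and box_o_2 each live in the Docker instance of their own machine, letting them reach each other would require configuring routes layer by layer so that packets can travel between the two.

That is cumbersome to manage and to scale. Docker addresses this problem with overlay networks.

4.2 Overlay networks and VXLAN

In short, an overlay encapsulates a packet once more on top of an IP packet. Taking the figure in 4.1 as an example, the overlay lets data first travel over the conventional network from the node on the left to the node on the right; after the right-hand node decapsulates it, the payload is itself a standard network packet, which can then be delivered to the box_o_2 container.

VXLAN is the core technology of overlay networks. The following is quoted from "Overlay: VXLAN Architecture" (Overlay之VXLAN架構):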

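VXLAN: VXLAN encapsulates Ethernet frames in UDP packets for tunneled transport. The UDP destination port is a well-known port, and the source port can be assigned per flow; the standard 5-tuple helps with load sharing as packets are forwarded through the IP network. The isolation identifier is 24 bits long. Unknown-destination, broadcast, and multicast traffic is encapsulated and forwarded as multicast.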
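To make the numbers in the quote concrete, here is a small worked calculation. It is a sketch under stated assumptions: an IPv4 underlay without VLAN tags and an underlay IP MTU of 1500 bytes, neither of which is stated in the original. The header sizes are the standard ones for Ethernet, IPv4 without options, UDP, and VXLAN.

```shell
#!/bin/sh
# Sizes in bytes of the headers VXLAN places in front of the original frame
# (assumption: IPv4 underlay, no VLAN tag).
outer_eth=14
outer_ipv4=20
outer_udp=8
vxlan_hdr=8
inner_eth=14          # Ethernet header of the encapsulated frame

vni_bits=24           # width of the isolation identifier
segments=$((1 << vni_bits))

# Extra bytes on the wire for each encapsulated frame.
overhead=$((outer_eth + outer_ipv4 + outer_udp + vxlan_hdr))

# Largest inner IP packet that fits in one underlay IP packet
# (assumption: underlay IP MTU of 1500 bytes).
underlay_mtu=1500
inner_mtu=$((underlay_mtu - outer_ipv4 - outer_udp - vxlan_hdr - inner_eth))

echo "segments=$segments overhead=$overhead inner_mtu=$inner_mtu"
```

A 24-bit identifier therefore allows about 16.7 million isolated segments, compared with 4096 for a 12-bit VLAN ID, at the cost of 50 extra bytes per frame.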

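The VXLAN packet structure is shown below; the figure is taken from "QinQ vs VLAN vs VXLAN":

[Figure: VXLAN packet structure]

4.3 Using overlay networks in Docker

4.3.1 Goal and prerequisites

For this experiment we need:

Two machines with Docker installed. Here both run CentOS 7.4, with the IPs 192.168.33.10 and 192.168.33.11;

The two Docker machines must be able to reach each other.

The goal is to let two containers, one on each machine, reach each other.

4.3.2 etcd

If we try to create an overlay network on one of the machines, we get the following error: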

[vagrant@192-168-33-10 ~]$ docker network create -d overlay my-overlay
Error response from daemon: This node is not a swarm manager. Use "docker swarm init" or "docker swarm join" to connect this node to swarm and try again.

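The message tells us that an overlay network cannot be created in standalone mode. To meet this requirement we could use a Docker orchestration tool such as swarm.

Swarm is not the focus of this post, however. Instead, we can point docker at a distributed key-value store when it starts, which lets us experiment with overlay networks.

etcd is a distributed key-value store. Its GitHub repository is coreos/etcd ("Distributed reliable key-value store for the most critical data of a distributed system").

For etcd installation details, see another post: "Installation Notes for etcd, a Distributed Key-Value Store" (分布式 key-value 存儲系統 etcd 的安裝備忘). The installation commands are given directly here:

# on both machines: download and extract etcd, then enter the etcd directory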

wget https://github.com/coreos/etcd/releases/download/v3.3.8/etcd-v3.3.8-linux-amd64.tar.gz
tar -zvxf etcd-v3.3.8-linux-amd64.tar.gz
cd etcd-v3.3.8-linux-amd64

# run on 192.168.33.10

nohup ./etcd --name my-etcd-1 \
--listen-client-urls http://192.168.33.10:2379 \
--advertise-client-urls http://192.168.33.10:2379 \
--listen-peer-urls http://192.168.33.10:2380 \
--initial-advertise-peer-urls http://192.168.33.10:2380 \
--initial-cluster my-etcd-1=http://192.168.33.10:2380,my-etcd-2=http://192.168.33.11:2380 \
--initial-cluster-token my-etcd-token \
--initial-cluster-state new \
>/dev/null 2>&1 &

# run on 192.168.33.11

nohup ./etcd --name my-etcd-2 \
--listen-client-urls http://192.168.33.11:2379 \
--advertise-client-urls http://192.168.33.11:2379 \
--listen-peer-urls http://192.168.33.11:2380 \
--initial-advertise-peer-urls http://192.168.33.11:2380 \
--initial-cluster my-etcd-1=http://192.168.33.10:2380,my-etcd-2=http://192.168.33.11:2380 \
--initial-cluster-token my-etcd-token \
--initial-cluster-state new \
>/dev/null 2>&1 &

Finally, check that the cluster is up:

[vagrant@192-168-33-10 etcd-v3.3.8-linux-amd64]$ ./etcdctl --endpoints http://192.168.33.10:2379 cluster-health
member 42ab269b4f75b118 is healthy: got healthy result from http://192.168.33.11:2379
member 7118e8ab00eced36 is healthy: got healthy result from http://192.168.33.10:2379
cluster is healthy

Note: after starting etcd, it is advisable to reconnect over ssh, to avoid some puzzling etcd error messages.
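The two etcd invocations in 4.3.2 differ only in the node name and IP, and both must pass the same --initial-cluster value. A minimal sketch for building that value from name=ip pairs follows; `initial_cluster` is an illustrative helper, and it assumes every member uses peer port 2380 over plain http, as in this article.

```shell
#!/bin/sh
# initial_cluster NAME=IP...: print the --initial-cluster value for etcd.
# Illustrative helper; assumes peer port 2380 and http for every member.
initial_cluster() {
    result=
    for pair in "$@"; do
        name=${pair%%=*}   # text before the first "="
        ip=${pair#*=}      # text after the first "="
        # prepend a comma only when result is already non-empty
        result="${result:+$result,}$name=http://$ip:2380"
    done
    printf '%s\n' "$result"
}

initial_cluster my-etcd-1=192.168.33.10 my-etcd-2=192.168.33.11
```

Generating the string once and reusing it on every node avoids the mismatch that can occur when the long value is edited by hand on each machine.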

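4.3.3 Restart docker with etcd

Next, restart docker and point it at the distributed store provided by etcd:

First, stop docker on both machines:

systemctl stop docker

Then start docker again. Note that this must be done after etcd is running. The --cluster-store and --cluster-advertise options used below apply to the Docker 18.03 of this article; later Docker releases deprecated and then removed them.

# run on 192.168.33.10: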
sudo /usr/bin/dockerd -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --cluster-store=etcd://192.168.33.10:2379 --cluster-advertise=192.168.33.10:2375 &

# run on 192.168.33.11:
sudo /usr/bin/dockerd -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --cluster-store=etcd://192.168.33.11:2379 --cluster-advertise=192.168.33.11:2375 &

4.3.4 Create the overlay network

Now create the overlay network again.

Run on 192.168.33.10:

[vagrant@192-168-33-10 etcd-v3.3.8-linux-amd64]$ docker network create -d overlay my-overlay
989fbe070eed940a0c3ec4182e7d04413a16eb9e33547b0d88002a7ec5138a07

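Because docker was started with a distributed store, we do not need to create this overlay network again on 192.168.33.11; the creation has already been synchronized between the nodes:

# the network created on 192.168.33.10 is already visible on 192.168.33.11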

[vagrant@192-168-33-11 etcd-v3.3.8-linux-amd64]$ docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
16498c285808        bridge              bridge              local
b5cb83566f18        host                host                local
989fbe070eed        my-overlay          overlay             global
80fe629bd0c3        none                null                local

To confirm this, we can look at the entry stored in etcd:

[vagrant@192-168-33-11 etcd-v3.3.8-linux-amd64]$ ./etcdctl --endpoints http://192.168.33.11:2379 ls /docker/network/v1.0/network
/docker/network/v1.0/network/989fbe070eed940a0c3ec4182e7d04413a16eb9e33547b0d88002a7ec5138a07

The hash of this network is exactly the ID generated when we created it.

Note: this path may differ between docker versions.

4.3.5 Create containers and attach them to the my-overlay network

# start box1 on 192.168.33.10
[vagrant@192-168-33-10 ~]$ docker run -d --name box1 --network my-overlay busybox /bin/sh -c "while true; do sleep 3600; done"

# start box2 on 192.168.33.11
[vagrant@192-168-33-11 ~]$ docker run -d --name box2 --network my-overlay busybox /bin/sh -c "while true; do sleep 3600; done"

Next, inspect the containers in this network with the following command:

docker network inspect my-overlay

"Containers": {
    "46397d0f23812e4252858b8e28c76f6fe5ecc34a95389af0abb22473058cd930": {
        "Name": "box1",
        "EndpointID": "b9826147f5b9bdd9e504ecd5860caf85d52151da1c03d47a9627403c55099b2e",
        "MacAddress": "",
        "IPv4Address": "10.0.0.2/24",
        "IPv6Address": ""
    },
    "ep-740ffaf6b455a7f3214ac83d9452d2229f1b19b847c4fba478a1b3650af97927": {
        "Name": "box2",
        "EndpointID": "740ffaf6b455a7f3214ac83d9452d2229f1b19b847c4fba478a1b3650af97927",
        "MacAddress": "",
        "IPv4Address": "10.0.0.3/24",
        "IPv6Address": ""
    }
}

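Even though the two containers are on different hosts, both appear in the output. Moreover, their IPs do not collide, which is also a result of the distributed store.

4.3.6 Connectivity test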

192.168.33.10 - box1(10.0.0.2) -> 192.168.33.11 - box2(10.0.0.3)

[vagrant@192-168-33-10 ~]$ docker exec box1 ping box2 -c 4
PING box2 (10.0.0.3): 56 data bytes
64 bytes from 10.0.0.3: seq=0 ttl=64 time=0.785 ms
64 bytes from 10.0.0.3: seq=1 ttl=64 time=0.643 ms
64 bytes from 10.0.0.3: seq=2 ttl=64 time=0.601 ms
64 bytes from 10.0.0.3: seq=3 ttl=64 time=0.638 ms

--- box2 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.601/0.666/0.785 ms
[vagrant@192-168-33-10 ~]$ docker exec box1 ping 10.0.0.3 -c 4
PING 10.0.0.3 (10.0.0.3): 56 data bytes
64 bytes from 10.0.0.3: seq=0 ttl=64 time=1.020 ms
64 bytes from 10.0.0.3: seq=1 ttl=64 time=0.690 ms
64 bytes from 10.0.0.3: seq=2 ttl=64 time=1.003 ms
64 bytes from 10.0.0.3: seq=3 ttl=64 time=0.918 ms

--- 10.0.0.3 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.690/0.907/1.020 ms

192.168.33.11 - box2(10.0.0.3) -> 192.168.33.10 - box1(10.0.0.2)

[vagrant@192-168-33-11 ~]$ docker exec box2 ping box1 -c 4
PING box1 (10.0.0.2): 56 data bytes
64 bytes from 10.0.0.2: seq=0 ttl=64 time=0.593 ms
64 bytes from 10.0.0.2: seq=1 ttl=64 time=0.691 ms
64 bytes from 10.0.0.2: seq=2 ttl=64 time=0.690 ms
64 bytes from 10.0.0.2: seq=3 ttl=64 time=0.671 ms

--- box1 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.593/0.661/0.691 ms
[vagrant@192-168-33-11 ~]$ docker exec box2 ping 10.0.0.2 -c 4
PING 10.0.0.2 (10.0.0.2): 56 data bytes
64 bytes from 10.0.0.2: seq=0 ttl=64 time=0.987 ms
64 bytes from 10.0.0.2: seq=1 ttl=64 time=0.699 ms
64 bytes from 10.0.0.2: seq=2 ttl=64 time=0.769 ms
64 bytes from 10.0.0.2: seq=3 ttl=64 time=1.054 ms

--- 10.0.0.2 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.699/0.877/1.054 ms

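As shown, an overlay network lets docker containers on different machines reach each other.

Below, we extend the earlier figure to show the network topology of this experiment:

[Figure: network topology of the overlay experiment]

References

"Linux Network Namespaces" (linux 網絡命名空間 Network namespaces), CSDN blog

"The veth Linux Virtual Network Device" (Linux虛擬網絡設備之veth), Linux程序員, SegmentFault

"Overlay: VXLAN Architecture" (Overlay之VXLAN架構), CSDN blog

QinQ vs VLAN vs VXLAN

"Installation Notes for etcd, a Distributed Key-Value Store" (分布式 key-value 存儲系統 etcd 的安裝備忘), DB.Reid, SegmentFault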

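Copyright belongs to the author; do not reproduce without permission. If this article violates any rules, you can contact the administrator to have it removed.

When reposting, please cite this article's address: http://specialneedsforspecialkids.com/yun/27370.html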
