k8s與aws--在ec2中部署高可用k8s1.13.1集群(ipvs,cloud-provider

mushang 發(fā)布于2019-07-01 16:56 / 543人閱讀

摘要：前言達到了生產(chǎn)可用，利用部署一個高可用集群簡單不少。部署高可用集群為創(chuàng)建負載均衡這里選擇的。執(zhí)行加入操作驗證部署部署的創(chuàng)建為打標簽標記子網(wǎng)以允許入口控制器自動發(fā)現(xiàn)用于的子網(wǎng)。

前言

kubeadm1.13達到了生產(chǎn)可用，利用kubeadm部署一個高可用集群簡單不少。但是竟然部署在aws上，就要啟用cloud-provider=aws，深度結(jié)合iaas層資源。主要是利用aws的elb和ebs等。相關(guān)的資料還是比較少的，已經(jīng)有的一些文檔要不是out了，要不就是內(nèi)容不全，還有很多文章只是弄了一個demo的水平，完全沒法上生產(chǎn)，部署過程破費周折。

組件版本和集群環(huán)境 集群組件和版本

Kubernetes 1.13.1

Docker 18.06.0-ce

Etcd 3.2.24

Calico 3.4.0 網(wǎng)絡(luò)

集群機器

master:

172.31.22.208

172.31.17.44

172.31.22.135

node:

172.31.29.58

etcd集群非容器部署，systemd守護

三臺master主機配置ssh免密登錄

主機設(shè)置 關(guān)閉防火墻

systemctl stop firewalld
systemctl disable firewalld

禁用selinux

# Set SELinux in permissive mode (effectively disabling it)
setenforce 0
sed -i "s/^SELINUX=enforcing$/SELINUX=permissive/" /etc/selinux/config

啟用net.bridge.bridge-nf-call-ip6tables和net.bridge.bridge-nf-call-iptables

cat <  /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
vm.swappiness=0
EOF
sysctl --system

禁用swap

swapoff -a

修改/etc/fstab 文件，注釋掉 SWAP 的自動掛載.
使用free -m確認swap已經(jīng)關(guān)閉。

加載ipvs相關(guān)模塊

at > /etc/sysconfig/modules/ipvs.modules <
上面腳本創(chuàng)建了的/etc/sysconfig/modules/ipvs.modules文件，保證在節(jié)點重啟后能自動加載所需模塊。 使用lsmod | grep -e ip_vs -e nf_conntrack_ipv4命令查看是否已經(jīng)正確加載所需的內(nèi)核模塊。
接下來還需要確保各個節(jié)點上已經(jīng)安裝了ipset軟件包yum install ipset。 為了便于查看ipvs的代理規(guī)則，最好安裝一下管理工具ipvsadm yum install ipvsadm。
賦予IAM權(quán)限
Master Policy
  {
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeTags",
        "ec2:DescribeInstances",
        "ec2:DescribeRegions",
        "ec2:DescribeRouteTables",
        "ec2:DescribeSecurityGroups",
        "ec2:DescribeSubnets",
        "ec2:DescribeVolumes",
        "ec2:CreateSecurityGroup",
        "ec2:CreateTags",
        "ec2:CreateVolume",
        "ec2:ModifyInstanceAttribute",
        "ec2:ModifyVolume",
        "ec2:AttachVolume",
        "ec2:AuthorizeSecurityGroupIngress",
        "ec2:CreateRoute",
        "ec2:DeleteRoute",
        "ec2:DeleteSecurityGroup",
        "ec2:DeleteVolume",
        "ec2:DetachVolume",
        "ec2:RevokeSecurityGroupIngress",
        "ec2:DescribeVpcs",
        "elasticloadbalancing:AddTags",
        "elasticloadbalancing:AttachLoadBalancerToSubnets",
        "elasticloadbalancing:ApplySecurityGroupsToLoadBalancer",
        "elasticloadbalancing:CreateLoadBalancer",
        "elasticloadbalancing:CreateLoadBalancerPolicy",
        "elasticloadbalancing:CreateLoadBalancerListeners",
        "elasticloadbalancing:ConfigureHealthCheck",
        "elasticloadbalancing:DeleteLoadBalancer",
        "elasticloadbalancing:DeleteLoadBalancerListeners",
        "elasticloadbalancing:DescribeLoadBalancers",
        "elasticloadbalancing:DescribeLoadBalancerAttributes",
        "elasticloadbalancing:DetachLoadBalancerFromSubnets",
        "elasticloadbalancing:DeregisterInstancesFromLoadBalancer",
        "elasticloadbalancing:ModifyLoadBalancerAttributes",
        "elasticloadbalancing:RegisterInstancesWithLoadBalancer",
        "elasticloadbalancing:SetLoadBalancerPoliciesForBackendServer",
        "elasticloadbalancing:AddTags",
        "elasticloadbalancing:CreateListener",
        "elasticloadbalancing:CreateTargetGroup",
        "elasticloadbalancing:DeleteListener",
        "elasticloadbalancing:DeleteTargetGroup",
        "elasticloadbalancing:DescribeListeners",
        "elasticloadbalancing:DescribeLoadBalancerPolicies",
        "elasticloadbalancing:DescribeTargetGroups",
        "elasticloadbalancing:DescribeTargetHealth",
        "elasticloadbalancing:ModifyListener",
        "elasticloadbalancing:ModifyTargetGroup",
        "elasticloadbalancing:RegisterTargets",
        "elasticloadbalancing:SetLoadBalancerPoliciesOfListener",
        "iam:CreateServiceLinkedRole",
        "kms:DescribeKey"
      ],
      "Resource": [
        "*"
      ]
    },
  ]
}
Node Policy
  {
      "Version": "2012-10-17",
      "Statement": [
          {
              "Effect": "Allow",
              "Action": [
                  "ec2:DescribeInstances",
                  "ec2:DescribeRegions",
                  "ecr:GetAuthorizationToken",
                  "ecr:BatchCheckLayerAvailability",
                  "ecr:GetDownloadUrlForLayer",
                  "ecr:GetRepositoryPolicy",
                  "ecr:DescribeRepositories",
                  "ecr:ListImages",
                  "ecr:BatchGetImage",
                  "sts:AssumeRole"
              ],
              "Resource": "*"
          } 
      ]
  }
tag標簽
需要為ec2實例, route table, subnet，安全組 打下面的標簽：
kubernetes.io/cluster/="owned"
cluster-name命名規(guī)范：
k8s-{region}-{env}-{num}
例如：
k8s-usa-west-2-test-1
安裝docker和kubeadm
安裝指定版本docker
安裝docker
yum install docker-18.06.1ce-5.amzn2 -y

systemctl enable docker
更改docker Root Dir 目錄
將/var/lib/dokcer 配置到/data/docker,確保/data是另外掛載的數(shù)據(jù)盤
更改 ‘/etc/sysconfig/docker’ 文件:
OPTIONS="--default-ulimit nofile=1024:4096 -g /data/docker" 
更改 /etc/docker/daemon.json:
cat >  /etc/docker/daemon.json <
驗證
[root@ip-172-31-22-208 ~]# ls -lrt /var/lib/docker
總用量 0
[root@ip-172-31-22-208 ~]# ls -lrt /data/docker/
總用量 0
drwx------ 3 root root 20 12月 11 10:44 containerd
drwx------ 2 root root  6 12月 11 10:44 tmp
drwx------ 2 root root  6 12月 11 10:44 runtimes
drwx------ 4 root root 32 12月 11 10:44 plugins
drwx------ 2 root root  6 12月 11 10:44 containers
drwx------ 2 root root 25 12月 11 10:44 volumes
drwx------ 3 root root 22 12月 11 10:44 image
drwx------ 2 root root  6 12月 11 10:44 trust
drwxr-x--- 3 root root 19 12月 11 10:44 network
drwx------ 3 root root 40 12月 11 10:44 overlay2
drwx------ 2 root root  6 12月 11 10:44 swarm
drwx------ 2 root root 24 12月 11 10:44 builder
drwx------ 4 root root 92 12月 11 10:44 buildkit
重啟docker 服務(wù)
systemctl start docker
驗證docker：
root@ip-172-31-22-208 ~]# docker info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 0
Server Version: 18.06.1-ce
Storage Driver: overlay2
 Backing Filesystem: xfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 468a545b9edcd5932818eb9de8e72413e616e86e
runc version: 69663f0bd4b60df09991c08812a60108003fa340
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.14.70-72.55.amzn2.x86_64
Operating System: Amazon Linux 2
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 15.67GiB
Name: ip-172-31-22-208.us-west-2.compute.internal
ID: CG7S:P5XD:FLU6:MULI:2TSI:OLRY:A6EX:SM3D:FXNB:CMEQ:MU6R:XSCW
Docker Root Dir: /data/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
安裝kubeadm等
增加k8s repo
cat < /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=0
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
exclude=kube*
EOF
安裝kubeadm, kubelet, kubectl
yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes

systemctl enable kubelet && systemctl start kubelet
驗證kubeadm版本
[root@ip-172-31-22-208 ~]# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.0", GitCommit:"ddf47ac13c1a9483ea035a79cd7c10005ff21a6d", GitTreeState:"clean", BuildDate:"2018-12-03T21:02:01Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}
更新 kubelet config
設(shè)置節(jié)點預(yù)留資源 ,同時為了支持 cloud-provider， 首先需要在 kubelet 的配置里做相應(yīng)修改，為 /etc/sysconfig/kubelet 添加 KUBELET_EXTRA_ARGS:
KUBELET_EXTRA_ARGS=--cloud-provider=aws 
預(yù)留資源
設(shè)置cgroups
mkdir -p /sys/fs/cgroup/cpu/system.slice/kubelet.service
mkdir -p /sys/fs/cgroup/cpuacct/system.slice/kubelet.service
mkdir -p /sys/fs/cgroup/cpuset/system.slice/kubelet.service
mkdir -p /sys/fs/cgroup/memory/system.slice/kubelet.service
mkdir -p /sys/fs/cgroup/devices/system.slice/kubelet.service
mkdir -p /sys/fs/cgroup/blkio/system.slice/kubelet.service
mkdir -p /sys/fs/cgroup/hugetlb/system.slice/kubelet.service
mkdir -p /sys/fs/cgroup/systemd/system.slice/kubelet.service
在/var/lib/kubelet/config.yaml中添加如下：
enforceNodeAllocatable:
- pods
- kube-reserved
- system-reserved
kubeReservedCgroup: /system.slice/kubelet.service
systemReservedCgroup: /system.slice
systemReserved:
  cpu: 500m
  memory: 1Gi
  ephemeral-storage: 5Gi
kubeReserved:
  cpu: 500m
  memory: 1Gi
  ephemeral-storage: 5Gi
部署高可用 etcd 集群
kuberntes 系統(tǒng)使用 etcd 存儲所有數(shù)據(jù)，本文檔介紹部署一個三節(jié)點高可用 etcd 集群的步驟，這三個節(jié)點復(fù)用 kubernetes master 機器，分別命名為etcd-host0、etcd-host1、etcd-host2：

infra0: 172.31.22.208
infra1: 172.31.17.44
infra2: 172.31.22.135

使用的變量
本文檔用到的變量定義如下：
export NODE_NAME=infra0 # 當前部署的機器名稱(隨便定義，只要能區(qū)分不同機器即可)
export NODE_IP=172.31.22.208 # 當前部署的機器 IP
export NODE_IPS="172.31.22.208 172.31.17.44 172.31.22.135" # etcd 集群所有機器 IP
# etcd 集群間通信的IP和端口
export ETCD_NODES=infra0=https://172.31.22.208:2380,infra1=https://172.31.17.44:2380,infra2=https://172.31.22.135:2380

export NODE_NAME=infra1 # 當前部署的機器名稱(隨便定義，只要能區(qū)分不同機器即可)
export NODE_IP=172.31.17.44 # 當前部署的機器 IP
export NODE_IPS="172.31.22.208 172.31.17.44 172.31.22.135" # etcd 集群所有機器 IP
# etcd 集群間通信的IP和端口
export ETCD_NODES=infra0=https://172.31.22.208:2380,infra1=https://172.31.17.44:2380,infra2=https://172.31.22.135:2380

export NODE_NAME=infra2 # 當前部署的機器名稱(隨便定義，只要能區(qū)分不同機器即可)
export NODE_IP=172.31.22.135 # 當前部署的機器 IP
export NODE_IPS="172.31.22.208 172.31.17.44 172.31.22.135" # etcd 集群所有機器 IP
# etcd 集群間通信的IP和端口
export ETCD_NODES=infra0=https://172.31.22.208:2380,infra1=https://172.31.17.44:2380,infra2=https://172.31.22.135:2380

下載二進制文件
到 https://github.com/coreos/etcd/releases 頁面下載最新版本的二進制文件：
wget https://github.com/coreos/etcd/releases/download/v3.2.24/etcd-v3.2.24-linux-amd64.tar.gz
tar -xvf etcd-v3.2.24-linux-amd64.tar.gz
mv etcd-v3.2.24-linux-amd64/etcd* /usr/bin

利用kubeadm創(chuàng)建秘鑰和證書
為kubeadm創(chuàng)建配置文件
使用以下腳本為每個將在其上運行etcd成員的主機生成一個kubeadm配置文件。
# Update HOST0, HOST1, and HOST2 with the IPs or resolvable names of your hosts
export HOST0=172.31.22.208
export HOST1=172.31.17.44
export HOST2=172.31.22.135

# Create temp directories to store files that will end up on other hosts.
mkdir -p /tmp/${HOST0}/ /tmp/${HOST1}/ /tmp/${HOST2}/

ETCDHOSTS=(${HOST0} ${HOST1} ${HOST2})
NAMES=("infra0" "infra1" "infra2")

for i in "${!ETCDHOSTS[@]}"; do
HOST=${ETCDHOSTS[$i]}
NAME=${NAMES[$i]}
cat << EOF > /tmp/${HOST}/kubeadmcfg.yaml
apiVersion: "kubeadm.k8s.io/v1beta1"
kind: ClusterConfiguration
etcd:
    local:
        serverCertSANs:
        - "${HOST}"
        peerCertSANs:
        - "${HOST}"
        extraArgs:
            initial-cluster: infra0=https://${ETCDHOSTS[0]}:2380,infra1=https://${ETCDHOSTS[1]}:2380,infra2=https://${ETCDHOSTS[2]}:2380
            initial-cluster-state: new
            name: ${NAME}
            listen-peer-urls: https://${HOST}:2380
            listen-client-urls: https://${HOST}:2379
            advertise-client-urls: https://${HOST}:2379
            initial-advertise-peer-urls: https://${HOST}:2380
EOF
done
生成證書頒發(fā)機構(gòu)
執(zhí)行如下命令：
kubeadm init phase certs etcd-ca
生成下面兩個文件：

/etc/kubernetes/pki/etcd/ca.crt
/etc/kubernetes/pki/etcd/ca.key

為每個成員創(chuàng)建證書
kubeadm init phase certs etcd-server --config=/tmp/${HOST2}/kubeadmcfg.yaml
kubeadm init phase certs etcd-peer --config=/tmp/${HOST2}/kubeadmcfg.yaml
kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST2}/kubeadmcfg.yaml
kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST2}/kubeadmcfg.yaml
cp -R /etc/kubernetes/pki /tmp/${HOST2}/
# cleanup non-reusable certificates
find /etc/kubernetes/pki -not -name ca.crt -not -name ca.key -type f -delete

kubeadm init phase certs etcd-server --config=/tmp/${HOST1}/kubeadmcfg.yaml
kubeadm init phase certs etcd-peer --config=/tmp/${HOST1}/kubeadmcfg.yaml
kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST1}/kubeadmcfg.yaml
kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST1}/kubeadmcfg.yaml
cp -R /etc/kubernetes/pki /tmp/${HOST1}/
find /etc/kubernetes/pki -not -name ca.crt -not -name ca.key -type f -delete

kubeadm init phase certs etcd-server --config=/tmp/${HOST0}/kubeadmcfg.yaml
kubeadm init phase certs etcd-peer --config=/tmp/${HOST0}/kubeadmcfg.yaml
kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST0}/kubeadmcfg.yaml
kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST0}/kubeadmcfg.yaml
# No need to move the certs because they are for HOST0

# clean up certs that should not be copied off this host
find /tmp/${HOST2} -name ca.key -type f -delete
find /tmp/${HOST1} -name ca.key -type f -delete
拷貝證書到對應(yīng)的主機上
USER=root
CONTROL_PLANE_IPS="172.31.17.44 172.31.22.135"
for host in ${CONTROL_PLANE_IPS}; do
    scp -r /tmp/${host}/pki "${USER}"@$host:
done
$ 例如HOST0上所需文件的完整列表是：
/etc/kubernetes/pki
├── apiserver-etcd-client.crt
├── apiserver-etcd-client.key
└── etcd
    ├── ca.crt
    ├── ca.key
    ├── healthcheck-client.crt
    ├── healthcheck-client.key
    ├── peer.crt
    ├── peer.key
    ├── server.crt
    └── server.key
其他兩臺主機如上。
創(chuàng)建 etcd 的 systemd unit 文件
mkdir -p /var/lib/etcd  # 必須先創(chuàng)建工作目錄
cat > etcd.service <

指定 etcd 的工作目錄和數(shù)據(jù)目錄為 /var/lib/etcd，需在啟動服務(wù)前創(chuàng)建這個目錄；
為了保證通信安全，需要指定 etcd 的公私鑰(cert-file和key-file)、Peers 通信的公私鑰和 CA 證書(peer-cert-file、peer-key-file、peer-trusted-ca-file)、客戶端的CA證書（trusted-ca-file）；

--initial-cluster-state 值為 new 時，--name 的參數(shù)值必須位于 --initial-cluster 列表中；

啟動 etcd 服務(wù)
mv etcd.service /etc/systemd/system/
systemctl daemon-reload
systemctl enable etcd
systemctl start etcd
systemctl status etcd
$
最先啟動的 etcd 進程會卡住一段時間，等待其它節(jié)點上的 etcd 進程加入集群，為正常現(xiàn)象。
在所有的 etcd 節(jié)點重復(fù)上面的步驟，直到所有機器的 etcd 服務(wù)都已啟動。
驗證服務(wù)
部署完 etcd 集群后，在任一 etcd 集群節(jié)點上執(zhí)行如下命令：
for ip in ${NODE_IPS}; do
  ETCDCTL_API=3 /usr/bin/etcdctl 
  --endpoints=https://${ip}:2379  
  --cacert=/etc/kubernetes/pki/etcd/ca.crt 
  --cert=/etc/kubernetes/pki/etcd/server.crt 
  --key=/etc/kubernetes/pki/etcd/server.key 
  endpoint health; 
  done
預(yù)期結(jié)果：
https://172.31.22.208:2379 is healthy: successfully committed proposal: took = 1.543275ms
https://172.31.17.44:2379 is healthy: successfully committed proposal: took = 1.883033ms
https://172.31.22.135:2379 is healthy: successfully committed proposal: took = 2.026367ms
三臺 etcd 的輸出均為 healthy 時表示集群服務(wù)正常（忽略 warning 信息）。
部署高可用 master集群
為kube-apiserver創(chuàng)建tcp負載均衡
這里選擇aws的nlb。具體創(chuàng)建過程不再敘述。
創(chuàng)建結(jié)果nlb-sgt-k8sapiserver-test-4748f2f556591bb7.elb.us-west-2.amazonaws.com。
添加到變量
export LOAD_BALANCER_DNS=nlb-sgt-k8sapiserver-test-4748f2f556591bb7.elb.us-west-2.amazonaws.com
export ETCD_0_IP=172.31.22.208
export ETCD_1_IP=172.31.17.44
export ETCD_2_IP=172.31.22.135
創(chuàng)建 啟用aws cloud-provider
cat > kubeadm-config.yaml <
創(chuàng)建 不啟用aws cloud-provider
cat > kubeadm-config.yaml <
創(chuàng)建第一個master
執(zhí)行
kubeadm init --config=kubeadm-config.yaml
出現(xiàn)以下:
設(shè)置訪問證書：
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config

創(chuàng)建剩余master
復(fù)制證書
USER=root # customizable
CONTROL_PLANE_IPS="172.31.17.44 172.31.22.135"
for host in ${CONTROL_PLANE_IPS}; do
    scp /etc/kubernetes/pki/ca.crt "${USER}"@$host:
    scp /etc/kubernetes/pki/ca.key "${USER}"@$host:
    scp /etc/kubernetes/pki/sa.key "${USER}"@$host:
    scp /etc/kubernetes/pki/sa.pub "${USER}"@$host:
    scp /etc/kubernetes/pki/front-proxy-ca.crt "${USER}"@$host:
    scp /etc/kubernetes/pki/front-proxy-ca.key "${USER}"@$host:
    scp /etc/kubernetes/admin.conf "${USER}"@$host:
done
在剩余主機執(zhí)行：
USER=root # customizable
mv /${USER}/ca.crt /etc/kubernetes/pki/
mv /${USER}/ca.key /etc/kubernetes/pki/
mv /${USER}/sa.pub /etc/kubernetes/pki/
mv /${USER}/sa.key /etc/kubernetes/pki/
mv /${USER}/front-proxy-ca.crt /etc/kubernetes/pki/
mv /${USER}/front-proxy-ca.key /etc/kubernetes/pki/
mv /${USER}/admin.conf /etc/kubernetes/admin.conf
加入控制節(jié)點：
kubeadm join nlb-sgt-k8sapiserver-test-4748f2f556591bb7.elb.us-west-2.amazonaws.com:6443 --token u9hmb3.gwfozvsz90k3yt9g --discovery-token-ca-cert-hash sha256:24c354cce46de9c1eb1a8358b9ba064166e87cf6c011fecaae3350c3910c215a  --experimental-control-plane
忘記discovery-token-ca-cert-hash？
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed "s/^.* //"
部署calico網(wǎng)絡(luò)
檢查aws ec2
是否關(guān)閉了src/dst checks？
配置calicoctl
下載calicoctl
curl -O -L  https://github.com/projectcalico/calicoctl/releases/download/v3.4.0/calicoctl
chmod +x calicoctl
mv calicoctl /usr/bin/
配置calico config 文件
cat > /etc/calico/calicoctl.cfg <
使用到的變量
export ETCD_KEY=$(cat /etc/kubernetes/pki/etcd/server.key | base64 | tr -d "
")
export ETCD_CERT=$(cat /etc/kubernetes/pki/etcd/server.crt | base64 | tr -d "
")
export ETCD_CA=$(cat /etc/kubernetes/pki/etcd/ca.crt | base64 | tr -d "
")
創(chuàng)建calico.yml
cat > calico.yml < | base64 -w 0
  etcd-key: ${ETCD_KEY}
  etcd-cert: ${ETCD_CERT}
  etcd-ca: ${ETCD_CA}

---
# This manifest installs the calico/node container, as well
# as the Calico CNI plugins and network config on
# each master and worker node in a Kubernetes cluster.
kind: DaemonSet
apiVersion: extensions/v1beta1
metadata:
  name: calico-node
  namespace: kube-system
  labels:
    k8s-app: calico-node
spec:
  selector:
    matchLabels:
      k8s-app: calico-node
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        k8s-app: calico-node
      annotations:
        # This, along with the CriticalAddonsOnly toleration below,
        # marks the pod as a critical add-on, ensuring it gets
        # priority scheduling and that its resources are reserved
        # if it ever gets evicted.
        scheduler.alpha.kubernetes.io/critical-pod: ""
    spec:
      nodeSelector:
        beta.kubernetes.io/os: linux
      hostNetwork: true
      tolerations:
        # Make sure calico-node gets scheduled on all nodes.
        - effect: NoSchedule
          operator: Exists
        # Mark the pod as a critical add-on for rescheduling.
        - key: CriticalAddonsOnly
          operator: Exists
        - effect: NoExecute
          operator: Exists
      serviceAccountName: calico-node
      # Minimize downtime during a rolling upgrade or deletion; tell Kubernetes to do a "force
      # deletion": https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods.
      terminationGracePeriodSeconds: 0
      initContainers:
        # This container installs the Calico CNI binaries
        # and CNI network config file on each node.
        - name: install-cni
          image: quay.io/calico/cni:v3.4.0
          command: ["/install-cni.sh"]
          env:
            # Name of the CNI config file to create.
            - name: CNI_CONF_NAME
              value: "10-calico.conflist"
            # The CNI network config to install on each node.
            - name: CNI_NETWORK_CONFIG
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: cni_network_config
            # The location of the Calico etcd cluster.
            - name: ETCD_ENDPOINTS
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_endpoints
            # CNI MTU Config variable
            - name: CNI_MTU
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: veth_mtu
            # Prevents the container from sleeping forever.
            - name: SLEEP
              value: "false"
          volumeMounts:
            - mountPath: /host/opt/cni/bin
              name: cni-bin-dir
            - mountPath: /host/etc/cni/net.d
              name: cni-net-dir
            - mountPath: /calico-secrets
              name: etcd-certs
      containers:
        # Runs calico/node container on each Kubernetes node.  This
        # container programs network policy and routes on each
        # host.
        - name: calico-node
          image: quay.io/calico/node:v3.4.0
          env:
            # The location of the Calico etcd cluster.
            - name: ETCD_ENDPOINTS
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_endpoints
            # Location of the CA certificate for etcd.
            - name: ETCD_CA_CERT_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_ca
            # Location of the client key for etcd.
            - name: ETCD_KEY_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_key
            # Location of the client certificate for etcd.
            - name: ETCD_CERT_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_cert
            # Set noderef for node controller.
            - name: CALICO_K8S_NODE_REF
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            # Choose the backend to use.
            - name: CALICO_NETWORKING_BACKEND
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: calico_backend
            # Cluster type to identify the deployment type
            - name: CLUSTER_TYPE
              value: "k8s,bgp"
            # Auto-detect the BGP IP address.
            - name: IP
              value: "autodetect"
            # Enable IPIP
            - name: CALICO_IPV4POOL_IPIP
              value: "Always"
            # Set MTU for tunnel device used if ipip is enabled
            - name: FELIX_IPINIPMTU
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: veth_mtu
            # The default IPv4 pool to create on startup if none exists. Pod IPs will be
            # chosen from this range. Changing this value after installation will have
            # no effect. This should fall within --cluster-cidr.
            - name: CALICO_IPV4POOL_CIDR
              value: "192.168.0.0/16"
            # Disable file logging so `kubectl logs` works.
            - name: CALICO_DISABLE_FILE_LOGGING
              value: "true"
            # Set Felix endpoint to host default action to ACCEPT.
            - name: FELIX_DEFAULTENDPOINTTOHOSTACTION
              value: "ACCEPT"
            # Disable IPv6 on Kubernetes.
            - name: FELIX_IPV6SUPPORT
              value: "false"
            # Set Felix logging to "info"
            - name: FELIX_LOGSEVERITYSCREEN
              value: "info"
            - name: FELIX_HEALTHENABLED
              value: "true"
          securityContext:
            privileged: true
          resources:
            requests:
              cpu: 250m
          livenessProbe:
            httpGet:
              path: /liveness
              port: 9099
              host: localhost
            periodSeconds: 10
            initialDelaySeconds: 10
            failureThreshold: 6
          readinessProbe:
            exec:
              command:
              - /bin/calico-node
              - -bird-ready
              - -felix-ready
            periodSeconds: 10
          volumeMounts:
            - mountPath: /lib/modules
              name: lib-modules
              readOnly: true
            - mountPath: /run/xtables.lock
              name: xtables-lock
              readOnly: false
            - mountPath: /var/run/calico
              name: var-run-calico
              readOnly: false
            - mountPath: /var/lib/calico
              name: var-lib-calico
              readOnly: false
            - mountPath: /calico-secrets
              name: etcd-certs
      volumes:
        # Used by calico/node.
        - name: lib-modules
          hostPath:
            path: /lib/modules
        - name: var-run-calico
          hostPath:
            path: /var/run/calico
        - name: var-lib-calico
          hostPath:
            path: /var/lib/calico
        - name: xtables-lock
          hostPath:
            path: /run/xtables.lock
            type: FileOrCreate
        # Used to install CNI.
        - name: cni-bin-dir
          hostPath:
            path: /opt/cni/bin
        - name: cni-net-dir
          hostPath:
            path: /etc/cni/net.d
        # Mount in the etcd TLS secrets with mode 400.
        # See https://kubernetes.io/docs/concepts/configuration/secret/
        - name: etcd-certs
          secret:
            secretName: calico-etcd-secrets
            defaultMode: 0400
---

apiVersion: v1
kind: ServiceAccount
metadata:
  name: calico-node
  namespace: kube-system

---
# This manifest deploys the Calico Kubernetes controllers.
# See https://github.com/projectcalico/kube-controllers
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: calico-kube-controllers
  namespace: kube-system
  labels:
    k8s-app: calico-kube-controllers
  annotations:
    scheduler.alpha.kubernetes.io/critical-pod: ""
spec:
  # The controllers can only have a single active instance.
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      name: calico-kube-controllers
      namespace: kube-system
      labels:
        k8s-app: calico-kube-controllers
    spec:
      nodeSelector:
        beta.kubernetes.io/os: linux
      # The controllers must run in the host network namespace so that
      # it isn"t governed by policy that would prevent it from working.
      hostNetwork: true
      tolerations:
        # Mark the pod as a critical add-on for rescheduling.
        - key: CriticalAddonsOnly
          operator: Exists
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      serviceAccountName: calico-kube-controllers
      containers:
        - name: calico-kube-controllers
          image: quay.io/calico/kube-controllers:v3.4.0
          env:
            # The location of the Calico etcd cluster.
            - name: ETCD_ENDPOINTS
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_endpoints
            # Location of the CA certificate for etcd.
            - name: ETCD_CA_CERT_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_ca
            # Location of the client key for etcd.
            - name: ETCD_KEY_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_key
            # Location of the client certificate for etcd.
            - name: ETCD_CERT_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_cert
            # Choose which controllers to run.
            - name: ENABLED_CONTROLLERS
              value: policy,namespace,serviceaccount,workloadendpoint,node
          volumeMounts:
            # Mount in the etcd TLS secrets.
            - mountPath: /calico-secrets
              name: etcd-certs
          readinessProbe:
            exec:
              command:
              - /usr/bin/check-status
              - -r
      volumes:
        # Mount in the etcd TLS secrets with mode 400.
        # See https://kubernetes.io/docs/concepts/configuration/secret/
        - name: etcd-certs
          secret:
            secretName: calico-etcd-secrets
            defaultMode: 0400

---

apiVersion: v1
kind: ServiceAccount
metadata:
  name: calico-kube-controllers
  namespace: kube-system
---

# Include a clusterrole for the kube-controllers component,
# and bind it to the calico-kube-controllers serviceaccount.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: calico-kube-controllers
rules:
  # Pods are monitored for changing labels.
  # The node controller monitors Kubernetes nodes.
  # Namespace and serviceaccount labels are used for policy.
  - apiGroups:
      - ""
    resources:
      - pods
      - nodes
      - namespaces
      - serviceaccounts
    verbs:
      - watch
      - list
  # Watch for changes to Kubernetes NetworkPolicies.
  - apiGroups:
      - networking.k8s.io
    resources:
      - networkpolicies
    verbs:
      - watch
      - list
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: calico-kube-controllers
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: calico-kube-controllers
subjects:
- kind: ServiceAccount
  name: calico-kube-controllers
  namespace: kube-system
---
# Include a clusterrole for the calico-node DaemonSet,
# and bind it to the calico-node serviceaccount.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: calico-node
rules:
  # The CNI plugin needs to get pods, nodes, and namespaces.
  - apiGroups: [""]
    resources:
      - pods
      - nodes
      - namespaces
    verbs:
      - get
  - apiGroups: [""]
    resources:
      - endpoints
      - services
    verbs:
      # Used to discover service IPs for advertisement.
      - watch
      - list
  - apiGroups: [""]
    resources:
      - nodes/status
    verbs:
      # Needed for clearing NodeNetworkUnavailable flag.
      - patch
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: calico-node
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: calico-node
subjects:
- kind: ServiceAccount
  name: calico-node
  namespace: kube-system
---

EOF
部署calico
kubectl apply -f calico.yml
設(shè)置ippool
執(zhí)行：
calicoctl apply -f - << EOF
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: default-ipv4-ippool
spec:
  cidr: 192.168.0.0/16
  ipipMode: CrossSubnet
  natOutgoing: true
EOF
部署node節(jié)點
執(zhí)行主機設(shè)置的所有項。
執(zhí)行加入操作：
kubeadm join nlb-sgt-k8sapiserver-test-4748f2f556591bb7.elb.us-west-2.amazonaws.com:6443 --token u9hmb3.gwfozvsz90k3yt9g --discovery-token-ca-cert-hash sha256:24c354cce46de9c1eb1a8358b9ba064166e87cf6c011fecaae3350c3910c215a

驗證：
 kubectl get nodes
NAME                                          STATUS   ROLES    AGE     VERSION
ip-172-31-17-44.us-west-2.compute.internal    Ready    master   4m2s    v1.13.0
ip-172-31-22-135.us-west-2.compute.internal   Ready    master   3m59s   v1.13.0
ip-172-31-22-208.us-west-2.compute.internal   Ready    master   16h     v1.13.0
ip-172-31-29-58.us-west-2.compute.internal    Ready       14h     v1.13.0
部署addon
部署aws的sts
kubectl apply -f https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/storage-class/aws/default.yaml
創(chuàng)建alb-ingress-controller
為subnet打標簽
標記AWS子網(wǎng)以允許入口控制器自動發(fā)現(xiàn)用于ALB的子網(wǎng)。

kubernetes.io/cluster/${cluster-name} must be set to owned or shared
kubernetes.io/role/internal-elb must be set to 1 or `` for internal LoadBalancers
kubernetes.io/role/elb must be set to 1 or `` for internet-facing LoadBalancers

rbac
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/aws-alb-ingress-controller/v1.0.1/docs/examples/rbac-role.yaml
按照如下yaml創(chuàng)建
# Application Load Balancer (ALB) Ingress Controller Deployment Manifest.
# This manifest details sensible defaults for deploying an ALB Ingress Controller.
# GitHub: https://github.com/kubernetes-sigs/aws-alb-ingress-controller
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: alb-ingress-controller
  name: alb-ingress-controller
  # Namespace the ALB Ingress Controller should run in. Does not impact which
  # namespaces it"s able to resolve ingress resource for. For limiting ingress
  # namespace scope, see --watch-namespace.
  namespace: kube-system
  annotations:
    scheduler.alpha.kubernetes.io/critical-pod: ""
spec:
  replicas: 1
  selector:
    matchLabels:
      app: alb-ingress-controller
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      annotations:
        iam.amazonaws.com/role: arn:aws:iam::1234567:role/Role-KubernetesIngressController-test
      labels:
        app: alb-ingress-controller
    spec:
      containers:
        - args:
            # Limit the namespace where this ALB Ingress Controller deployment will
            # resolve ingress resources. If left commented, all namespaces are used.
            # - --watch-namespace=your-k8s-namespace

            # Setting the ingress-class flag below ensures that only ingress resources with the
            # annotation kubernetes.io/ingress.class: "alb" are respected by the controller. You may
            # choose any class you"d like for this controller to respect.
            - --ingress-class=alb

            # Name of your cluster. Used when naming resources created
            # by the ALB Ingress Controller, providing distinction between
            # clusters.
            - --cluster-name=k8s-us-west-test-1

            # AWS VPC ID this ingress controller will use to create AWS resources.
            # If unspecified, it will be discovered from ec2metadata.
            # - --aws-vpc-id=vpc-xxxxxx

            # AWS region this ingress controller will operate in.
            # If unspecified, it will be discovered from ec2metadata.
            # List of regions: http://docs.aws.amazon.com/general/latest/gr/rande.html#vpc_region
            # - --aws-region=us-west-1

            # Enables logging on all outbound requests sent to the AWS API.
            # If logging is desired, set to true.
            # - ---aws-api-debug
            # Maximum number of times to retry the aws calls.
            # defaults to 10.
            # - --aws-max-retries=10
          env:
            # AWS key id for authenticating with the AWS API.
            # This is only here for examples. It"s recommended you instead use
            # a project like kube2iam for granting access.
            #- name: AWS_ACCESS_KEY_ID
            #  value: KEYVALUE

            # AWS key secret for authenticating with the AWS API.
            # This is only here for examples. It"s recommended you instead use
            # a project like kube2iam for granting access.
            #- name: AWS_SECRET_ACCESS_KEY
            #  value: SECRETVALUE
          # Repository location of the ALB Ingress Controller.
          image: 894847497797.dkr.ecr.us-west-2.amazonaws.com/aws-alb-ingress-controller:v1.0.1
          imagePullPolicy: Always
          name: server
          resources: {}
          terminationMessagePath: /dev/termination-log
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      securityContext: {}
      terminationGracePeriodSeconds: 30
      serviceAccountName: alb-ingress
      serviceAccount: alb-ingress
注意cluster-name 指定集群name。
創(chuàng)建dashbord
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml

需要創(chuàng)建一個admin用戶并授予admin角色綁定，使用下面的yaml文件創(chuàng)建admin用戶并賦予他管理員權(quán)限，然后可以通過token登陸dashbaord，該文件見admin-role.yaml
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: admin
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: admin
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile

獲取token
kubectl -n kube-system get secret|grep admin-token
admin-token-cs4gs                                kubernetes.io/service-account-token   3      10m

kubectl describe secret admin-token-cs4gs -n kube-system
重新部署操作
kubeadm reset

iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X

ipvsadm --clear

ifconfig tunl0 down

ip link delete tunl0

升級kubeadm等
升級kubeadm
export VERSION=$(curl -sSL https://dl.k8s.io/release/stable.txt) # or manually specify a released Kubernetes version
export ARCH=amd64 # or: arm, arm64, ppc64le, s390x
curl -sSL https://dl.k8s.io/release/${VERSION}/bin/linux/${ARCH}/kubeadm > /usr/bin/kubeadm
chmod a+rx /usr/bin/kubeadm
升級kubectl
export VERSION=$(curl -sSL https://dl.k8s.io/release/stable.txt) # or manually specify a released Kubernetes version
export ARCH=amd64 # or: arm, arm64, ppc64le, s390x
curl -sSL https://dl.k8s.io/release/${VERSION}/bin/linux/${ARCH}/kubectl > /usr/bin/kubectl
chmod a+rx /usr/bin/kubectl
升級kubelet
export VERSION=$(curl -sSL https://dl.k8s.io/release/stable.txt) # or manually specify a released Kubernetes version
export ARCH=amd64 # or: arm, arm64, ppc64le, s390x
curl -sSL https://dl.k8s.io/release/${VERSION}/bin/linux/${ARCH}/kubelet > /usr/bin/kubelet
chmod a+rx /usr/bin/kubelet