Commit Graph

11 Commits

Author SHA1 Message Date
Christian Albrecht
e148cb040b
nixos/kubernetes: Address review: rename node-online target 2019-03-06 17:17:20 +01:00
Christian Albrecht
7323b77435
nixos/kubernetes: Address review: Separate preStart from certificates 2019-03-06 16:55:08 +01:00
Christian Albrecht
6e9037fed0
nixos/kubernetes: Address review: Move bootstrapping addons into own service 2019-03-06 16:54:50 +01:00
Christian Albrecht
51aeaaffc2
nixos/kubernetes: flannel needs iptables in service path 2019-03-03 19:43:13 +01:00
Christian Albrecht
62f03750e4
nixos/kubernetes: Stabilize services startup across machines
by adding targets and curl wait loops to services to ensure services
are not started before their depended services are reachable.

Extra targets cfssl-online.target and kube-apiserver-online.target
syncronize starts across machines and node-online.target ensures
docker is restarted and ready to deploy containers on after flannel
has discussed the network cidr with apiserver.

Since flannel needs to be started before addon-manager to configure
the docker interface, it has to have its own rbac bootstrap service.

The curl wait loops within the other services exists to ensure that when
starting the service it is able to do its work immediately without
clobbering the log about failing conditions.

By ensuring kubernetes.target is only reached after starting the
cluster it can be used in the tests as a wait condition.

In kube-certmgr-bootstrap mkdir is needed for it to not fail to start.

The following is the relevant part of systemctl list-dependencies

default.target
● ├─certmgr.service
● ├─cfssl.service
● ├─docker.service
● ├─etcd.service
● ├─flannel.service
● ├─kubernetes.target
● │ ├─kube-addon-manager.service
● │ ├─kube-proxy.service
● │ ├─kube-apiserver-online.target
● │ │ ├─flannel-rbac-bootstrap.service
● │ │ ├─kube-apiserver-online.service
● │ │ ├─kube-apiserver.service
● │ │ ├─kube-controller-manager.service
● │ │ └─kube-scheduler.service
● │ └─node-online.target
● │   ├─node-online.service
● │   ├─flannel.target
● │   │ ├─flannel.service
● │   │ └─mk-docker-opts.service
● │   └─kubelet.target
● │     └─kubelet.service
● ├─network-online.target
● │ └─cfssl-online.target
● │   ├─certmgr.service
● │   ├─cfssl-online.service
● │   └─kube-certmgr-bootstrap.service
2019-03-03 19:39:02 +01:00
Christian Albrecht
f9e2f76a59
nixos/kubernetes: Add systemd path units
to protect services from crashing and clobbering the logs when
certificates are not in place yet and make sure services are activated
when certificates are ready.

To prevent errors similar to "kube-controller-manager.path: Failed to
enter waiting state: Too many open files"
fs.inotify.max_user_instances has to be increased.
2019-03-03 19:34:57 +01:00
Jaka Hudoklin
97a27fd2d2
nixos/kubernetes: fix flannel and kubelet startup 2019-02-21 00:26:11 +01:00
Franz Pletz
3a02205496
nixos/kubernetes: bootstrap docker without networking
Before flannel is ready there is a brief time where docker will be
running with a default docker0 bridge. If kubernetes happens to spawn
containers before flannel is ready, docker can't be restarted when
flannel is ready because some containers are still running on the
docker0 bridge with potentially different network addresses.

Environment variables in `EnvironmentFile` override those defined via
`Environment` in the systemd service config.

Co-authored-by: Christian Albrecht <christian.albrecht@mayflower.de>
2019-02-20 21:08:58 +01:00
Johan Thomsen
7028fac35b
nixos/kubernetes: use system.path to handle dependency on flannel subnet.env
The current postStart step on flannel causes flannel.service to
sometimes hang, even when it's commanded to stop.
2019-02-20 21:08:56 +01:00
Johan Thomsen
466beb0214
nixos/kubernetes: let flannel use kubernetes as storage backend
+ isolate etcd on the master node by letting it listen only on loopback
+ enabling kubelet on master and taint master with NoSchedule

The reason for the latter is that flannel requires all nodes to be "registered"
in the cluster in order to setup the cluster network. This means that the
kubelet is needed even at nodes on which we don't plan to schedule anything.
2019-02-20 21:08:56 +01:00
Johan Thomsen
e2380e79e1
nixos/kubernetes: major module refactor
- All kubernetes components have been seperated into different files
- All TLS-enabled ports have been deprecated and disabled by default
- EasyCert option added to support automatic cluster PKI-bootstrap
- RBAC has been enforced for all cluster components by default
- NixOS kubernetes test cases make use of easyCerts to setup PKI
2019-02-20 21:08:01 +01:00