motivation

Many crucial steps in the SUSE/openSUSE product build workflow are not known to or even accessible by many people; in the worst cases they are hidden as "custom scripts" on "some machine". Nowadays infrastructure and build pipelines should be written as code, e.g. in git repos, with UI frontends showing the always-current state: what is going on, what needs to be done to release products, and where the problems are. The least we can do is learn a bit more in this direction.

goals

  • G1: openSUSE contributors acting as release managers for openSUSE ports have access to all code that is executed for building, publishing, syncing, triggering
  • G2: Anyone can find the current status of builds+tests without needing to ask people to login to VMs over ssh
  • G3: The workflows of simpler products, e.g. openSUSE Krypton, are fully described in live pipeline views

execution

My personal plan is to execute the following steps in roughly this order and priority, not expecting to be able to complete all of them on my own:

  • Learn about the topic "infrastructure-by-code" – or is it "…-in-code", "…-as-code"? – and see how it applies to the usual openSUSE product build flows
  • Install gitlab and try out the CI component and pipelines, maybe based on kubernetes
  • Compare with the jenkins pipeline DSL, potentially try to replace the existing custom jenkins jobs on http://lord.arch.suse.de:8080 with DSL-described variants
  • Move the file /etc/crontab from lord.arch to some external repo, potentially move the services to another machine
  • What about kubernetes, docker, docker-compose, salt, jenkins, cloudconfig, etcd, AMQP? Are some of these competing solutions, or should they all be used together?
  • Incorporate a full build+test+publish pipeline into a CI system, at least monitoring existing steps from a high level
  • As a follow-up, convert more of the "custom" steps that are so far only monitored so that they are triggered the same way

current status and findings

starting situation

There is OBS, which can be triggered by github projects, but AFAIU only by people having access to the repo, so being a package maintainer without being the upstream maintainer is a problem. So somehow we have to run cron jobs, systemd timers or custom jenkins jobs to poll for upstream package changes. In the case of the bigger SUSE/openSUSE products, building e.g. repos and isos, the process is more complicated and involves e.g. custom "autobuild-team scripts", and I do not even know how to check the current status, i.e. whether any step succeeded or failed.
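For the polling itself, a systemd timer is one option; a minimal sketch, where the unit names and the target project/package are made up and `osc service remoterun` re-triggers the server-side source services:

```ini
# /etc/systemd/system/poll-upstream.service (illustrative, not an existing setup)
[Unit]
Description=Re-trigger OBS source services to check upstream

[Service]
Type=oneshot
# project and package are placeholders
ExecStart=/usr/bin/osc service remoterun home:example my-package

# /etc/systemd/system/poll-upstream.timer
[Unit]
Description=Run poll-upstream.service every 30 minutes

[Timer]
OnCalendar=*:0/30
Persistent=true

[Install]
WantedBy=timers.target
```

Enabled with systemctl enable --now poll-upstream.timer, this would replace one kind of "custom script on some machine" with something that can live in a git repo.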

research

playground

  • Talked with jdsn and bmwiedemann about jenkins: it makes sense to use jenkins-job-builder to define all jenkins jobs as yaml in a git repo and deploy all jobs from that repo of job definitions.
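A job definition in that style could look roughly like this (job name, schedule and script are purely illustrative, not an existing job):

```yaml
# Hypothetical jenkins-job-builder definition, deployed with
# "jenkins-jobs update <dir>"; everything here is a placeholder.
- job:
    name: poll-upstream-example
    description: Example job kept as yaml in git
    triggers:
      - timed: 'H/30 * * * *'
    builders:
      - shell: |
          echo "check upstream for new releases here"
```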

  • Played around with gitlab-runner:

    • Add runner to lord.arch

zypper ar -f -p 105 https://download.opensuse.org/repositories/home:darix:apps/openSUSE_Leap_42.3/home:darix:apps.repo && zypper in gitlab-runner

  • Take registration token from https://gitlab.suse.de/okurz/scripts/settings/ci_cd and register with

env CI_SERVER_URL=https://gitlab.suse.de RUNNER_NAME=lord.arch RUNNER_EXECUTOR=docker DOCKER_IMAGE=mini/base REGISTRATION_TOKEN=XXX REGISTER_NON_INTERACTIVE=true /usr/sbin/gitlab-runner register

  • Enable the docker and gitlab-runner services:

for i in docker gitlab-runner; do systemctl enable --now $i ; done

  • Enable auto-devops on okurz/scripts to try it out

https://gitlab.suse.de/okurz/scripts/-/jobs/15983 failed with "Cannot connect to the Docker daemon at unix:///var/run/docker.sock". I had to forward the docker socket into the container, e.g. in /etc/gitlab-runner/config.toml:

volumes = ["/var/run/docker.sock:/var/run/docker.sock", "/cache"]

Probably the "shell" executor could be used instead. But maybe this is also related to how auto-devops builds in this shell/perl project.

On my primary workstation notebook I installed gitlab-runner as well, to execute tests locally using the exec command. I created a symlink "gitlab-runner-local" in ~/bin and tried it out on os-autoinst-needles-sles. It works fine but is not that efficient because it clones the repo from which the tests are executed another time. However, it is a better test because we actually check the state in git.
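In context, the relevant part of /etc/gitlab-runner/config.toml looks roughly like this sketch; only the volumes line is from this setup, the rest is inferred from the registration parameters above:

```toml
# Sketch of the docker executor section in /etc/gitlab-runner/config.toml
[[runners]]
  name = "lord.arch"
  url = "https://gitlab.suse.de"
  executor = "docker"
  [runners.docker]
    image = "mini/base"
    # forward the host docker socket so jobs can talk to the docker daemon
    volumes = ["/var/run/docker.sock:/var/run/docker.sock", "/cache"]
```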

Now, having a runner and a way to test locally is good. I enabled my runner for openqa/scripts and came up with CI tests which already discovered two problems: missing dependencies in the package openQA-client and weird dependencies in our openSUSE package rsync.
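Such CI tests are described in a .gitlab-ci.yml in the repo; a minimal sketch with placeholder image and commands (not the actual okurz/scripts pipeline):

```yaml
# Hypothetical .gitlab-ci.yml; image and script lines are placeholders.
image: mini/base

stages:
  - test

syntax-check:
  stage: test
  script:
    # fail the pipeline on shell syntax errors
    - for f in *.sh; do bash -n "$f"; done
```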

Installing minikube on my workstation notebook:

zypper ar -p 105 -f https://download.opensuse.org/repositories/Virtualization:containers/openSUSE_Leap_15.0/Virtualization:containers.repo && zypper --gpg-auto-import-keys ref && zypper -n in minikube && sudo minikube start --vm-driver kvm2

Could not find kubectl in proper, up-to-date packages, so I installed it as recommended when calling minikube start:

curl -Lo kubectl https://storage.googleapis.com/kubernetes-release/release/v1.10.0/bin/linux/amd64/kubectl && chmod +x kubectl && sudo mv kubectl /usr/local/bin/

I could not get it to work on my notebook for long. Following the instructions,

sudo minikube start --vm-driver none --apiserver-ips 127.0.0.1 --apiserver-name localhost

shows

Starting local Kubernetes v1.10.0 cluster...
Starting VM...
Getting VM IP address...
Moving files into cluster...
Setting up certs...
Connecting to cluster...
Setting up kubeconfig...
Starting cluster components...
Kubectl is now configured to use the cluster.

and tools like sudo minikube status show "minikube: Running, cluster: Running, kubectl: Correctly Configured: pointing to minikube-vm at 10.160.67.191", so this looks fine. sudo /usr/local/bin/kubectl get pods also looks ok ("No resources found.") and sudo /usr/local/bin/kubectl cluster-info gives "Kubernetes master is running at https://10.160.67.191:8443, KubeDNS is running at https://10.160.67.191:8443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy".

But then sudo minikube dashboard repeatedly shows "Waiting, endpoint for service is not ready yet..." and finally "Error validating service: Error getting service kubernetes-dashboard: Get https://10.160.67.191:8443/api/v1/namespaces/kube-system/services/kubernetes-dashboard: dial tcp 10.160.67.191:8443: connect: connection refused", which seems to break any subsequent call: sudo /usr/local/bin/kubectl cluster-info then shows "The connection to the server 10.160.67.191:8443 was refused - did you specify the right host or port?".

It seems like the cluster works for some seconds and then dies, but I could not identify which component is at fault. Trying on a different machine with a different installation …

Trying on lord.arch, which is still on openSUSE Leap 42.3, I realized that installing minikube from the unofficial repo also pulls in kubernetes-client from the main repo. So Leap 42.3 still has kubernetes-client and therefore kubectl, but openSUSE Leap 15.0 does not have it anymore (and neither does SLE-15); created boo#1101010 for that.

Starting the cluster works fine:

sudo minikube start --vm-driver=none

and causes many docker containers to be spawned. Trying to access the cluster locally with kubectl does not seem to work though; kubectl cluster-info gives:

error: provided data does not appear to be a protobuf message, expected prefix [107 56 115 0]

So I tried to access the cluster from my notebook instead. For this to work I need the config, the access information and the certificate(s):

rsync -aHP --rsync-path="sudo rsync" lord.arch:/root/.kube/ .kube/ && rsync -aHP --rsync-path="sudo rsync" lord.arch:/root/.minikube/ .minikube/ --exclude=cache/

and replace all references to /root with /home/okurz in those files.
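That path rewrite can be scripted; a minimal sketch, assuming the kubeconfig references the certificates by absolute path as copied from lord.arch (the helper name and file layout are my own):

```shell
# Hypothetical helper to adjust a copied kubeconfig: replace the old
# absolute home directory with the new one so certificate paths resolve.
rewrite_kube_paths() {
  local config="$1" newhome="$2"
  # rewrite every /root reference, e.g. certificate-authority paths
  sed -i "s|/root|$newhome|g" "$config"
}
```

e.g. rewrite_kube_paths ~/.kube/config /home/okurz, and similarly for any absolute paths in the files under ~/.minikube.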

Now the same command from my notebook, which has the current git version, works fine. Maybe a version mismatch? kubectl version reports the same version for client and server on the notebook, but the one-year-older version "1.3.10" for the client on lord.arch. Downloading the latest version with curl as above fixes that.

So

kubectl cluster-info && kubectl get pods --all-namespaces

worked.

Finally run-kubernetes-locally-using-minikube put me on the right track:

kubectl run webserver --image=nginx:alpine && kubectl expose deployment webserver --type=LoadBalancer --port=80

But at least in my infrastructure IPv6 did not work, so I forced IPv4:

nmap -4 -p 31918 lord.arch && curl -4 http://lord.arch:31918
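For completeness, the imperative run/expose pair above also has a declarative equivalent that could be kept in git; a sketch along current kubernetes conventions, where only the names and the image are taken from the commands above:

```yaml
# Declarative equivalent of "kubectl run webserver --image=nginx:alpine"
# plus "kubectl expose deployment webserver --type=LoadBalancer --port=80"
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webserver
spec:
  replicas: 1
  selector:
    matchLabels:
      app: webserver
  template:
    metadata:
      labels:
        app: webserver
    spec:
      containers:
        - name: webserver
          image: nginx:alpine
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: webserver
spec:
  type: LoadBalancer
  selector:
    app: webserver
  ports:
    - port: 80
```

Applied with kubectl apply -f webserver.yaml.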

I also found "kubernetes playgrounds" that might be interesting for others to try, e.g. katacoda.

status at end of hackweek17

  • Configured a gitlab runner, connected to okurz/scripts, added self-tests in merge request
  • Experimented with the gitlab CI component and pipeline
  • Set up my own kubernetes installation with minikube and created services on this deployment
  • Learned about "docker swarm" and "docker stacks" and how they compare to what kubernetes offers
  • None of the initial goals are reached yet

side-tracking progress

Either due to procrastination or reactive reprioritization I also did some other tasks during hack week and learned something as well:

  • Converted some packages in OBS from using the deprecated tar_scm to obs_scm with "tar" and "recompress" being executed at "buildtime", stripping the "v" prefix in versions, e.g. see sr#621907
  • Updated matrix_synapse to version 0.32.2 to also include recent security fixes
  • Converted the python-openqa_review package to use the local relative source package file instead of a URL, since otherwise the `source_validator` and `download_files` checks on submission to openSUSE Factory would complain, even though unfortunately I cannot reproduce this locally.
  • Conducted a workshop for half a day
  • My email client "kmail" acts up: it does not refresh IMAP folders and stops receiving new emails after some time until I restart/refresh/reconfigure everything

references

Related to a ticket on the backlog of SUSE QSF (QA SLE functional): "[functional][epic] improve openqa triggering mechanisms, standardize OBS/IBS deliverables structure, trigger jobs using other means"


