Investigate options for distributed Ceph builds to reduce build times.

This task could have two scopes:

  • replace build and vstart for developers so they run in containers for any target base (openSUSE, Ubuntu, etc.)

  • distribute build jobs across nodes, probably on a k8s cluster
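
The first scope can be sketched with a single podman invocation, assuming a dependency image already exists (the image name `ceph-build:leap15` below is a placeholder, not an existing image):

```shell
# Run the Ceph build inside a container for a chosen target base.
# "ceph-build:leap15" is a hypothetical image with install-deps.sh already
# applied; swap in an Ubuntu-based image to target Ubuntu, and so on.
podman run --rm -it \
    --volume "$PWD":/ceph \
    --workdir /ceph \
    ceph-build:leap15 \
    bash -c './do_cmake.sh && cd build && make -j"$(nproc)"'
```

The same image could later back vstart runs, so developers test on a distribution base they do not have installed locally.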

There are already some projects that distribute builds:

distcc – distributes C/C++ compilation across machines on the network

icecream – a distcc-inspired tool from SUSE with a central scheduler

As a first step, it would be useful to evaluate these existing systems and measure how much they speed up the build, to have a reference point.
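
For such a reference measurement, icecream is simple to try. A sketch, assuming openSUSE hosts and the stock icecream package (package names, paths, and service names may differ on other distributions):

```shell
# On the node chosen as scheduler:
sudo zypper -n in icecream
sudo systemctl start icecc-scheduler

# On every build node (iceccd discovers the scheduler via broadcast):
sudo zypper -n in icecream
sudo systemctl start iceccd

# On the build host, put the icecc compiler wrappers first in PATH and
# request more jobs than there are local cores:
export PATH=/usr/lib/icecc/bin:$PATH
cd ceph/build
time make -j33    # compare wall-clock time against a plain local make -j
```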

The idea is to build a container (and reuse it later) with all dependencies installed (./install-deps.sh on some base image for developers, or the chroot prepared before the build phase for osc/obs) and start a number of build jobs on a k8s cluster (or with podman in a local dev environment).

That could help:

  • developers to build and test their changes on any distribution base, locally or on k8s

  • IBS/OBS to speed up builds by distributing them across a k8s cluster

For the dev environment, buildah could be used to fetch the base OS, run ./install-deps.sh, and produce the current base image for the build. For osc/obs, that tool already prepares a chroot, so it might be consumed as the container base.
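
The buildah flow could look like this sketch (base image and tag are examples; install-deps.sh is run from a bind-mounted Ceph checkout):

```shell
# Create a working container from the chosen base.
ctr=$(buildah from registry.opensuse.org/opensuse/leap:15.1)

# Run install-deps.sh against a bind-mounted Ceph checkout.
buildah run --volume "$PWD":/ceph "$ctr" -- \
    sh -c 'cd /ceph && ./install-deps.sh'

# Commit the result as a reusable build image, then clean up.
buildah commit "$ctr" ceph-build:leap15
buildah rm "$ctr"
```

Committing once and reusing the image keeps the slow dependency installation out of the per-build path.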

Some tool needs to be used or developed to generate k8s manifests that run the build based on that container, to run those manifests locally with podman play kube or schedule them remotely on a k8s cluster, and to somehow gather the results from the jobs and store the resulting binaries/RPMs/containers somewhere for later use.
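
As a sketch, a generated manifest could be a plain Pod, which both kubectl and podman play kube understand; the image name and host path below are placeholders:

```shell
# Generate a minimal Pod manifest for one build job.
cat > ceph-build-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: ceph-build
spec:
  restartPolicy: Never
  containers:
  - name: build
    image: ceph-build:leap15
    command: ["sh", "-c", "cd /ceph && ./do_cmake.sh && make -C build -j8"]
    volumeMounts:
    - name: src
      mountPath: /ceph
  volumes:
  - name: src
    hostPath:
      path: /home/user/ceph
EOF

# Run it locally:    podman play kube ceph-build-pod.yaml
# Or on a cluster:   kubectl apply -f ceph-build-pod.yaml
```

Gathering results would still need extra plumbing (e.g. a shared volume or an artifact upload step at the end of the job command).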

Looking for hackers with these skills:

icecream, distcc, ceph, osc, obs

This project is part of:

Hack Week 19

Activity

  • 5 months ago: rgrigorev started distributed build for Ceph in containers
  • 5 months ago: tbechtold liked distributed build for Ceph in containers
  • 5 months ago: denisok added keyword "icecream" to distributed build for Ceph in containers
  • 5 months ago: denisok added keyword "distcc" to distributed build for Ceph in containers
  • 5 months ago: denisok added keyword "ceph" to distributed build for Ceph in containers
  • 5 months ago: denisok added keyword "osc" to distributed build for Ceph in containers
  • 5 months ago: denisok added keyword "obs" to distributed build for Ceph in containers
  • 5 months ago: denisok originated distributed build for Ceph in containers

  • Comments

    • rgrigorev
      5 months ago by rgrigorev | Reply

      I did some measurements on our nodes:

      main build node + icecream manager: ses-client-6
      main storage device: /dev/nvme0n1p3
      drive speed: 'Timing buffered disk reads: 5388 MB in 3.00 seconds = 1795.90 MB/sec'
      CPU: Intel(R) Xeon(R) Silver 4110 @ 2.10GHz, 16 threads (https://ark.intel.com/content/www/us/en/ark/products/123547/intel-xeon-silver-4110-processor-11m-cache-2-10-ghz.html)
      RAM: 64G
      minion nodes: ses-client-7 and ses-client-8

      description                 real         user         sys
      1 node                      210m38.879s  202m4.993s   12m45.631s
      1 node -j 6                 47m44.742s   254m36.362s  15m27.561s
      1 node -j 11                34m38.795s   318m48.671s  17m22.380s
      2 nodes -j 12               26m44.379s   156m59.053s  11m44.222s
      3 nodes -j 18               25m4.973s    63m48.105s   7m53.588s
      3 nodes -j 33               17m27.299s   55m18.695s   7m38.526s
      3 nodes -j 33, 25G network  17m0.235s    46m24.425s   7m13.615s
      .... + ram disk             17m2.661s    47m0.784s    7m11.908s
      .... + fix in boost         13m22.895s   45m16.003s   4m55.285s

    • rpenyaev
      5 months ago by rpenyaev | Reply

      Roman, can the following kernel patch also speed up our Ceph builds? It seems worth trying:

      https://www.phoronix.com/scan.php?page=news_item&px=Linux-Pipe-Parallel-Job-Opt
      https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0ddad21d3e99c743a3aa473121dc5561679e26bb

    • rgrigorev
      5 months ago by rgrigorev | Reply

      3 nodes -j 33 (~11 per node), hsm network, ram disk, manual fixes in boost files + kernel 5.6.7 pre:

      real 13m22.895s
      user 45m16.003s
      sys  4m55.285s

      Yes, there is also some improvement.

      • rgrigorev
        5 months ago by rgrigorev | Reply

        -j 11, kernel 5.6.0-rc1-197.29 (includes the new patch from the Linux tree):
        real 31m13.175s
        user 321m5.955s
        sys  16m13.355s

        baseline, -j 11:
        real 34m38.795s
        user 318m48.671s
        sys  17m22.380s
