Introduction

TensorFlow™ is an open-source software library for Machine Intelligence with a Python API (its core is implemented in C++). It was originally developed by researchers and engineers working on the Google Brain Team within Google's Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well. (https://www.tensorflow.org/)

Using values recorded by SUSE Manager, it should be possible to predict the outcome of certain operations by applying machine learning. We are especially interested in the time it takes to apply patches to systems. A neural network should be trained on previously recorded values to predict this time for future operations. We need to find out which values can and should be provided, which classifier(s) to use, and so on.

Goals:

  • Monday:

    • Learn about TensorFlow: definitions, how to create a model, different frameworks, etc.
    • Define the set of features that can be gathered from the SUSE Manager DB to create our dataset.
    • Explore the values of the dataset: min/max values, boundaries, type of data (categorical, continuous).
    • Define feature crosses between columns (crossed columns).
    • Is our dataset good enough?
  • Tuesday:

    • Create and test different TensorFlow models: DNNLinearCombinedClassifier, DNNClassifier, etc.
    • Are those models' estimations good enough?
    • Is TensorFlow suitable for achieving the project goal? Are the estimations good enough for us?
    • Upload working example.
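
The exploration step in Monday's goals (min/max values, categorical vs. continuous data) can be sketched without any ML framework. The following is a minimal sketch, assuming the dataset is exported as CSV with a header row. The column names follow the COLUMNS list given in the outcomes; the sample values, the helper name `summarize_columns`, and the rule "a column ending in `_id` is categorical" are assumptions made for illustration, not something the project states.

```python
import csv
import io

# Hypothetical sample rows: column names follow the dataset described in
# the outcomes, the values are made up for illustration.
SAMPLE_CSV = """server_id,errata_id,nrcpu,mhz,ram,package_id,size,time
1000010000,2001,2,2400,4096,30001,120000,0.23
1000010000,2002,2,2400,4096,30002,560000,0.98
1000010001,2001,4,3000,8192,30001,120000,0.21
"""


def summarize_columns(csv_text):
    """Return {column: summary} with the kind and min/max or distinct count."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            columns[name].append(value)

    summary = {}
    for name, values in columns.items():
        if name.endswith("_id"):
            # Assumption: identifiers are categorical even though they are numeric.
            summary[name] = {"kind": "categorical", "distinct": len(set(values))}
        else:
            numbers = [float(v) for v in values]
            summary[name] = {
                "kind": "continuous",
                "min": min(numbers),
                "max": max(numbers),
            }
    return summary


if __name__ == "__main__":
    for name, info in summarize_columns(SAMPLE_CSV).items():
        print(name, info)
```

A summary like this helps decide which columns need hashing or embedding (categorical) and which may need scaling (continuous) before they are turned into feature columns.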

Outcomes:

  • The initial dataset was not good enough. We modified the SQL query to also collect package IDs.
  • Earlier we restricted the dataset to actions for errata that contain only one package, but the resulting dataset was not big enough.
  • We implemented a DNNRegressor.
  • Dataset: COLUMNS = ["server_id","errata_id","nrcpu","mhz","ram","package_id","size","time"] (we currently only use server_id, errata_id, package_id)
  • Currently the dataset is based on patch installation actions that contain exactly one erratum, but that erratum can have multiple packages associated with it.
  • We don't know the installation time of an individual package, because the "time" value we have covers the complete action. As a rough estimate, we divide the total time by the number of packages the erratum contains.
  • Estimations seem to be good enough. The dataset still needs to be improved, as does the model itself, where the feature column definitions can be adjusted to get better results.
  • Current estimations are at least good enough to tell whether a planned action is going to take less than ~10 seconds, ~30 seconds, ~1 minute, ~5 minutes, etc.
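
The rough per-package estimate described above (total action time divided by the number of packages in the erratum) can be sketched as follows. The function name, the tuple layout and the sample numbers are hypothetical; only the even division itself comes from the project notes.

```python
def per_package_rows(actions):
    """Expand actions into one training row per package.

    Each action is (server_id, errata_id, package_ids, total_time).
    The total time is split evenly across the packages. This is the crude
    approximation the notes describe, not a measured per-package value.
    """
    rows = []
    for server_id, errata_id, package_ids, total_time in actions:
        if not package_ids:
            continue  # nothing to attribute the time to
        share = total_time / len(package_ids)
        for package_id in package_ids:
            rows.append((server_id, errata_id, package_id, share))
    return rows


# Hypothetical action: one erratum with four packages, 6.0 time units in total.
example = [(1000010000, 2001, [30001, 30002, 30003, 30004], 6.0)]
for row in per_package_rows(example):
    print(row)
```

Because every package of an erratum receives the same share, a large package and a tiny one end up with identical targets, which is one reason this step appears under the next steps as something to replace.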

Some samples of estimations:

expected -> estimated

0.233874837557475 -> 0.230502188205719
0.233874837557475 -> 0.25423765182495117
0.233874837557475 -> 0.1823016107082367
0.979458148662861 -> 0.8299890756607056
0.979458148662861 -> 0.8462812900543213
0.211660345395406 -> 0.22346541285514832
1.70577935377757 -> 1.9606330394744873
2.60000002384186 -> 2.39455509185791
0.976182460784912 -> 0.1866598129272461
0.976182460784912 -> 0.614652693271637
2.80241966247559 -> 1.0975050926208496
0.6621074676513671 -> 0.6865990161895752
0.0968895809991019 -> 0.041620612144470215
0.0968895809991019 -> 0.1236574649810791
0.0968895809991019 -> 0.05707252025604248
1.3669094741344499 -> 2.2393956184387207
1.3669094741344499 -> 2.2393956184387207
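
To put a number on "good enough", the samples listed here can be summarised with simple error metrics. This is a small sketch using only the Python standard library; the choice of mean and median absolute error is ours for illustration and is not necessarily how the project evaluated the model.

```python
from statistics import mean, median

# (expected, estimated) pairs copied from the samples listed above.
SAMPLES = [
    (0.233874837557475, 0.230502188205719),
    (0.233874837557475, 0.25423765182495117),
    (0.233874837557475, 0.1823016107082367),
    (0.979458148662861, 0.8299890756607056),
    (0.979458148662861, 0.8462812900543213),
    (0.211660345395406, 0.22346541285514832),
    (1.70577935377757, 1.9606330394744873),
    (2.60000002384186, 2.39455509185791),
    (0.976182460784912, 0.1866598129272461),
    (0.976182460784912, 0.614652693271637),
    (2.80241966247559, 1.0975050926208496),
    (0.6621074676513671, 0.6865990161895752),
    (0.0968895809991019, 0.041620612144470215),
    (0.0968895809991019, 0.1236574649810791),
    (0.0968895809991019, 0.05707252025604248),
    (1.3669094741344499, 2.2393956184387207),
    (1.3669094741344499, 2.2393956184387207),
]


def absolute_errors(pairs):
    """Absolute difference between expected and estimated value per pair."""
    return [abs(expected - estimated) for expected, estimated in pairs]


errors = absolute_errors(SAMPLES)
mae = mean(errors)
median_error = median(errors)
# The pair with the largest miss, useful for spotting outliers.
worst_pair = max(SAMPLES, key=lambda pair: abs(pair[0] - pair[1]))

print(f"mean absolute error:   {mae:.3f}")
print(f"median absolute error: {median_error:.3f}")
print(f"largest miss: expected {worst_pair[0]:.3f}, estimated {worst_pair[1]:.3f}")
```

For these samples the mean is noticeably higher than the median, because a few large misses dominate it, so reporting both gives a fairer picture than either alone.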

"Actual" vs "Predicted" screenshots:

Screenshot1

Full graph: view full graph here

Next steps:

  • Refinement of model and dataset
  • Add actions with multiple errata to the dataset
  • Also implement a DNNClassifier to classify directly instead of predicting a float number (possible classes: seconds, minutes, hours).
  • Proof of concept of an integration with the SUSE Manager UI
  • Feed the actual results of new actions on SUSE Manager back into the neural network.
  • Replace package_id with something consistent across customers (e.g. package name)
  • Try to find a way to avoid averaging the time per package for errata that point to multiple packages
  • Estimate the actual action (not per package)
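
For the planned DNNClassifier, the float targets would first have to be turned into class labels. Below is a minimal sketch of that labelling step. It assumes the recorded time is in seconds (the notes do not state the unit) and uses the three classes suggested above; the function names and the 60 and 3600 cut-offs are illustrative assumptions.

```python
# Class names taken from the list of possible classes in the next steps.
CLASSES = ["seconds", "minutes", "hours"]


def duration_class(seconds):
    """Map a duration (assumed to be in seconds) to a coarse class label."""
    if seconds < 0:
        raise ValueError("duration must be non-negative")
    if seconds < 60:
        return "seconds"
    if seconds < 3600:
        return "minutes"
    return "hours"


def class_index(seconds):
    """Integer label for the class, the usual target format for classifiers."""
    return CLASSES.index(duration_class(seconds))


for value in (12.5, 90, 7200):
    print(value, duration_class(value), class_index(value))
```

Finer buckets (for example the ~10 second, ~30 second, ~1 minute and ~5 minute steps mentioned in the outcomes) would only require a longer list of thresholds.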

Code repository: Internal GitLab

Looking for mad skills in:

tensorflow python machinelearning susemanager

This project is part of:

Hack Week 16

Comments

  • PSuarezHernandez (almost 2 years ago): The outcomes of this Hack Week project have been published! The project page has been updated to include the results.
