The supportconfig tool is a great resource for troubleshooting common system issues on SLES, but its functionality might not be enough to troubleshoot issues specific to cloud solutions. I would like to invite you to contribute to this project by creating new plugins/tools that complement supportconfig and ease the troubleshooting process for the SUSE OpenStack Cloud product.
This project will be considered "successful" if we are able to develop the new features listed below and include them in the main supportconfig tool:
Develop some sort of "hb_report" tool for cloud, which could include:
- Structure the collected information in a better layout (directories and subdirectories instead of a single huge file containing everything). We have some "splitter" tools which recreate the original directory structure of the server (scsplitter.py), but it would be interesting to make this split structure the default one.
- Include a way to "Trim" or "Toggle" the supportconfig so that it only collects information relevant to errors that occurred on specific components or dates. This way we would avoid having huge files containing data we don't necessarily need. The idea is to have a nice and easy way to filter information by instance id, request id, timestamp or any other attribute passed to the "supportconfig" command
- Include commands like "openstack (...) list" and "openstack (...) show $id"
- HA-specific checks (pacemaker and pacemaker-remote if any)
- Services report (up or in error state): check the status from the openstack command, from systemctl status and from the resource status in the cluster. I had a case where a neutron agent (if I remember correctly) was in the down ":-(" status while systemctl and crm_mon reported the service as up and running
- Database dump
- Switch a selected component to debug mode and collect logs from customer actions
- Collect storage background and configuration
- Query APIs and generate a report on the activities/requests
- Ping endpoints and resolve hostnames as a check
- Adding /var/lib/neutron to supportconfig (Bogdano in Rocket Chat)
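The "Trim" idea from the list above could start out as a small filter over collected log lines. The following is only a minimal sketch: the function name `filter_log_lines` is hypothetical, and it assumes log lines begin with a `YYYY-MM-DD HH:MM:SS` timestamp and carry the request id somewhere in the line, as is common in OpenStack service logs. It is not part of supportconfig.

```python
import re
from datetime import datetime

# Assumption: each log entry starts with "YYYY-MM-DD HH:MM:SS"
# (fractional seconds, if present, are ignored).
TIMESTAMP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def filter_log_lines(lines, request_id=None, start=None, end=None):
    """Yield only the lines that match the request id and time window.

    request_id: substring such as "req-..." that must appear in the line.
    start, end: optional datetime bounds (inclusive).
    """
    for line in lines:
        if request_id is not None and request_id not in line:
            continue
        match = TIMESTAMP_RE.match(line)
        if match is None:
            # Lines without a timestamp (for example traceback
            # continuations) are dropped when a time window is requested.
            if start is not None or end is not None:
                continue
        else:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            if start is not None and stamp < start:
                continue
            if end is not None and stamp > end:
                continue
        yield line


# Made-up sample lines, for illustration only.
sample = [
    "2018-07-09 10:00:00.100 1234 INFO nova.compute.manager [req-aaa admin demo] Starting instance",
    "2018-07-09 11:30:00.200 1234 ERROR nova.compute.manager [req-bbb admin demo] Build failed",
    "2018-07-09 12:45:00.300 1234 ERROR nova.compute.manager [req-aaa admin demo] Timeout",
]

for hit in filter_log_lines(sample, request_id="req-aaa",
                            start=datetime(2018, 7, 9, 11, 0, 0)):
    print(hit)
```

The same function could be reused for an instance id by passing that id as `request_id`, since the match is a plain substring test.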
A tree-like graphical tool (or ASCII art) that shows the complete infrastructure and allows breaking down each node by component/service in order to review its config/logs
Getting info from supportconfig as part of "Best Practice" document.
Compare Versions: Versions in the supportconfig against the current versions in the SCC repos
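A version comparison could be sketched roughly as follows. The helper names `version_key` and `find_outdated` are hypothetical, the package data is made up, and the comparison is a simplified numeric/alphabetic split rather than the real RPM version ordering. How the installed and available versions would be extracted from the supportconfig and the SCC repos is left open here.

```python
import re


def version_key(version):
    """Turn a version string into a sortable list of chunks.

    Simplified on purpose: this is NOT the RPM comparison algorithm.
    Numeric chunks compare as integers and rank above alphabetic ones.
    """
    chunks = re.findall(r"\d+|[A-Za-z]+", version)
    return [(1, int(c), "") if c.isdigit() else (0, 0, c) for c in chunks]


def find_outdated(installed, available):
    """Return {package: (installed_version, available_version)} for every
    package whose available version is newer than the installed one."""
    outdated = {}
    for name, have in installed.items():
        latest = available.get(name)
        if latest is not None and version_key(latest) > version_key(have):
            outdated[name] = (have, latest)
    return outdated


# Made-up example data, for illustration only.
installed = {"openstack-nova": "14.0.10-1.2", "crowbar": "4.0-3.1", "lnav": "0.8.2"}
available = {"openstack-nova": "14.0.11-1.1", "crowbar": "4.0-3.1"}

for name, (have, latest) in find_outdated(installed, available).items():
    print(f"{name}: installed {have}, available {latest}")
```

Packages that are missing from the repository data are skipped instead of being reported, so an incomplete repo listing does not produce false alarms.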
Currently identified tools which could be included:
SOSREPORT: https://github.com/sosreport/sos: Sos is an extensible, portable, support data collection tool primarily aimed at Linux distributions and other UNIX-like operating systems. Perhaps we should consider a well-established tool with plugins for every possible situation before reinventing the wheel
ELK Tool: https://github.com/denisok/elk_supportconfig
Support Config Utils from A. Spiers: https://build.opensuse.org/package/show/home:aspiers/supportconfig-utils
Crowbar Macs: https://github.com/aspiers/SUSE-dist/blob/master/bin/crowbar-macs
scsplitter (no link known)
lnav monitoring: https://software.opensuse.org/download.html?project=server:monitoring&package=lnav
This project is part of:
Hack Week 17