
sos's Introduction


SoS

Sos is an extensible, portable, support data collection tool primarily aimed at Linux distributions and other UNIX-like operating systems.

This project is hosted at:

For the latest version, to contribute, and for more information, please visit the project pages or join the mailing list.

To clone the current main (development) branch run:

git clone https://github.com/sosreport/sos.git

Reporting bugs

Please report bugs via the mailing list or by opening an issue in the GitHub Issue Tracker.

Chat

The SoS project has rooms in Matrix and in Libera.Chat.

Matrix Room: #sosreport:matrix.org

Libera.Chat: #sos

These rooms are bridged, so joining either is sufficient as messages from either will appear in both.

The Freenode #sos room is no longer used by this project.

Mailing list

The sos-devel list is the mailing list for any sos-related questions and discussion. Patch submissions and reviews are welcome too.

Patches and pull requests

Patches can be submitted via the mailing list or as GitHub pull requests. If using GitHub please make sure your branch applies to the current main branch as a 'fast forward' merge (i.e. without creating a merge commit). Use the git rebase command to update your branch to the current main if necessary.

Please refer to the contributor guidelines for guidance on formatting patches and commit messages.

Before sending a pull request, it is advisable to check your contribution against the flake8 linter, the unit tests, and the stage one avocado test suite:

# from within the git checkout
$ flake8 sos
$ nosetests -v tests/unittests/

# as root
# PYTHONPATH=tests/ avocado run --test-runner=runner -t stageone tests/{cleaner,collect,report,vendor}_tests

Note that the avocado test suite will generate and remove several reports over its execution, but no changes will be made to your local system.

All contributions must pass the entire test suite before being accepted.

Documentation

User and API documentation is automatically generated using Sphinx and Read the Docs.

To generate HTML documents locally, install dependencies using

pip install -r requirements.txt

and run

sphinx-build -b html docs <destination dir> 

Wiki

For more in-depth information on the project's features and functionality, please see the GitHub wiki.

If you are interested in contributing an entirely new plugin, or extending sos to support your distribution of choice, please see these wiki pages:

To help get your changes merged quickly with as few revisions as possible please refer to the Contributor Guidelines when submitting patches or pull requests.

Installation

Manual Installation

You can simply run from the git checkout now:

$ sudo ./bin/sos report 

The command sosreport is still available, as a legacy redirector, and can be used like this:

$ sudo ./bin/sosreport 

To see a list of all available plugins and plugin options, run

$ sudo ./bin/sos report -l

To install locally (as root):

# python3 setup.py install

Pre-built Packaging

Fedora/RHEL users install via yum:

# yum install sos

Debian users install via apt:

# apt install sosreport

Ubuntu (14.04 LTS and above) users install via apt:

$ sudo apt install sosreport

Snap Installation

# snap install sosreport --classic

sos's People

Contributors

apconole, arif-ali, bmr-cymru, bryanquigley, codificat, danalsan, dnegreira, fleitner, jcastill, jhjaggars, jjansky1, kevintraynor, lyarwood, mfoliveira, mikelolasagasti, nkshirsagar, npinaeva, obnoxxx, pacevedom, pierg75, pmoravec, pponnuvel, rmetrich, sandrobonazzola, sbradley7777, sourabhjains, stuggi, trevorbenson, turboturtle, utopiabound

sos's Issues

feature: ability to run sosreport tests self contained

This may or may not be feasible, but during my build tests I was getting test errors because I did not have sosreport installed beforehand.

I was thinking of running these tests in a self-contained environment, such as a virtualenv, so that they run successfully whether or not sos is installed.

Otherwise we would have to limit the unit and functional testing that can be done on build servers that do not have sos installed prior to building.

Or maybe look into adding the sos directory from the build environment to sys.path to make sure testing doesn't fail on locating sos packages.
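
The sys.path idea could be sketched roughly as follows. This is only an illustration: the helper name add_checkout_to_path and the assumption that the tests live one directory below the checkout root are hypothetical, not something the project defines.

```python
import os
import sys


def add_checkout_to_path(test_file, levels_up=1):
    """Prepend the source checkout root to sys.path.

    test_file is the path of a file inside the tests directory; levels_up is
    how many directories separate it from the checkout root (an assumption
    about the layout, adjust as needed).
    """
    root = os.path.abspath(os.path.dirname(test_file))
    for _ in range(levels_up):
        root = os.path.dirname(root)
    if root not in sys.path:
        # Put the checkout first so it wins over any installed copy of sos.
        sys.path.insert(0, root)
    return root
```

A test module would call add_checkout_to_path(__file__) before importing sos, so the import resolves to the tree being built rather than to an installed package.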

Add default setup() method

Today we provide only an empty setup() in the generic Plugin class:

    def setup(self):
        """This method must be overridden to add the copyPaths, forbiddenPaths,
        and external programs to be collected at a minimum.
        """
        pass

This is overridden in all concrete plugin classes to call addCopy* collectExt* etc.

We already have self.files in the class as a way for modules to specify a list of files to check for in the generic checkenabled(). By adding a default setup() that just calls addCopySpecs(list(self.files)) we can move at least some plugins to a purely declarative format (so they just initialise variables but do not provide any executable code of their own). This simplifies those plugins and helps get rid of lots of slightly varying code in the plugins directory.
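
A minimal sketch of what such a default might look like. The Plugin class here is a simplified stand-in, not the real base class, and the path in the example plugin is illustrative only.

```python
class Plugin(object):
    """Stand-in for the generic plugin base class (simplified, hypothetical)."""

    files = ()

    def __init__(self):
        self.copy_specs = []

    def addCopySpecs(self, specs):
        # The real method does much more; here we only record the request.
        self.copy_specs.extend(specs)

    def setup(self):
        """Default: collect every path listed in self.files."""
        self.addCopySpecs(list(self.files))


class sanlock(Plugin):
    # Purely declarative plugin: no executable code of its own.
    # The path below is illustrative only.
    files = ("/etc/sysconfig/sanlock",)
```

Plugins that need to run external programs would still override setup() as they do today.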

extracting tarballs with SELinux context for /proc and /sys spews errors

tar: sosreport-rhel7-vm1-20121210145629/proc/sys/vm: Cannot setfilecon: Permission denied
tar: sosreport-rhel7-vm1-20121210145629/proc/sys/vm/percpu_pagelist_fraction: Cannot setfilecon: Permission denied
tar: sosreport-rhel7-vm1-20121210145629/proc/sys/vm: Cannot setfilecon: Permission denied
tar: sosreport-rhel7-vm1-20121210145629/proc/sys/vm/scan_unevictable_pages: Cannot setfilecon: Permission denied
tar: sosreport-rhel7-vm1-20121210145629/proc/sys/vm: Cannot setfilecon: Permission denied
tar: sosreport-rhel7-vm1-20121210145629/proc/sys/vm/stat_interval: Cannot setfilecon: Permission denied
tar: sosreport-rhel7-vm1-20121210145629/proc/sys/vm: Cannot setfilecon: Permission denied
tar: sosreport-rhel7-vm1-20121210145629/proc/sys/vm/swappiness: Cannot setfilecon: Permission denied
tar: sosreport-rhel7-vm1-20121210145629/proc/sys/vm: Cannot setfilecon: Permission denied
tar: sosreport-rhel7-vm1-20121210145629/proc/sys/vm/vfs_cache_pressure: Cannot setfilecon: Permission denied
tar: sosreport-rhel7-vm1-20121210145629/proc/sys/vm: Cannot setfilecon: Permission denied
tar: sosreport-rhel7-vm1-20121210145629/proc/sys/vm/zone_reclaim_mode: Cannot setfilecon: Permission denied
tar: sosreport-rhel7-vm1-20121210145629/proc/sys/vm: Cannot setfilecon: Permission denied

This is because we can't set those contexts on "real" file systems when unpacking the tarball.

This "fixes" it:

# git diff
diff --git a/sos/utilities.py b/sos/utilities.py
index b424e70..a1f79d1 100644
--- a/sos/utilities.py
+++ b/sos/utilities.py
@@ -263,9 +263,12 @@ class TarFileArchive(Archive):
             tar_info.size = len(content)
             fileobj = StringIO(content)
         fstat = os.stat(src)
-        context = self.get_selinux_context(src)
-        if context:
-            tar_info.pax_headers['RHT.security.selinux'] = context
+        if src.startswith("/sys/") or src.startswith ("/proc/"):
+            context = None
+        else:
+            context = self.get_selinux_context(src)
+            if context:
+                tar_info.pax_headers['RHT.security.selinux'] = context
         self.set_tar_info_from_stat(tar_info,fstat)
         self.add_parent(src)
         self.tarfile.addfile(tar_info, fileobj)

But it's a bit kludgy - otoh my other idea of adding a "nocontext" param means all modules must know about what paths should/shouldn't include context.

dmraid.ddf1 directory

When I run sosreport --batch, a directory named dmraid.ddf1 is created in the directory I ran it from,
but there is no dmraid.ddf1 directory in the created archive /tmp/sosreport-*.tar.xz.
I am running Red Hat 6.2 with sosreport 2.2.
For example:
[root@iwdb ~]# mkdir sos
[root@iwdb ~]# cd sos
[root@iwdb sos]# ls -la
total 8
drwxr-xr-x 2 root root 4096 Aug 11 15:00 .
dr-xr-x---. 13 root root 4096 Aug 11 15:00 ..
[root@iwdb sos]# sosreport --batch

sosreport (version 2.2)

This utility will collect some detailed information about the
hardware and setup of your Red Hat Enterprise Linux system.
The information is collected and an archive is packaged under
/tmp, which you can send to a support representative.
Red Hat Enterprise Linux will use this information for diagnostic purposes ONLY
and it will be considered confidential information.

This process may take a while to complete.
No changes will be made to your system.

Running plugins. Please wait...

Completed [51/51] ...
Creating archive...

The generated report has been saved in:
/tmp/sosreport-iwdb-20120811150056-6e52.tar.xz

The md5sum is: a45d945d7d6cd730e5f06aad05796e52

Please send this file to your support representative.

[root@iwdb sos]# ls -la
total 12
drwxr-xr-x 3 root root 4096 Aug 11 15:00 .
dr-xr-x---. 13 root root 4096 Aug 11 15:00 ..
drwxr-xr-x 2 root root 4096 Aug 11 15:00 dmraid.ddf1

Reporting review

Reporting seems to be in a funny state at the moment. We have the old HTML and XML reporting code (the XML stuff seems to be dead right now, or at least, does not run when --report is given). The legacy HTML stuff works, just about, but is ugly and a maintenance headache.

The new Report class is pretty cool and gives /much/ cleaner looking code but only implements PlainTextReport as a concrete class.

I'm also wondering if we shouldn't just turn reporting on by default (and invert --report -> --no-report) since it seems to take up very little runtime.

ability to abort/kill sub-processes that hang for too long

from RFE template ...

  1. What is the nature and description of the request?
    Customer would like sosreport to be able to catch when it is hung and kill the
    process that is hung and issue a message of where the report hangs. This would
    enable the sosreport to complete and provide information on what is causing the
    problem.
  2. Why does the customer need this? (List the business requirements here)
    Customer has many systems that failed running sosreport and it stopped
    progress on multiple cases while they attempted to find why the sosreport would
    not complete. It was found to be multiple issues(case depending) where the
    sosreport hung and did not produce the required information to force creation
    of sosreport tar.
  3. How would the customer like to achieve this? (List the functional
    requirements here)
    Customer would like sosreport to have the ability to kill the particular
    process or script hanging once it becomes hung for some period of time and
    provide a message indicating where the process hung to allow ease of
    troubleshooting and completion of sos so that a fix or workaround can be
    issued.
  4. For each functional requirement listed in question 3, specify how Red Hat
    and the customer can test to confirm the requirement is successfully
    implemented.
    This may be tested by implementing some bad script in the sosreport(in startup
    for example) that would hang the process. Once process is hung for a period of
    time it should cancel the script and provide message of where the problem lies
    and still provide the rest of the details that sos was able to capture.
  5. Is there already an existing RFE upstream or in Red Hat bugzilla?
    I was able to find a bugzilla that appears to have a similar issue that this
    request may resolve. Where a dry run would cover what will be run, what this
    customer requests is for sos to be more useful in finding a way to tell where
    sos is hanging the system(in case you are not aware a problem exists). This
    will help the customer to work around the hang and assist in getting the
    problem fixed more quickly:
    https://bugzilla.redhat.com/show_bug.cgi?id=507394
  6. How quickly does this need resolved? (desired target release)
    The customer would like this added soon, however the problem that led to this
    request is currently resolved. The customer had a problem in the startup
    scripts of sosreport that was preventing completion of sos. As a result it
    slowed resolution to several problems that required gathering information piece
    by piece, which could have been more quickly provided with sos. The customer
    was able to use "-n startup" option with sos once we found the problem was
    caused by a startup script. The customer would like this enhancement added to
    allow for solutions and workarounds such as this to be available more quickly
    in the future should this occur again.
  7. Does this request meet the RHEL Inclusion criteria? (please review)
    Yes. This fits into minor revision for updates within the inclusion criteria.
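
One possible shape for the requested behaviour, sketched with the Python 3 standard library rather than the sos code of the time. The function name run_with_timeout and its return convention are assumptions made for illustration.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout):
    """Run argv and return (status, output).

    status is the exit code, or None if the command was killed because it
    exceeded `timeout` seconds.
    """
    try:
        proc = subprocess.run(argv, stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT, timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising TimeoutExpired.
        return None, "command %r timed out after %ss" % (argv, timeout)
    return proc.returncode, proc.stdout.decode(errors="replace")
```

The message returned on timeout names the command that hung, which is the piece of information the request asks for, and the caller can carry on with the remaining collection.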

Handle unreadable files that have read permissions better

Configurations with cgroup controllers get a lot of this:

Unable to copy /sys/fs/cgroup/memory/memory.memsw.failcnt to /sys/fs/cgroup/memory/memory.memsw.failcnt
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/sos/plugins/__init__.py", line 350, in doCopyFileOrDir
    self.archive.add_file(srcpath, dest)
  File "/usr/lib/python2.7/site-packages/sos/utilities.py", line 265, in add_file
    content = fp.read()
IOError: [Errno 95] Operation not supported

These files have read permissions but no read implementation so they return EOPNOTSUPP.
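
A hedged sketch of one way to tolerate such files: treat EOPNOTSUPP as "nothing to collect" and let every other error propagate. The helper name read_if_supported and its opener parameter are hypothetical and exist only to keep the example self-contained.

```python
import errno


def read_if_supported(path, opener=open):
    """Return the file's bytes, or None if the kernel refuses the read."""
    try:
        with opener(path, "rb") as fp:
            return fp.read()
    except (IOError, OSError) as e:
        if e.errno == errno.EOPNOTSUPP:
            # Readable by permission bits, but no read implementation.
            return None
        raise
```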

Adding / to tarball with 555 perms is bad, mkay?

# tar xf sosreport-rhel7-vm1-20121212152507.tar.xz
# ll -d sosreport-rhel7-vm1-20121212152507
dr-xr-xr-x. 6 root root    140 Nov 23 19:04 sosreport-rhel7-vm1-20121212152507

Attempting to unpack something like this as a non-root user gives:

tar: sosreport-hex.usersys.redhat.com-20121211231358/sys: Cannot mkdir: Permission denied
tar: sosreport-hex.usersys.redhat.com-20121211231358/sys/fs: Cannot mkdir: No such file or directory
tar: sosreport-hex.usersys.redhat.com-20121211231358/sys: Cannot mkdir: Permission denied
tar: sosreport-hex.usersys.redhat.com-20121211231358/sys/fs/cgroup: Cannot mkdir: No such file or directory
tar: sosreport-hex.usersys.redhat.com-20121211231358/sys: Cannot mkdir: Permission denied
tar: sosreport-hex.usersys.redhat.com-20121211231358/sys/fs/cgroup/cpu,cpuacct: Cannot mkdir: No such file or directory
tar: sosreport-hex.usersys.redhat.com-20121211231358/sys: Cannot mkdir: Permission denied
tar: sosreport-hex.usersys.redhat.com-20121211231358/sys/fs/cgroup/cpu,cpuacct/system: Cannot mkdir: No such file or directory
tar: sosreport-hex.usersys.redhat.com-20121211231358/sys: Cannot mkdir: Permission denied
tar: sosreport-hex.usersys.redhat.com-20121211231358/sys/fs/cgroup/cpu,cpuacct/system/cups.service: Cannot mkdir: No such file or directory
tar: sosreport-hex.usersys.redhat.com-20121211231358/sys: Cannot mkdir: Permission denied
tar: sosreport-hex.usersys.redhat.com-20121211231358/sys/fs/cgroup/cpu,cpuacct/system/cups.service/cpuacct.usage_percpu: Cannot open: No such file or directory
tar: sosreport-hex.usersys.redhat.com-20121211231358/sys: Cannot mkdir: Permission denied
tar: sosreport-hex.usersys.redhat.com-20121211231358/sys: Cannot mkdir: Permission denied
tar: sosreport-hex.usersys.redhat.com-20121211231358/sys/fs: Cannot mkdir: No such file or directory
tar: sosreport-hex.usersys.redhat.com-20121211231358/sys: Cannot mkdir: Permission denied

(i.e. 911,000!)

We can't break non-privileged access to sos data for obvious reasons.

This was broken by commit 179d9bb

import_plugin needs better debugging

Currently errors caught in import_plugin are propagated as a None, leading to a bland "TypeError: 'NoneType' object is not iterable" from sosreport.py:load_plugins().

I think the simplest fix is to just not handle the exception at all in that function: just pass it up to load_plugins() which already has exception handling controlled via the --debug switch.

sos plugins should have access to the files in their final destination

Sometimes files need to be edited before being packaged up. The regular expression feature is the only method to do this now, and sometimes it is far too limited to do the sorts of things some plugins need to do. For example, the JBOSS plugin removes passwords from configuration files that are stored as XML documents. Using a real XML parser would be a better tool than passing the document text through some regular expressions.

If there was a way to get direct access to the file and change it only in the archived version that would allow the use of other file editing tools.
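
To illustrate the XML case, here is a small sketch using the standard library parser. The element name "password" and the function name scrub_passwords are assumptions for the example; real JBoss configuration files use their own schemas and namespaces, which this does not handle.

```python
import xml.etree.ElementTree as ET


def scrub_passwords(xml_text, tag="password", mask="********"):
    """Return xml_text with the text of every <tag> element masked."""
    root = ET.fromstring(xml_text)
    for elem in root.iter(tag):
        elem.text = mask
    return ET.tostring(root, encoding="unicode")
```

Given access to the archived copy of a file, a plugin could read it, pass the text through something like this, and write the result back without touching the original on the host.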

Sosreport should try to create an archive in almost every situation

An issue was just resolved in the AS7 project that was tough to track down because the archive was not being created, and thus logfiles were not being created.

I think that sosreport should always try to write an archive, even in total failure, that contains whatever has been logged thus far. An exception to this rule will likely be the -l option, though this can fail in non-obvious ways and a log could be useful in that situation as well.

In addition to this I think that we may need to tweak logging to include more information by default, at least to files.

Package checks are completely borked

I noticed something funny going on with the postgresql module in testing. It only appeared to run when given -opostgresql despite having a packages list ("postgresql",) and that package being installed.

Adding some debug to the generic checkenabled() shows that isInstalled() never returns True:

checking packages:
('openssl',)
Is installed(openssl): False
checking packages:
('postgresql',)
Is installed(postgresql): False
checking packages:
['sanlock']
Is installed(sanlock): False

This might explain why in my testing I seemed to see a much smaller set of modules execute by default than I'd expect - I was putting it down to testing on a relatively minimal install (~418 packages) but I think this may be the real cause - it looks like we might only be triggering modules that also include a files list (e.g. sanlock above has one and runs, postgresql doesn't and doesn't...).

Make plugins available and listable for current running distro

To reduce complexity in deciding which plugins are shown, and to make it easier for ordinary users to choose a plugin, I feel we should show only the plugins relevant to the running distro while still giving the option to manually run some of the non-distro-specific plugins.

Some thoughts I've been contemplating are

sos/plugins/common.py
sos/plugins/debian
sos/plugins/redhat
sos/plugins/ubuntu

Or, within each plugin, have a build define that only makes certain code available based on the distro, similar to an ifdef for Windows.

as7 plugin doesn't handle unicode correctly

2012-10-02 16:19:34,140 ERROR: as7
Traceback (most recent call last):
  File "/jboss/jbosseap6/jboss-eap-6.0/modules/org/jboss/as/jdr/main/jboss-as-sos-7.1.2.Final-redhat-1.jar/sos/sosreport.py", line 657, in setup
    plug.setup()
  File "/jboss/jbosseap6/jboss-eap-6.0/modules/org/jboss/as/jdr/main/jboss-as-sos-7.1.2.Final-redhat-1.jar/sos/plugins/as7.py", line 221, in setup
    self.__getStdJarInfo()
  File "/jboss/jbosseap6/jboss-eap-6.0/modules/org/jboss/as/jdr/main/jboss-as-sos-7.1.2.Final-redhat-1.jar/sos/plugins/as7.py", line 98, in _AS7__getStdJarInfo
    "%s\n%s\n%s\n===\n" % (name, checksum, manifest)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 638: ordinal not in range(128)

Push globbing down into doRegexSub()

Unlike addCopySpec*() our regex substitution routine (mostly used for post-processing edits, e.g. password elision) does not accept globs.

At least the following modules are implementing something like this themselves today:

cluster.py
libvirt.py
as7.py
jboss.py

What about pushing at least the glob handling down? Some of these use a find style of op so maybe a more general interface?

plugin options should be able to accept lists

Options passed to plugins are parsed by the wrong option parser. Therefore lists of option parameters such as --myplugin.list_opt=foo,bar,baz will result in only "foo" being available.

Using the "append" option parser should fix this.
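
As a rough sketch of the "append" approach, here is the same idea with argparse from the Python 3 standard library. The -k flag, the name=value syntax and the function name parse_plugin_opts are illustrative assumptions, not the parser sos used at the time.

```python
import argparse


def parse_plugin_opts(argv):
    """Parse repeated -k name=value options into {name: [values...]}."""
    parser = argparse.ArgumentParser()
    # action="append" keeps every occurrence instead of only the last one.
    parser.add_argument("-k", dest="plugopts", action="append", default=[])
    args = parser.parse_args(argv)
    opts = {}
    for item in args.plugopts:
        name, _, value = item.partition("=")
        # Comma-separated values are split into a list.
        opts.setdefault(name, []).extend(value.split(","))
    return opts
```

With this, both a single comma-separated value and the same option repeated several times end up as one list.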

Class dispatcher based on distro specific plugins

Hey there

I've got some work going on to have sosreport running on Ubuntu boxes. One problem I'm seeing is that even though we have 'RedHatPlugin', 'IndependentPlugin' and, for my uses, 'UbuntuPlugin', there is still no decent way of deciding which commands should or shouldn't be run depending on the distro.

My suggestion (and I am going to work on this) is to have some sort of class dispatcher for the plugins. For example, in the general plugin we could have:

    class general(Plugin, IndependentPlugin):
        def setup(self):
            self.collectoutput("some non specific data")

    class generalRHEL(Plugin, RedHatPlugin):
        def setup(self):
            self.collectoutput("some rhel specific like RHN")

    class generalUbuntu(Plugin, UbuntuPlugin):
        def setup(self):
            self.collectoutput("some ubuntu/debian specifics")

In our plugin class dispatcher we would do some sort of mapping like:

{'IndependentPlugin' : '%s' % (plugname,),
'RedHatPlugin': '%s%s' % (plugname, DistroPluginSpecific)}

This way we aren't littering 1 class with data to collect that may not pertain to the underlying distro.

What do you guys think?

Thanks
Adam

No API for performing substitution on collected command output

We currently lack an interface to allow regex substitutions on the output collected from external programs (a-la doRegexSub() for files added with addCopySpec*()).

There is a need to support this now since corosync's objdump -a will reveal fence agent passwords:

[...]
cluster.fencedevices.fencedevice.name=CLUSTERNODENAME2-ipmi
cluster.fencedevices.fencedevice.passwd=PASSWORD
[...]

I've added a first cut on branch bmr-ext-cmds-postproc. I'm not that thrilled with it but it does seem to work. It adds a new call, doRegexExtCommandSub() (maybe doExtCommandRegexSub() is better..?) with the interface:

def doRegexExtOutputSub(self, cmd, regexp, subst):
    '''Apply a regexp substitution to command output archived by sosreport.
    cmd is the command name from which output is collected (i.e. excluding
    parameters). The regexp can be a string or a compiled re object. The
    substitution string, subst, is a string that replaces each occurrence
    of regexp in each file collected from cmd. Internally 'cmd' is treated
    as a glob with a trailing '*' and each matching file from the current
    module's command list is subjected to the replacement.

    This function returns the number of replacements made.
    '''

which takes a command name (basename) to substitute (cmd). This causes each file in the archive matching "sos_commands/self.name()/cmd" to be subjected to the supplied regex substitution. I.e. if the cmd passed was "ssh" then files ssh_-v, openssh_-v and ssh-agent_-v would all match. This is kinda sloppy but I think works for the vast majority of cases that I can think of. We could tighten this up if desired.
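
For discussion, the matching and substitution step can be reduced to the sketch below. It works on an in-memory dict rather than the real archive, the function name sub_command_output is hypothetical, and it follows the docstring's "trailing '*'" rule, so a cmd of "ssh" would match ssh_-v and ssh-agent_-v but not openssh_-v.

```python
import fnmatch
import re


def sub_command_output(outputs, cmd, regexp, subst):
    """Apply regexp -> subst to every collected output whose name matches cmd*.

    outputs maps archive file names to their text and is modified in place.
    regexp may be a string or a compiled pattern. Returns the number of
    replacements made.
    """
    total = 0
    for name in fnmatch.filter(list(outputs), cmd + "*"):
        outputs[name], n = re.subn(regexp, subst, outputs[name])
        total += n
    return total
```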

Generally I think this interface is a bit clunky and I was also considering whether it's better to add this in-line to the command output collection routines. That would mean some wide-ranging interface changes but might be a better API to use.

Would really value some feedback on this one - it's important we have the ability to elide passwords and other sensitive information in the data we collect so it's going to be hard to dodge the need for this.

Set sane permissions for sos directories

When building a tree in /tmp files get umasked to the user's current setting. With tarfile they have to be set explicitly:

drwxrwxrwx. 26 breeves breeves 4096 Dec 12 23:37 sos_commands
drwxrwxrwx. 2 breeves breeves 4096 Dec 12 23:37 sos_logs
drwxrwxrwx. 2 breeves breeves 4096 Dec 12 23:37 sos_reports

It's a bad idea to create world-writable files on the host where the report is unpacked.
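
A small sketch of setting the mode explicitly with the standard tarfile module. The helper name add_directory and the 0o755 default are assumptions for illustration; the right default is a policy decision.

```python
import tarfile


def add_directory(tar, name, mode=0o755):
    """Add a directory entry to an open TarFile with an explicit mode."""
    info = tarfile.TarInfo(name)
    info.type = tarfile.DIRTYPE
    # The mode recorded in the archive is exactly what is set here; the
    # process umask plays no part when a TarInfo is built by hand.
    info.mode = mode
    tar.addfile(info)
    return info
```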

When copying directory into report using addCopySpec, links inside are not handled correctly

Description of problem:

It seems to be a general problem, which shows eg in general plugin:

  • there is addCopySpec("/etc/sysconfig") statement
  • /etc/sysconfig/selinux is a link to /etc/selinux/config
  • because of this bug, only link is copied, resulting in missing real file and
    having broken link in report

Version-Release number of selected component (if applicable):

sos-1.7-9.54.el5

How reproducible:

Always

Steps to Reproduce:

  1. sosreport -o general
  2. unpack report and list etc/sysconfig/selinux

Actual results:

link is broken, destination file not present in report

Expected results:

destination file should be gathered, too

Additional info:

Improve --profile coverage

The profile logging is useful for pinning down changes that increase the runtime of sosreport. This was heavily used to tune performance up to the 2.2 release (dramatic improvements compared to 1.x - in many cases dropping from 5-10m to <1m).

Previous causes of long runtimes have generally been:

  • long-running external programs (e.g. rpm -Va)
  • very large log file collections
  • inefficient data structure use

Our runtime has grown quite considerably since 2.2 (although still not exceeding 1m on typical runs for a lightly-configured host - I see a consistent average of ~35s with the default plugin set on my VMs) but I'm concerned that we might see bigger differences on some setups.

Right now the profile log only captures intervals for command and copy execution:

output: /bin/mount -l time: 0.011265
output: /sbin/lsmod time: 0.030516
output: /sbin/ip -o addr time: 0.007033
copied: /root/anaconda-ks.cfg time: 0.001559
copied: /var/log/anaconda/anaconda.log time: 0.001696
copied: /var/log/anaconda/syslog time: 0.001768

Other "interesting" parts of the run (e.g. policy and package manager initialisation, tarball generation and compression) are not covered.
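
One low-effort way to cover those phases would be a timing context manager that emits lines in the same "label time: seconds" shape as the existing log. This is a sketch only: the name profiled and the list-based log are hypothetical stand-ins for whatever the profile logger actually uses.

```python
import time
from contextlib import contextmanager


@contextmanager
def profiled(label, log):
    """Time the enclosed block and append 'label time: seconds' to log."""
    start = time.time()
    try:
        yield
    finally:
        log.append("%s time: %f" % (label, time.time() - start))
```

Wrapping, say, the compression step in "with profiled('compress', log):" would then record its duration even if the step raises.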

Package API regression

The old policy object package query API leaked abstractions into plugin code since it returns RPM header objects from the rpm-python modules.

This was cleaned up during the major updates to sos but has some problems. Currently when a package is queried (via pkgByName(), allPkgsByName(), allPkgsByNameRegex()) the return value is a simple string (or generated array of strings) giving the package name and no other details.

This makes it impossible for modules that want to do package version checks to work; currently this breaks the redhat policy's rhelVersion() and the Gluster module's gluster package version check.

I don't want to lose the improved abstraction of the PM object but I think we need to do better in terms of the information we make available. I'll try a first cut of returning our own dictionary objects that mimic the RPM header names (we need a standard to follow - at least for some core set of fields).

For now I intend to leave the shell-out rpm -qa machinery in place but I'm seeing considerable runtime increases compared to sos-2.2's rpm-python based code so this may need to be revised as well at some point.

Links in HTML reports are broken

Links in html reports have a couple of problems, e.g.:

<li><a href="/etc/audit/auditd.conf">/etc/audit/auditd.conf</a></li>
<li><a href="sos_commands/yum/yum_-C_repolist">/usr/bin/yum -C repolist</a></li>

Need to fix this to use the relative path in the href as in 2.2:

<li><a href="../proc/filesystems">/proc/filesystems</a></li>
<li><a href="../sos_commands/filesys/mount_-l">/bin/mount -l</a></li>
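
The relative href can be computed rather than hand-built. The sketch below assumes the HTML report lives in a sos_reports directory directly under the archive root; the function name report_link is hypothetical.

```python
import os.path
from html import escape


def report_link(archive_path, label, report_dir="sos_reports"):
    """Build an <li> link from a report in report_dir to archive_path.

    archive_path is taken relative to the archive root; a leading "/" is
    stripped so host paths such as "/etc/audit/auditd.conf" also work.
    """
    href = os.path.relpath(archive_path.lstrip("/"), start=report_dir)
    return '<li><a href="%s">%s</a></li>' % (escape(href), escape(label))
```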

feat req: possibly use cgitb for better debug tracing?

Just throwing this out there.. I've been using cgitb to capture a more in-depth traceback for debugging python issues and it works really well. It's part of the python stdlib as of 2.2 and I don't believe there is much to do in order to integrate. As far as I can tell it is a matter of altering the sys.excepthook callback.

Thoughts?

Deal with CopySpecs that include kernel file system directories better

Currently scooping up a whole directory tree from /sys or /proc can trigger lots of ugly errors in -v mode. There's also an associated problem of triggering dmesg warnings when we touch deprecated files (e.g. several ipv6 sysctls).

There's a few ways we could deal with this:

  • Check read permissions, skip files with --w--w--w-. and truncate/chmod instead
  • Add an additional parameter to suppress all IO errors
  • Add some kind of path black/whitelisting

The last is probably a more maintainable way to cope with these problems long-term but in the short term the others are much easier to implement with our current structure.
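
The blacklisting option might look something like this sketch. The patterns are illustrative only and the function names are hypothetical; a real list would need to be curated per kernel version.

```python
import fnmatch

# Illustrative patterns only; a real list would need to be curated.
FORBIDDEN = ["/proc/kcore", "/sys/kernel/debug/*"]


def is_forbidden(path, patterns=FORBIDDEN):
    """Return True if path matches any blacklisted glob pattern."""
    return any(fnmatch.fnmatch(path, pat) for pat in patterns)


def filter_paths(paths, patterns=FORBIDDEN):
    """Drop blacklisted paths before they are handed to the copy routine."""
    return [p for p in paths if not is_forbidden(p, patterns)]
```

Note that fnmatch lets "*" match across "/" as well, so a single pattern covers a whole subtree.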

sosreport French translation of y/n prompt is wrong and confusing

Unfortunately this is not fixed, really:

# rpm -q sos
sos-1.7-9.54.el5
# LANG=fr sosreport -o cluster

sosreport (version 1.7)

This utility will collect some detailed  information about the
hardware and  setup of your  Red Hat Enterprise Linux  system.
The information is collected and an archive is  packaged under
/tmp, which you can send to a support representative.
Red Hat will use this information for diagnostic purposes ONLY
and it will be considered confidential information.

This process may take a while to complete.
No changes will be made to your system.

Press ENTER to continue, or CTRL-C to quit.

Un ou plusieurs plugins ont détecté un problème avec votre configuration.
Prière de vérifier les messages suivants:

cluster:
    * required module is not loaded: dlm
    * service cman is not running
    * service cman is not started in default runlevel
    * service rgmanager is not running
    * service rgmanager is not started in default runlevel
    * cluster node is not quorate
    * one or more nodes have manual fencing agent configured (data integrity is
not guaranteed)
    * one or more nfs export do not have a fsid attribute set.

Voulez vous continuer (y/n)?y
Voulez vous continuer (y/n)?o

Prière d'entrer votre première initiale et votre nom [hp-rx2660-03]: 

It is wrong in sos.po in sos-1.7.tar.gz and not fixed by any patch.

ability to fileGrep multiple files

In order to reduce littering duplicate plugins for different distros, I think it would be beneficial to have a routine that supports multiple files in the fileGrep or a separate routine.

For example, the autofs plugin. On RHEL the system config file is /etc/sysconfig/autofs, whereas on Debian/Ubuntu it's /etc/default/autofs5. It would be nice to pass in a list of files to regex for if debugging is enabled.
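
A minimal sketch of a multi-file variant, written as a free function for clarity. The name file_grep and its signature are assumptions; the existing fileGrep may differ.

```python
import re


def file_grep(regexp, *fnames):
    """Return matching lines from every file in fnames that can be read."""
    pattern = re.compile(regexp)
    matches = []
    for fname in fnames:
        try:
            with open(fname) as f:
                matches.extend(line for line in f if pattern.search(line))
        except (IOError, OSError):
            # Missing on this distro; move on to the next candidate.
            continue
    return matches
```

The autofs plugin could then pass both /etc/sysconfig/autofs and /etc/default/autofs5 in one call and let the missing one be skipped.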

doRegexSub doesn't replace the content in the archive properly

As a product of moving to a spooled archive rather than a tmp directory for building sosreports the doRegexSub function was modified to work with files already in the archive.

The new implementation does not actually replace the file being modified; it makes a copy in the sos_strings directory with the replacements made. This is not the old behavior.

This issue is mostly a note for me. Ideally I'd like to be able to restore the old behavior. But the implementation isn't quite as simple as just modifying a file on disk.

Add a 'services' member to Plugin?

We already have "files" and "packages" but some modules implement runlevel checks to gather service information, e.g. sunrpc:

    def checkenabled(self):
        if self.policy().runlevelDefault() in self.policy().runlevelByService("rpcbind"):
            return True
        return False

This could be made more generic and allow easier maintenance by adding a "services" list and checking it in the default checkenabled() implementation.

That way we could avoid duplicate code between Red Hat and Debian which use different names for this service.
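
A rough sketch of the idea, assuming a simplified stand-in for the Plugin base class. Only the proposed services check is shown; the existing files and packages checks are left out. `FakePolicy` and the service names in `SunRPC.services` are illustrative assumptions:

```python
class Plugin:
    """Simplified stand-in for the sos Plugin base class (hypothetical)."""

    files = ()
    packages = ()
    services = ()  # proposed: service names that enable the plugin

    def __init__(self, policy):
        self._policy = policy

    def policy(self):
        return self._policy

    def checkenabled(self):
        policy = self.policy()
        default = policy.runlevelDefault()
        # Enabled if any listed service starts in the default runlevel.
        return any(default in policy.runlevelByService(name)
                   for name in self.services)


class SunRPC(Plugin):
    # One list covers the names used on different distributions
    # (names here are only examples).
    services = ("rpcbind", "portmap")


class FakePolicy:
    """Test double standing in for a distribution policy."""

    def runlevelDefault(self):
        return 3

    def runlevelByService(self, name):
        return [3, 5] if name == "rpcbind" else []
```

With this, `SunRPC(FakePolicy()).checkenabled()` is true without the plugin having to override checkenabled() at all.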

Add checkenabled for as7 plugin

as7 should have a checkenabled routine since at the moment it runs with every sosreport iteration. I'm not sure what packages or files to look for in order to determine this.

Thanks

All Debian pkg*() are broken

All the package checks on Debian are broken. This might explain the dramatically lower module coverage (22 vs. 50 or so on Red Hat) I've been seeing.

The cause seems to be a bad format string passed to dpkg:

"dpkg-query -W -f='${Package}|${Version}\n' *"

The double-escaped quote seems to go right through leading to piles of unreadable goo like:

sun-java5-jre|\nsun-java6-jre|\nsunbird|\nsvn-buildpackage|\nswat|\nsynaptic|\nsyslinux|2:4.05+dfsg-6\nsyslinux-common|2:4.05+dfsg-6\nsyslinux-legacy|2:3.63+dfsg-2ubuntu5\nsysstat|\nsystem-config-printer|\nsystem-config-printer-common|1.3.11+20120807-0ubuntu10\nsystem-config-printer-gnome|1.3.11+20120807-0ubuntu10\nsystem-config-printer-kde|\nsystem-config-printer-udev|1.3.11+20120807-0ubuntu10\nsystem-log-daemon|\nsystem-services|\nsysv-rc|2.88dsf-13.10ubuntu13\nsysv-rc-conf|\nsysvconfig|\nsysvinit|\nsysvinit-utils|2.88dsf-13.10ubuntu13\ntango-icon-theme|\ntar|1.26-4ubuntu1\ntcl8.5|\n
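
One possible way around the quoting problem, sketched under assumptions: pass the arguments as a list so no shell quoting is involved, and let dpkg-query expand the `\n` escape itself. The function names are hypothetical, the trailing `*` pattern is dropped on the assumption that dpkg-query -W lists all known packages without it, and skipping entries with an empty version is also an assumption about what the callers want:

```python
import subprocess

# argv as a list: no shell, so no quotes to escape. The backslash-n is
# passed through literally and interpreted by dpkg-query.
DPKG_QUERY = ["dpkg-query", "-W", "-f=${Package}|${Version}\\n"]


def parse_package_list(output):
    """Turn 'name|version' lines into a dict, skipping empty versions."""
    packages = {}
    for line in output.splitlines():
        name, sep, version = line.partition("|")
        if sep and version:
            packages[name] = version
    return packages


def installed_packages():
    """Run dpkg-query (Debian-like systems only) and parse its output."""
    output = subprocess.check_output(DPKG_QUERY, universal_newlines=True)
    return parse_package_list(output)
```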

sosGetCommandOutput doesn't actually use timeout

The API function for shelling out takes a timeout parameter but does nothing with it. It would be nice to have a real timeout feature, but if we aren't going to have one the parameter should probably be removed.
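
For reference, a real timeout could be sketched roughly like this with the standard subprocess module. The function name and return convention are assumptions; the command is split into an argument list rather than run through a shell, so that the timed-out process is the one that gets killed:

```python
import shlex
import subprocess


def get_command_output(command, timeout=300):
    """Run `command`; return (status, output), with status None on timeout."""
    try:
        result = subprocess.run(shlex.split(command),
                                stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT,
                                timeout=timeout,
                                universal_newlines=True)
    except subprocess.TimeoutExpired:
        # subprocess.run() has already killed the child at this point.
        return None, ""
    return result.returncode, result.stdout
```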

Provide a reporting API

Plugins need a way to provide information to the XML/HTML report. Some things that need to be available are ways to make sections of text as well as bits that collapse/expand. The ability to mark text as pre-formatted is necessary as well.

OpenStack module collects secrets

Current OpenStack collects /etc/keystone:

    # Keystone
    self.addCopySpecs(["/etc/keystone/",
                       "/var/log/keystone/",
                       "/etc/logrotate.d/keystone"])

This will pick up /etc/keystone/keystone.conf which contains:

[keystone_authtoken]
...
admin_password = servicepass

This either needs to be forbidden as a path (exclude whole file) or post-processed to obscure the passwords.
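
The post-processing option might look something like the sketch below. The helper name and the pattern (any key ending in "password") are assumptions chosen for illustration; a real plugin would need to decide which keys count as secrets:

```python
import re

# Matches "<anything>password = value" on a single line; the value is
# everything after the equals sign.
_PASSWORD_LINE = re.compile(r"^([ \t]*\w*password[ \t]*=[ \t]*).*$",
                            re.MULTILINE | re.IGNORECASE)


def obscure_passwords(text):
    """Replace the value of every password-like setting with asterisks."""
    return _PASSWORD_LINE.sub(r"\1********", text)
```

Applied to the snippet above, `admin_password = servicepass` would become `admin_password = ********` while other settings are left alone.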

Change how SoS Creates the Archive

Currently SoS shells out to tar and xz to create the archive for uploading to dropbox. This will not work on all platforms. We should be smarter about how we select compression technologies and choose technologies that are appropriate for the platform that the tool is running on.
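
One way to avoid the external binaries, sketched with the standard tarfile module: probe which compression modules the running interpreter provides and fall back to a plain tar if none are available. The function names and the preference order are assumptions:

```python
import importlib
import tarfile


def pick_compression():
    """Return the best tarfile compression suffix available, or ''."""
    for suffix, module in (("xz", "lzma"), ("bz2", "bz2"), ("gz", "zlib")):
        try:
            importlib.import_module(module)
            return suffix
        except ImportError:
            continue
    return ""  # plain, uncompressed tar


def create_archive(source_dir, archive_base, arcname):
    """Pack `source_dir` into `archive_base`.tar[.suffix]; return the path."""
    suffix = pick_compression()
    path = archive_base + ".tar" + ("." + suffix if suffix else "")
    with tarfile.open(path, "w:" + suffix) as archive:
        archive.add(source_dir, arcname=arcname)
    return path
```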

Post-processed files end up in archive twice (security, usability)

We have a problem with post-processed (e.g. do*Sub()'ed) files:

# tar tvf sosreport-rhel7-vm1-20121212172335.tar.xz|grep anaconda-ks
-rw------- 0/0             903 2012-11-23 19:09 sosreport-rhel7-vm1-20121212172335/root/anaconda-ks.cfg
-rw------- 0/0             791 2012-11-23 19:09 sosreport-rhel7-vm1-20121212172335/root/anaconda-ks.cfg

After substitution we end up with a second copy of the file in the tarball. This breaks unpacking as an unprivileged user horribly:

$ tar xf sosreport-hex.usersys.redhat.com-20121212163134.tar.xz
tar: sosreport-hex.usersys.redhat.com-20121212163134/root/anaconda-ks.cfg: Cannot open: File exists
tar: sosreport-hex.usersys.redhat.com-20121212163134/root/anaconda-ks.cfg: Cannot open: File exists
tar: Exiting with failure status due to previous errors

This is a security problem, since the un-substituted content remains in the tarball. It is also a usability problem for certain paths (e.g. /root and files under /), since the fs permissions were lowered as part of the fscaps project: /root is now 550, so a normal user cannot unlink the results of extracting the tarball because they don't have write perms on $SOSROOT/root (they can set them, as they own the path, but a typical rm -rf fails):

$ tar xf sosreport-hex.usersys.redhat.com-20121212163134.tar.xz
tar: sosreport-hex.usersys.redhat.com-20121212163134/root/anaconda-ks.cfg: Cannot open: File exists
tar: sosreport-hex.usersys.redhat.com-20121212163134/root/anaconda-ks.cfg: Cannot open: File exists
$ tar tvf sosreport-hex.usersys.redhat.com-20121212163134.tar.xz | grep anaconda-ks
-rw------- 0/0            1677 2012-07-11 12:42 sosreport-hex.usersys.redhat.com-20121212163134/root/anaconda-ks.cfg
-rw------- 0/0            1565 2012-07-11 12:42 sosreport-hex.usersys.redhat.com-20121212163134/root/anaconda-ks.cfg
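
A sketch of one way to guarantee a single entry per path: stage content keyed by destination and only write the tar at the end, so a post-processed version replaces the original instead of being appended after it. The class `StagedArchive` is hypothetical and keeps everything in memory, which a real implementation probably would not:

```python
import io
import tarfile


class StagedArchive:
    """Hypothetical archive that holds at most one entry per path."""

    def __init__(self):
        self._entries = {}

    def add_string(self, content, dest):
        # A later write to the same dest replaces the earlier one.
        self._entries[dest] = content.encode("utf-8")

    def write(self, fileobj):
        with tarfile.open(fileobj=fileobj, mode="w") as archive:
            for dest in sorted(self._entries):
                data = self._entries[dest]
                info = tarfile.TarInfo(name=dest)
                info.size = len(data)
                archive.addfile(info, io.BytesIO(data))
```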

Post processing methods don't log

We log every command or file collected but currently do not log either the intent to apply a substitution or any errors that happen in the process. This can lead users to a false sense of security when a substitution is designed to obscure secrets.

do*Sub() should use debug and error logging to capture this information.
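
Roughly what that could look like, as a sketch only. The function name, the logger name "sos" and the choice to return the substitution count are assumptions, and the file is modified on disk for simplicity:

```python
import logging
import re

log = logging.getLogger("sos")


def do_regex_sub(path, regexp, subst):
    """Apply a substitution to `path` in place and log what happened."""
    log.debug("substituting '%s' for '%s' in %s", subst, regexp, path)
    try:
        with open(path, "r") as handle:
            content = handle.read()
        result, count = re.subn(regexp, subst, content)
        with open(path, "w") as handle:
            handle.write(result)
    except (OSError, re.error) as err:
        # A failed substitution may leave secrets in place: say so loudly.
        log.error("substitution failed for %s: %s", path, err)
        return 0
    log.debug("made %d substitution(s) in %s", count, path)
    return count
```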

tar archive has lost ownership/permissions

The new archive infrastructure doesn't preserve ownership and permissions for the files it collects:

# chown logger:logger /var/log/messages
# ll /var/log/messages
-rw-------. 1 logger logger 4274703 Dec  6 18:23 /var/log/messages
# sosreport  --batch
[...]
# tar xf sosreport-*.xz
# ll sosreport-*/var/log/messages
-rw-r--r--. 1 root root 4274703 Dec  6 18:23 sosreport-localhost.localdomain-20121206184251/var/log/messages

This is pretty easy to fix to bring back in line with previous functionality but we've got a request now to also capture SELinux contexts - tbh it might also be a good idea to be getting ACLs and xattrs too. Need to check if these are available in the tarfile module - if not we may need to fall back to callouts.
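
For the ownership and permissions part, the tarfile module can already fill these in from the file itself. A minimal sketch, with a hypothetical helper name; as far as I can tell tarfile does not collect SELinux contexts, ACLs or xattrs on its own, so those would still need separate handling:

```python
import os
import tarfile


def add_preserving_metadata(archive, path, arcname):
    """Add `path` to an open tarfile, keeping mode, owner and mtime."""
    # gettarinfo() reads mode/uid/gid/mtime from the file on disk.
    info = archive.gettarinfo(path, arcname=arcname)
    if info.isreg():
        with open(path, "rb") as handle:
            archive.addfile(info, handle)
    else:
        archive.addfile(info)
    return info
```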

dry run option

It is very difficult to predict all the commands that sosreport will run on a
system, and in many cases sosreport can hang if one of the commands does not
respond.

sosreport should have a dry run mode, which will list all the commands that it
will run without actually running them.

Suggestion is:

# sosreport --dry-run

The output should be a list of commands to stdout, with extra info if deemed
suitable to stderr, so that it can be easily grepped.
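
A sketch of how this might be wired in, assuming all plugins run their commands through one wrapper. `CommandRunner`, its `executor` callback and the `plugin` argument are hypothetical names used only for illustration:

```python
import sys


class CommandRunner:
    """Hypothetical wrapper that every plugin would use to run commands."""

    def __init__(self, dry_run=False, executor=None):
        self.dry_run = dry_run
        self.executor = executor  # callable that really runs a command

    def run(self, command, plugin="unknown"):
        if self.dry_run:
            print(command)  # the command list goes to stdout
            print("plugin: %s" % plugin, file=sys.stderr)  # extras to stderr
            return None
        return self.executor(command)
```

Keeping the extra information on stderr means `sosreport --dry-run | grep lvm` would only ever see command lines.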

Sysfs tree collection is useless for trees with links

Currently our sysfs collection is a bit lacking. The hardware module has had a call for "/sys/bus/scsi" for years but all it manages to collect is:

tree /tmp/rhel6-vm1-2012113010231354271013/sys/bus/scsi/

/tmp/rhel6-vm1-2012113010231354271013/sys/bus/scsi/
├── drivers
│   ├── sd
│   └── sr
└── drivers_autoprobe

Much more interesting would be:

tree /sys/bus/scsi/

/sys/bus/scsi/
├── devices
│   ├── 1:0:0:0 -> ../../../devices/pci0000:00/0000:00:01.1/host1/target1:0:0/1:0:0:0
│   ├── 2:0:0:0 -> ../../../devices/pci0000:00/0000:00:08.0/host2/target2:0:0/2:0:0:0
│   ├── 2:0:1:0 -> ../../../devices/pci0000:00/0000:00:08.0/host2/target2:0:1/2:0:1:0
│   ├── 2:0:2:0 -> ../../../devices/pci0000:00/0000:00:08.0/host2/target2:0:2/2:0:2:0
│   ├── 2:0:3:0 -> ../../../devices/pci0000:00/0000:00:08.0/host2/target2:0:3/2:0:3:0
│   ├── host0 -> ../../../devices/pci0000:00/0000:00:01.1/host0
│   ├── host1 -> ../../../devices/pci0000:00/0000:00:01.1/host1
│   ├── host2 -> ../../../devices/pci0000:00/0000:00:08.0/host2
│   ├── target1:0:0 -> ../../../devices/pci0000:00/0000:00:01.1/host1/target1:0:0
│   ├── target2:0:0 -> ../../../devices/pci0000:00/0000:00:08.0/host2/target2:0:0
│   ├── target2:0:1 -> ../../../devices/pci0000:00/0000:00:08.0/host2/target2:0:1
│   ├── target2:0:2 -> ../../../devices/pci0000:00/0000:00:08.0/host2/target2:0:2
│   └── target2:0:3 -> ../../../devices/pci0000:00/0000:00:08.0/host2/target2:0:3
├── drivers
│   ├── sd
│   │   ├── 2:0:0:0 -> ../../../../devices/pci0000:00/0000:00:08.0/host2/target2:0:0/2:0:0:0
│   │   ├── 2:0:1:0 -> ../../../../devices/pci0000:00/0000:00:08.0/host2/target2:0:1/2:0:1:0
│   │   ├── 2:0:2:0 -> ../../../../devices/pci0000:00/0000:00:08.0/host2/target2:0:2/2:0:2:0
│   │   ├── 2:0:3:0 -> ../../../../devices/pci0000:00/0000:00:08.0/host2/target2:0:3/2:0:3:0
│   │   ├── bind
│   │   ├── uevent
│   │   └── unbind
│   └── sr
│       ├── 1:0:0:0 -> ../../../../devices/pci0000:00/0000:00:01.1/host1/target1:0:0/1:0:0:0
│       ├── bind
│       ├── uevent
│       └── unbind
├── drivers_autoprobe
├── drivers_probe
└── uevent

22 directories, 9 files

The problem is we ignore symlinked directories which are fundamental to the layout of /sys. Hacking out that check unfortunately takes us way off into the weeds; we can't detect that symlinks in lower-level directories link back to higher levels that have already been traversed.

To do this properly we need a tree copying function that can handle these symlinks correctly.
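
One possible shape for such a function, as a sketch: walk the tree without following links and recreate each symlink as a symlink, so loops back to higher levels are never traversed. The name `copy_tree_keep_links` is hypothetical, the destination is assumed not to exist yet, and relative links will only resolve in the copy if their targets are collected too:

```python
import os
import shutil


def copy_tree_keep_links(src, dst):
    """Copy `src` to `dst`, recreating symlinks instead of following them."""
    # os.walk() defaults to followlinks=False, so it never descends into
    # symlinked directories; they still show up in `dirs`.
    for root, dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_root = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_root, exist_ok=True)
        for name in dirs + files:
            source = os.path.join(root, name)
            target = os.path.join(target_root, name)
            if os.path.islink(source):
                os.symlink(os.readlink(source), target)
            elif os.path.isfile(source):
                try:
                    shutil.copyfile(source, target)
                except OSError:
                    pass  # some sysfs attributes cannot be read
```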

Move pid 1 (init) stuff to startup

I'd like to propose moving all the init/startup related files from the general module to the startup module.

It seems a better place for them for a start (and helps de-clutter the general god module a bit) and we'll need to add more here to support systemd - seems sanest to have it all in one place.
