
recap's Introduction

Recap


recap is a system status reporting tool: a script that generates reports with various information about the server.

Contribution

Contribution guidelines can be found in CONTRIBUTING.md

Dependencies

  • bash >= 4
  • coreutils
  • gawk
  • grep
  • iotop
  • iproute/iproute2
  • elinks
  • procps
  • psmisc
  • sysstat >= 9
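Before installing, it can help to confirm that the tools behind these packages are actually on the PATH. The following is a minimal sketch, not part of recap; the function names are hypothetical, and the command names in the usage comment are assumptions about what each package ships.

```bash
#!/usr/bin/env bash
# Print every command from the argument list that is not found in PATH.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Succeed when the running bash is at least version 4.
bash_is_at_least_4() {
    (( BASH_VERSINFO[0] >= 4 ))
}

# Usage (command names are assumptions, packages may name them differently):
#   missing_commands gawk grep iotop ss elinks ps pstree sar iostat
```

An empty result means every listed command was found; anything printed is a candidate for installation.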

Versioning

recap is following the x.y.z versioning as defined below:

  • x (major) - Changes that prevent at least some rolling upgrades.
  • y (minor) - Changes that don't break any rolling upgrades but require closer user attention for example configuration defaults, function behavior, tools used to produce reports, among others.
  • z (patch) - Changes that are backward-compatible including features and/or bug fixes.
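To make the scheme concrete, the sketch below classifies the change between two version strings. It is only an illustration of the x.y.z rules above; the function name and the version numbers used with it are hypothetical.

```bash
#!/usr/bin/env bash
# Classify the change between two x.y.z versions as major, minor, patch or none.
upgrade_kind() {
    local old_major old_minor old_patch new_major new_minor new_patch
    IFS=. read -r old_major old_minor old_patch <<< "$1"
    IFS=. read -r new_major new_minor new_patch <<< "$2"
    if [[ "$old_major" != "$new_major" ]]; then
        echo major   # may prevent a rolling upgrade
    elif [[ "$old_minor" != "$new_minor" ]]; then
        echo minor   # review defaults, behavior and tools before upgrading
    elif [[ "$old_patch" != "$new_patch" ]]; then
        echo patch   # backward-compatible
    else
        echo none
    fi
}
```

For example, `upgrade_kind 1.4.0 2.0.0` reports a major change, which is the case that deserves a careful read of the changelog before upgrading.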

Installation

Installing recap from a package is highly recommended; it is the easiest way to keep it updated whenever there is a new release.

Fedora

recap is available from the main Fedora repository (spec file).

dnf install recap

RHEL/CentOS

recap is available from the EPEL repository (spec file).

yum install recap

Debian

Currently only available in testing and unstable. For other releases see the options to build a deb package or install from source.

The official Debian files are available in this repository

Ubuntu

At the moment there is no public repository for Ubuntu. Two options are available: build a deb package or install manually; see the instructions below.

Build a deb package

This repository contains the official Debian files required to build a deb package.

The following steps were used to build the deb package; use them as a guide:

# Install all the packages required for building the package
apt-get update
apt-get install debhelper devscripts git -y

## For Ubuntu:
apt-get install equivs -y

# Create the working dir:
mkdir recap
cd recap

# Get the Debian configs
git init
git remote add origin https://github.com/jkirk/recap.git
git fetch --no-tags origin
git checkout -qf FETCH_HEAD
git submodule update --init --recursive
export LATEST=$( git log --format="%h" --no-merges -1 )

# Build dependencies
echo "yes" | mk-build-deps --install --remove debian/control

# Get upstream recap code
git checkout --orphan upstream
git reset --hard
git remote add upstream https://github.com/rackerlabs/recap.git
git fetch -t upstream
latest_tag=$( git tag | tail -1 )
git archive ${latest_tag} -o ../recap_${latest_tag}.orig.tar.gz
tar -zxf ../recap_${latest_tag}.orig.tar.gz
git fetch --no-tags origin
git checkout ${LATEST} -- debian

# Build the package
debuild -us -uc --lintian-opts --profile debian

# Package will be created in ../recap_${latest_tag}-<RELEASE>_all.deb
# RELEASE comes from the changelog in the Debian repository.

Manual

  1. Install the required dependencies.
  2. Clone this repository: git clone https://github.com/rackerlabs/recap.git
  3. Change into the new directory: cd recap
  4. Install the program: sudo make install

The information captured will be found in log files in the /var/log/recap/ directory.

About the locations of the scripts

  • The default install root is "/"; it can be overridden with DESTDIR.
  • The scripts, man pages and docs are installed under "/usr/local" by default; this can be overridden with PREFIX. The main scripts are installed in "./sbin" by default; this can be overridden with BINDIR.
  • The core scripts and the plugins are installed on top of PREFIX in "./recap/plugins-available" by default; this can be overridden with LIBDIR.

The following example uses a location common to most distributions; it installs recap under /usr:

$ sudo make PREFIX="/usr" install

This other example installs recap under your home directory while keeping the default PREFIX, i.e. under "~/usr/local":

$ make DESTDIR="~" install
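How the variables combine can be sketched as a small helper. This assumes the conventional DESTDIR + PREFIX + BINDIR layout and the defaults described above; the function name is hypothetical and the Makefile remains the authority on the real paths.

```bash
#!/usr/bin/env bash
# Compose the final path of a script from DESTDIR, PREFIX and BINDIR.
# Usage: recap_script_path <script> [destdir] [prefix] [bindir]
# Assumes the layout "<destdir><prefix>/<bindir>/<script>".
recap_script_path() {
    local script="$1"
    local destdir="${2:-}" prefix="${3:-/usr/local}" bindir="${4:-sbin}"
    # Strip a trailing slash so that a destdir of "/" does not double it.
    printf '%s%s/%s/%s\n' "${destdir%/}" "$prefix" "$bindir" "$script"
}
```

With no overrides the helper yields /usr/local/sbin/recap; passing /usr as the prefix yields /usr/sbin/recap, which is where packages are expected to place the script.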

The Makefile attempts to detect systemd; if it is found, the install target installs the systemd unit files. The install does not enable the timers, but it shows the commands required to enable each of them. When systemd is not detected, the cron jobs are installed instead.
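How the Makefile performs this detection is not described here. One common approach, shown as a sketch only, is the check documented for sd_booted(3): test whether the /run/systemd/system directory exists. The function name and its optional root argument are hypothetical.

```bash
#!/usr/bin/env bash
# Succeed when the system was booted with systemd.
# Uses the directory test described in the sd_booted(3) manual page.
# The optional argument is an alternate root directory, useful for testing.
booted_with_systemd() {
    local root="${1:-}"
    [[ -d "${root}/run/systemd/system" ]]
}

# Usage:
#   if booted_with_systemd; then echo "install unit files"; else echo "install cron job"; fi
```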

It is up to each package distribution to follow its own best practices regarding enabling/disabling the timers on install/removal of the package.

Ansible

An Ansible playbook can be used to install recap from a git repository. The playbook is located in tools, under ansible_recap.yml, and can be used to install recap on Red Hat based and Debian based distros, or to uninstall it by defining the uninstall variable.

Variables

  • repo - The location of the repository, default: https://github.com/rackerlabs/recap.git.
  • ref - The reference to use; this can be a branch, a tag or a commit, default: master.
  • binpath - The value of BINPATH, default: /sbin.
  • destdir - The value of DESTDIR, default: "".
  • prefix - The value of PREFIX, default: /usr.
  • tmp_install_dir - The location where the cloned repo will be placed, default: /tmp/recap.
  • uninstall - When this is defined it will remove recap, default: undefined.
  • enable_plugins - To enable the global plugin configuration, default: false.
  • plugin_list - A list of plugins to enable, from the plugins-available directory, default: all.

Install (default)

Install the stable version of recap:

ansible-playbook tools/ansible_recap.yml

Install the development version of recap:

ansible-playbook tools/ansible_recap.yml -e ref=development

Install branch foo from a different repository:

ansible-playbook tools/ansible_recap.yml -e ref=foo -e repo=https://github.com/bar/recap.git

Install recap with BINPATH in /bin:

ansible-playbook tools/ansible_recap.yml -e binpath=/bin

Install recap with all plugins enabled:

ansible-playbook tools/ansible_recap.yml -e enable_plugins=true

Install recap with a list of plugins to enable:

ansible-playbook tools/ansible_recap.yml \
  -e enable_plugins=true \
  -e '{"plugin_list":[docker_top,redis]}'

Uninstall

Uninstall recap from the default path:

ansible-playbook tools/ansible_recap.yml -e uninstall=yes

Uninstall recap from a custom location:

ansible-playbook tools/ansible_recap.yml -e uninstall=yes -e destdir=/tmp/test

Cron/Timers and Configuration

Timers(systemd)

Multiple unit files are available to make use of timers. These are the default schedules for the recap scripts:

  • recap (default: every 10 minutes)
  • recap-onboot (runs at boot time)
  • recaplog (default: once a day at 1 am)

Enabling timers

Each one of the timers can be enabled with:

sudo systemctl enable recap.timer --now
sudo systemctl enable recaplog.timer --now
sudo systemctl enable recap-onboot.timer --now

Disabling timers

Each one of the timers can be disabled with:

sudo systemctl disable recap.timer --now
sudo systemctl disable recaplog.timer --now
sudo systemctl disable recap-onboot.timer --now

Cron

The cron file (/etc/cron.d/recap) is used to determine the execution time of recap and recaplog. By default the cron execution for recap is enabled to run every 10 min. and recaplog is expected to run every day at 1 am, but those can be adjusted as needed.

Configuration

The following variables are commented out, with their default values, in the configuration file /etc/recap.conf, where they can be overridden.

Settings shared by recap scripts

  • BASEDIR - Directory where the logs are saved.

    Default: BASEDIR="/var/log/recap"

  • LIBDIR - Directory where the libraries/functions are located.

    The default value depends on the PREFIX used when installing, the default PREFIX on the Makefile is /usr/local, then:

    Default: LIBDIR="/usr/local/lib/recap"

    But packages use /usr as the PREFIX, then through a package it is expected to be:

    Default: LIBDIR="/usr/lib/recap"

Settings used only by recaplog

  • LOG_COMPRESS - Enable or disable log compression.

    Default: LOG_COMPRESS=1

  • LOG_EXPIRY - Log files will be deleted after LOG_EXPIRY days

    Default: LOG_EXPIRY=15
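The effect of LOG_EXPIRY can be previewed without deleting anything. The sketch below lists compressed logs older than a given number of days using find's -mtime test; it assumes that expired logs end in .log.gz, and the function name is hypothetical rather than part of recaplog.

```bash
#!/usr/bin/env bash
# List the compressed logs in a directory that are older than a number of days.
# Only prints candidates; deleting them is left to the caller.
expired_logs() {
    local dir="$1" days="$2"
    find "$dir" -maxdepth 1 -type f -name '*.log.gz' -mtime +"$days" -print
}

# Usage:
#   expired_logs /var/log/recap 15
```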

Settings used only by recap

  • MAILTO - Send a report to the email defined.

    Default: MAILTO=""

  • MIN_FREE_SPACE - Minimum free space (in MB) required in ${BASEDIR} to run recap, a value of 0 deactivates this check.

    Default: MIN_FREE_SPACE=0
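A free-space guard of the kind MIN_FREE_SPACE describes could look like the sketch below. It is an assumption about the mechanism, not recap's actual code: it reads the available megabytes from df in POSIX output format and treats 0 as "check disabled". The function name is hypothetical.

```bash
#!/usr/bin/env bash
# Succeed when the filesystem holding a directory has at least min_mb MB free.
# A min_mb of 0 skips the check, matching the documented meaning of MIN_FREE_SPACE=0.
has_min_free_space() {
    local dir="$1" min_mb="$2" free_mb
    (( min_mb == 0 )) && return 0
    # -P keeps each filesystem on one line, -m reports sizes in 1 MB blocks;
    # the fourth column of the second line is the available space.
    free_mb=$(df -Pm "$dir" | awk 'NR == 2 { print $4 }')
    (( free_mb >= min_mb ))
}

# Usage:
#   has_min_free_space /var/log/recap 100 || echo "not enough space, skipping run"
```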

Reports

These are the types of reports generated and their dependencies.

fdisk
  • USEFDISK - Generates logs from "fdisk ${OPTS_FDISK}"

    Default: USEFDISK="no"

mysql
  • USEMYSQL - Generates logs from "mysqladmin status"

    Makes use of DOTMYDOTCNF.

    Required by: USEMYSQLPROCESSLIST, USEINNODB

    Default: USEMYSQL="no"

  • USEMYSQLPROCESSLIST - Generates logs from "mysqladmin processlist"

    Makes use of DOTMYDOTCNF and MYSQL_PROCESS_LIST

    Requires: USEMYSQL

    Default: USEMYSQLPROCESSLIST="no"

  • USEINNODB - Generates logs from "mysql show engine innodb status"

    Makes use of DOTMYDOTCNF

    Requires: USEMYSQL

    Default: USEINNODB="no"

netstat
  • USENETSTAT - Generates network stats from "ss ${OPTS_NETSTAT}"

    Required by: USENETSTATSUM

    Default: USENETSTAT="yes"

  • USENETSTATSUM - Generates logs from "nstat ${OPTS_NETSTAT_SUM}".

    Requires: USENETSTAT

    Default: USENETSTATSUM="no"

ps
  • USEPS - Generates logs from "ps"

    Options can be modified in OPTS_PS

    Default: USEPS="yes"

pstree
  • USEPSTREE - Generates logs from pstree

    Options can be modified in OPTS_PSTREE

    Default: USEPSTREE="no"

resources
  • USERESOURCES - Generates "resources"(uptime, free, vmstat, iostat, iotop) log

    Required by: USEDF, USESLAB, USESAR, USESARQ, USESARR

    Default: USERESOURCES="yes"

  • USEDF - Generates logs from df

    Requires: USERESOURCES

    Options can be modified in OPTS_DF

    Default: USEDF="yes"

  • USESLAB - Generates logs from the slab table.

    Requires: USERESOURCES

    Default: USESLAB="no"

  • USESAR - Generates logs from sar.

    Requires: USERESOURCES

    Default: USESAR="yes"

  • USESARQ - Generates logs from "sar -q" (logs queue length, load data)

    Requires: USERESOURCES

    Default: USESARQ="no"

  • USESARR - Generates logs from "sar -r" (logs memory data)

    Requires: USERESOURCES

    Default: USESARR="no"
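The "Requires" relationships above can be checked before a run. The sketch below warns when a report is enabled while the report it depends on is disabled. Whether recap itself warns or silently skips such reports is not stated here, so this is only an illustrative helper with a hypothetical name.

```bash
#!/usr/bin/env bash
# Print a warning for each enabled report whose required report is disabled.
# Arguments are "CHILD:PARENT" pairs naming USE* variables set in the shell.
check_report_dependencies() {
    local pair child parent status=0
    for pair in "$@"; do
        child="${pair%%:*}"
        parent="${pair##*:}"
        # ${!name} is indirect expansion: the value of the variable called $name.
        if [[ "${!child:-no}" == "yes" && "${!parent:-no}" != "yes" ]]; then
            echo "${child} requires ${parent}=\"yes\""
            status=1
        fi
    done
    return "$status"
}

# Usage, after sourcing the configuration file:
#   check_report_dependencies USEDF:USERESOURCES USESAR:USERESOURCES \
#       USENETSTATSUM:USENETSTAT USEINNODB:USEMYSQL
```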

Options

Options used by the tools generating the reports

  • DOTMYDOTCNF - Defines the path to the mysql client configuration file

    Required by: USEMYSQL, USEMYSQLPROCESSLIST, USEINNODB

    Default: DOTMYDOTCNF="/root/.my.cnf"

  • MYSQL_PROCESS_LIST - Format to display MySQL process list, options are "table" or "vertical".

    Required by: USEMYSQLPROCESSLIST

    Default: MYSQL_PROCESS_LIST="table"

  • OPTS_DF - df options

    Required by: USEDF

    Default: OPTS_DF="-x nfs"

  • OPTS_FDISK - Option used by USEFDISK.

    Required by: USEFDISK

    Default: OPTS_FDISK="-l"

  • OPTS_FREE - free options

    Required by: USEFREE

    Default: OPTS_FREE=""

  • OPTS_IOSTAT - iostat options

    Required by: USERESOURCES

    Default: OPTS_IOSTAT="-t -x 1 3"

  • OPTS_IOTOP - iotop options

    Required by: USERESOURCES

    Default: OPTS_IOTOP="-b -o -t -n 3"

  • OPTS_NETSTAT - ss options

    Required by: USENETSTAT

    Default: OPTS_NETSTAT="-atunp"

  • OPTS_NETSTAT_SUM - nstat options

    Required by: USENETSTATSUM

    Default: OPTS_NETSTAT_SUM="-a"

  • OPTS_PS - ps options

    Required by: USEPS

    Default: OPTS_PS="auxfww"

  • OPTS_PSTREE - pstree options

    Required by: USEPSTREE

    Default: OPTS_PSTREE="-p"

  • OPTS_VMSTAT - vmstat options

    Required by: USERESOURCES

    Default: OPTS_VMSTAT="-S M 1 3"

Plugins

  • USEPLUGINS - Enable/Disable plugins usage.

    Default: USEPLUGINS=no

Plugins are stored in the plugin directory, defined by LIBDIR/plugins-available

Default: /usr/local/lib/recap/plugins-available

Enabling plugins

To enable plugins, the following is required:

  • Set USEPLUGINS="yes" in /etc/recap.conf
  • Symlink plugins-enabled/plugin_name to plugins-available/plugin_name
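The symlink step can be sketched as a small helper. It assumes that plugins-enabled sits next to plugins-available under LIBDIR and uses a relative link; both are assumptions, and the function name is hypothetical.

```bash
#!/usr/bin/env bash
# Enable a plugin by linking it from plugins-available into plugins-enabled.
# libdir is the recap LIBDIR, for example /usr/local/lib/recap.
enable_plugin() {
    local libdir="$1" name="$2"
    if [[ ! -f "${libdir}/plugins-available/${name}" ]]; then
        echo "plugin not found: ${name}" >&2
        return 1
    fi
    mkdir -p "${libdir}/plugins-enabled"
    # A relative target keeps the link valid if LIBDIR is moved as a whole.
    ln -sfn "../plugins-available/${name}" "${libdir}/plugins-enabled/${name}"
}

# Usage:
#   sudo bash -c 'source ./enable_plugin.sh; enable_plugin /usr/local/lib/recap redis'
```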

Naming conventions:

  • Plugin scripts can be named in any way. It's desired that they describe the purpose of the plugin in one word. When multiple words are required use underscores (_). Don't use extensions or dates (e.g. YYYYMMDD) in the plugin name. Some examples:

    • Good names for plugins
      • redis
      • memcache
      • docker_images
      • memcache_13
    • Bad names for plugins
      • johndoe_apache (not very descriptive)
      • myplugin (non explicit)
      • test.sh (non explicit, using extension)
      • recap-plugin (non explicit, using hyphens)
      • Sendmail (CamelCase)
      • redis.bak (extension)
      • ms sql (space between words)
      • reports_20202020 (use of a date)
  • Allowed naming convention for plugin OPTIONS in /etc/recap.conf: PLUGIN_OPTS_<PLUGIN_NAME>_<OPT_NAME>. Some examples:

    • Good plugin option names:
      • PLUGIN_OPTS_MEMCACHE_PROTO
      • PLUGIN_OPTS_AWS_KEY
      • PLUGIN_OPTS_REDIS_PORT
      • PLUGIN_OPTS_DOCKER_HUB_URL
    • Bad plugin option names:
      • plugin_opts_my_plugin (lower case)
      • PLUGIN_OPTS_MY_VARIABLE (lacking plugin reference)
      • PLUGIN_OPTS_DOCKER_port (mixed case)
      • PLUGIN-OPTS-NTP (using hyphens instead of underscores, missing the option)
  • The plugin file/script is expected to contain only functions. recap will call only one function, print_<plugin_name>, where plugin_name must match the name of the file.

  • Optionally, other functions can be defined to create different entries in the log. Those other functions can be controlled by plugin variables (PLUGIN_OPTS_<PLUGIN_NAME>_<OPT_NAME>). Those variables are set in /etc/recap.conf, and the functions are called conditionally from the main plugin function print_<plugin_name>.

  • Any plugin variable defined must have a default value.

  • The plugins are expected to follow some of the practices followed in recap. Please refer to CONTRIBUTING.md

  • A template of a plugin is provided in doc/plugin_template
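Putting the conventions together, a plugin could be shaped like the sketch below. The plugin name example_cache, its option and its output are all hypothetical, and writing to standard output is an assumption about how recap collects the report; the template in doc/plugin_template is the authoritative starting point.

```bash
#!/usr/bin/env bash
# Hypothetical plugin file: plugins-available/example_cache

# Every plugin variable gets a default value, as required above.
PLUGIN_OPTS_EXAMPLE_CACHE_SHOW_DETAILS="${PLUGIN_OPTS_EXAMPLE_CACHE_SHOW_DETAILS:-no}"

# Optional helper, called only when the option is enabled.
print_example_cache_details() {
    echo "example_cache details: (placeholder)"
}

# Entry point: the function name is print_ followed by the file name.
print_example_cache() {
    echo "example_cache status: (placeholder)"
    if [[ "${PLUGIN_OPTS_EXAMPLE_CACHE_SHOW_DETAILS}" == "yes" ]]; then
        print_example_cache_details
    fi
}
```

Setting PLUGIN_OPTS_EXAMPLE_CACHE_SHOW_DETAILS="yes" in /etc/recap.conf would then add the second entry to the report.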

Changelog & Contributions

Information about changes and contributors is documented in the CHANGELOG.md

License

recap is licensed under the GNU General Public License v2.0

recap's People

Contributors

b-harper, bhgraham, buzzboy23, carlwgeorge, codeodor, eljrax, jamesbelchamber, jamrok, jaygoldberg, jsoref, lil-cain, lukehandle, man-chung, rax-rstark, schwing, seanorama, thebwt, thtieig, tonyskapunk

recap's Issues

NOTICE: This repo is moving.

On the morning of Saturday, March 23, 2013, the 'rackspace' organization on GitHub will be reorganized. All repos will be moved to the new 'rackerlabs' organization, except for those that are designed to be used by Rackspace customers and which are fully supported.

Please update any links to this repo to reflect the new location within GitHub. For example, if the link to your repo is 'https://github.com/rackspace/foo', you need to change it to 'https://github.com/rackerlabs/foo'.

recap.conf enables FULLSTATUS by default causing email spam (bad config)

The recap.conf configuration has this enabled/on by default:
https://github.com/rackerlabs/recap/blob/master/src/recap.conf#L54

# Use "service httpd fullstatus"
# Please ensure that the Apache is configured to allow access to the
# http://localhost/server-status URL before enabling this option
USEFULLSTATUS=yes

The internal default for this variable is no:
https://github.com/rackerlabs/recap/blob/master/src/recap#L66

USEFULLSTATUS="no"

This is broken in a few ways:

  1. The code uses 'apachectl fullstatus' not 'service httpd fullstatus' as documented
  2. The default external config does not mirror the internal defaults; all out of the box configs should mirror the application's default behaviour, not alter it
  3. By enabling this, the system keeps running the command when Apache is not installed and/or running and emailing root@localhost by default every single time it runs

The config file should mirror the internal default and be set to =no to work correctly out of the box on a server that does not have Apache installed or does not have it running. As is, it's spamming my root@localhost with the error output every 10 minutes (default cron) when it runs. Thanks!

hardcoded PATH for /etc/cron.d/recap

PR #82 exposed an issue with the src/recap.cron file: it has a hardcoded PATH to the scripts it uses.

Reproducing the problem:

root@docker:/recap# make --dry-run install
echo "Installing scripts..."
install -Dm0755 src/recap /usr/local/sbin/recap
install -Dm0755 src/recaplog /usr/local/sbin/recaplog
install -Dm0755 src/recaptool /usr/local/sbin/recaptool
echo "Installing man pages..."
install -Dm0644 src/recap.5 /usr/local/share/man/man5/recap.5
install -Dm0644 src/recap.8 /usr/local/share/man/man8/recap.8
install -Dm0644 src/recaplog.8 /usr/local/share/man/man8/recaplog.8
install -Dm0644 src/recaptool.8 /usr/local/share/man/man8/recaptool.8
echo "Installing configuration..."
install -Dm0644 src/recap.conf /etc/recap
echo "Installing cron job..."
install -Dm0644 src/recap.cron /etc/cron.d/recap
echo "Installing docs..."
install -Dm0644 CHANGELOG README.md COPYING -t /usr/local/share/doc/recap
echo "Creating log directories..."
install -dm0750 /var/log/recap
install -dm0750 /var/log/recap/backups
install -dm0750 /var/log/recap/snapshots

root@docker:/recap# make install
Installing scripts...
Installing man pages...
Installing configuration...
Installing cron job...
Installing docs...
Creating log directories...

root@docker:/recap# which recap
/usr/local/sbin/recap

root@docker:/recap# whatis recap --debug 2>&1 | grep ^path=
path=/usr/local/share/man
path=/usr/share/man

The only problem is the cron:

root@docker:/recap# grep \/recap /etc/cron.d/recap
@reboot root /usr/sbin/recap -B
#0 2 * * * root /usr/sbin/recap
#0 2,14 * * * root /usr/sbin/recap
#0 2,8,14,20 * * * root /usr/sbin/recap
#0 * * * * root /usr/sbin/recap
#*/30 * * * * root /usr/sbin/recap
*/10 * * * * root /usr/sbin/recap
#*/5 * * * * root /usr/sbin/recap
0 1 * * * root /usr/sbin/recaplog

These entries point to /usr/sbin instead of the location given by PREFIX, which by default is /usr/local. References for this:
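One possible way to address this, sketched here as an assumption rather than the fix the project adopted, is to rewrite the script paths when the cron file is installed. The function name is hypothetical.

```bash
#!/usr/bin/env bash
# Rewrite the hardcoded /usr/sbin script paths in a cron file read from stdin
# so that they point at the directory the scripts were installed into.
# The directory must not contain "|" or "&", which are special to this sed call.
render_cron_paths() {
    local bindir="$1"
    sed "s|/usr/sbin/recap|${bindir}/recap|g"
}

# Usage:
#   render_cron_paths /usr/local/sbin < src/recap.cron > /etc/cron.d/recap
```

Because /usr/sbin/recaplog starts with the same prefix, the single substitution covers both scripts.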

change processor utilisation report; report inaccurate

Hi team!

I'd like to point out that relying on 'ps' for processor utilisation can lead to incorrect conclusions and if used for alert generation, can raise tickets based on outdated data. Why? From the ps man page:

CPU usage is currently expressed as the percentage of time
spent running during the entire lifetime of a process.
This is not ideal, and it does not conform to the standards
that ps otherwise conforms to. CPU usage is unlikely to add
up to exactly 100%.

Looking at processor utilisation via "ps" isn't worthless; it is useful for identifying long-term trends. But consider this: you've had a server that has been up for a few months under a very heavy load, a single server behind a load balancer, and now your budget allows you to add three more nodes (and actually make use of the load balancer!). You add the servers and pool members but have not restarted the java process on the original node. You want to make sure that the load is evenly spread, so you install recap for monitoring and have your ticketing system parse the logs to generate alerts where appropriate. Your ticketing system is now generating alerts for the original node because the CPU utilisation reported by ps is based on many months' worth of 90%+ CPU utilisation, and the average will take weeks to drop to your "warning" threshold.

A better method would be to run something like "top -d 5 -n 2" (naturally additional switches will be required for full output), discarding the first report and capturing the second; this would give a more relevant metric to raise alerts on and eliminate false alerts resulting from a historical average that has been addressed but not yet reflected by ps -- or maybe there is another method which would be less cumbersome.

But, the bottom line is relying on ps can result in false conclusions if you're trying to diagnose an issue, have applied some changes and are trying to figure out why it is still reporting high utilisation (when using ps) and generating alerts.
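The "discard the first report, keep the second" idea can be sketched as a filter over top's batch output. This is only an illustration of the suggestion above, not something recap does; it assumes the procps top format, in which every report starts with a line beginning "top - ". The function name is hypothetical.

```bash
#!/usr/bin/env bash
# Keep only the second report from "top" batch output read on stdin.
# Each report in batch mode starts with a line beginning "top - ".
second_top_report() {
    awk '/^top - / { report++ } report >= 2'
}

# Usage (takes about five seconds, the delay between the two reports):
#   top -b -d 5 -n 2 | second_top_report
```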

Standardize options across scripts

The scripts (recap, recaptool and recaplog) handle options/arguments in different ways. This needs to be standardized, ideally by using getopt instead of manually parsing args or using getopts.

Standardize help/usage

Ensure all the scripts produce the help/usage via CLI in the same way. I'm biased in this respect, as I prefer the use of embedded help in the comments as in recaplog: here and here

Document MAXLOAD option

Tried to run recap on a moderately loaded server, only to get a MAXLOAD related error.

Was able to get it to run by reading the source and setting the MAXLOAD option, but this is not documented.

Please add to the docs :)

Recap does not use custom my.cnf files for mysql process list

The script does not use the custom my.cnf files that are specified in /etc/recap. More precisely, line 328 and 335:


mysql -e "show full processlist\G" >> $MYSQL_FILE

mysqladmin -v processlist >> $MYSQL_FILE

The above throws an error if MySQL root requires a password... I guess the whole block should be enclosed in a 'for cnf in $DOTMYDOTCNF' and use --defaults-extra-file=$cnf, or something similar.
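The loop suggested above might look like the following sketch. It assumes DOTMYDOTCNF holds a space-separated list of client configuration files; MYSQLADMIN is a hypothetical override added only so the function can be exercised without a MySQL server. Another issue in this list proposes --defaults-file in place of --defaults-extra-file, and the same loop applies to either flag.

```bash
#!/usr/bin/env bash
# Run "mysqladmin processlist" once per client config listed in DOTMYDOTCNF.
# MYSQLADMIN is a hypothetical override used to swap the binary when testing.
mysql_processlist_all() {
    local cnf
    # Intentionally unquoted: DOTMYDOTCNF is split on whitespace into file names.
    for cnf in ${DOTMYDOTCNF:-/root/.my.cnf}; do
        # Option-file flags must come first on the mysqladmin command line.
        "${MYSQLADMIN:-mysqladmin}" --defaults-extra-file="$cnf" -v processlist
    done
}

# Usage:
#   DOTMYDOTCNF="/root/.my.cnf /root/.my2.cnf" mysql_processlist_all >> "$MYSQL_FILE"
```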

Problem encountered on a RackSpace cloud server running Ubuntu 14.04 LTS.

750 log directory permissions

I'm working on packaging recap for Fedora/EPEL. Rpmlint complains about the 750 permissions on the log directories.

recap.noarch: E: non-standard-dir-perm /var/log/recap 750
recap.noarch: E: non-standard-dir-perm /var/log/recap/backups 750
recap.noarch: E: non-standard-dir-perm /var/log/recap/snapshots 750

These were set this way in 2012. Are these permissions actually necessary? Is any recap output actually sensitive?

References:

Feature request: top swap consumers

When we get paging tickets there is often only a very brief window where we can capture what is hitting swap - yesterday I sat on an open ticket on a server that was flapping, because I missed the paging event then I finally caught it in action. It would be great if this script could be included in recap/rs-sysmon (you'll recognize this one from one of the Linux one-liners page - not mine):

# echo -e "\n== Top 10 Processes By Swap Usage: =="; ( printf "%s\t%s\t%s\n" "PID" "PROCESS" "SWAP"; (for i in /proc/[0-9]*; do PID=${i#/proc/}; NAME=$(awk '/Name/ {print $2}' ${i}/status 2>/dev/null); SWAP=$(awk '/Swap/ { sum+=$2 }; END { print sprintf("%.2f", sum/1024) "M" }' /${i}/smaps 2>/dev/null);echo ${PID} ${NAME} ${SWAP}; done | awk '!/0.00M/' |sort -grk3,3 | head -10)) | column -t;

It would be even better if runbook automation could capture this in ticket updates in response to alerts - but I suppose that would be an IAW thing?

Fix inconsistency between BACKUP_ITEMS and USE_

Back in issue #94, when working on the documentation and configuration of the settings, it became clear that there is an inconsistency between the list of BACKUP_ITEMS and the USE_ variables (#94 (comment)). These settings need to be more consistent, likely by deprecating BACKUP_ITEMS or by creating it dynamically based on the USE_ variables.

recap log.gz files don't rotate

I have noticed that the recap gzip files don't rotate properly. After looking into it, the regex inside /usr/sbin/recaplog is wrong. Here is an example:

find $BASEDIR -maxdepth 1 -regextype grep -regex "${BASEDIR}/[a-z]*_[0-9]{8}-[0-9]{6}.log.gz" -type f -mtime +$LOG_EXPIRY -exec echo rm {} \; | tee -a $LOGFILE

The file name is the following that gets made when rotated :

netstat_daily_20160602.log.gz

So as you can see there is no - and there is actually another _. If we were rotating just the log file this would have worked, but this is not the case:

ps_20160607-012001.log

As you see it has the single _ and a single - between the dates .

I created a temporary work around on my side by replacing the find regex

FROM:

-regex "${BASEDIR}/[a-z]*_[0-9]{8}-[0-9]{6}.log.gz"

TO:

-regex "${BASEDIR}/[a-z][a-z][0-9]*.log.gz"

Now it actually finds the gzip files and it can rotate them out as per the configuration.
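For comparison, a stricter pattern that accepts both file name shapes shown above can be sketched with bash's extended regular expressions. This is an alternative illustration, not the workaround above and not the project's fix; the function name is hypothetical.

```bash
#!/usr/bin/env bash
# Succeed when a file name looks like a recap log, plain or compressed.
# Accepts both "ps_20160607-012001.log" and "netstat_daily_20160602.log.gz".
is_recap_log_name() {
    # words joined by "_", then "_YYYYMMDD", an optional "-HHMMSS",
    # then ".log" with an optional ".gz"
    local pattern='^[a-z]+(_[a-z]+)*_[0-9]{8}(-[0-9]{6})?\.log(\.gz)?$'
    [[ "$1" =~ $pattern ]]
}

# Usage:
#   for f in /var/log/recap/*; do is_recap_log_name "${f##*/}" && echo "$f"; done
```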

Move daily files in to subfolder

Wanted to suggest a feature enhancement that log.gz files be placed in a subfolder such as:
/var/log/recap/daily

This may make it a bit easier to parse/organize log files under recap that has been running for some time. Happy to make the PR if this seems like a good enhancement to others.

Errors if apachectl is not found

Since Rackspace installed recap on our server, I've received messages from cron about apachectl not being found. I've since had a look at it and edited the configuration to change the USEFULLSTATUS variable, but I think that the script should handle the situation more gracefully under its default configuration to avoid irritating server administrators.

Add log to recap

It would be nice to have recap produce a log; this would help report possible conflicts with configs, deprecated options, missing tools, permission issues, etc.

ls: cannot access fdisk_*.log: No such file or directory

Issue with '-B' option:

# recap -B
ls: cannot access fdisk_*.log: No such file or directory
ls: cannot access mysql_*.log: No such file or directory
ls: cannot access netstat_*.log: No such file or directory
ls: cannot access ps_*.log: No such file or directory
ls: cannot access pstree_*.log: No such file or directory
ls: cannot access resources_*.log: No such file or directory
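These messages are what ls prints when it receives a pattern that matched no files. Assuming that is the cause here, one way to avoid them is to expand the pattern with bash's nullglob option instead of calling ls. The sketch below is an illustration with a hypothetical function name, not the change made in recap.

```bash
#!/usr/bin/env bash
# Print the log files for one report in a directory, or nothing when none exist.
# The function body runs in a subshell, so nullglob does not leak to the caller.
list_report_logs() (
    shopt -s nullglob   # an unmatched pattern expands to nothing
    files=("$1"/"$2"_*.log)
    if (( ${#files[@]} > 0 )); then
        printf '%s\n' "${files[@]}"
    fi
)

# Usage:
#   list_report_logs /var/log/recap fdisk
```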

create branches for dev and stable releases

Request from the Rackspace RPM-dev team is to create branches with tags for dev and stable releases. RPM builds at Rackspace (and any other builds for other packagers) would be pulled from stable releases where a tar would be created.

Fdisk warning about GPT partition

Hi guys

Wanted to find out if you could assist?

Recap has been configured to notify the recipient via email of any errors. One of the errors that is constantly being highlighted is that fdisk sees a GPT partition; every time recap runs, it fires off an email.

Do you have a fix for this? EG:


fdisk -l /dev/sdb

WARNING: GPT (GUID Partition Table) detected on '/dev/sdb'! The util fdisk doesn't support GPT. Use GNU Parted.

Disk /dev/sdb: 91.3 GB, 91268055040 bytes
255 heads, 63 sectors/track, 11096 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Device Boot Start End Blocks Id System
/dev/sdb1 1 11097 89128959+ ee GPT

recap 2>&1 >/dev/null

WARNING: GPT (GUID Partition Table) detected on '/dev/sdb'! The util fdisk doesn't support GPT. Use GNU Parted.


As this is not really an error it would be good if this could be squashed so no email is triggered for this kind of Warning.

Thanks
Erran
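One way to squash only this warning while keeping every other error visible is to filter standard error through grep. The wrapper below is a sketch of that idea with a hypothetical name; it is not how recap handles the warning.

```bash
#!/usr/bin/env bash
# Run a command, dropping the fdisk GPT warning from stderr and keeping the rest.
# Standard output passes through untouched and the command's exit status is returned.
run_without_gpt_warning() {
    local rc
    {
        # Send stderr into the pipe and stdout to the saved descriptor 3.
        "$@" 2>&1 1>&3 3>&- |
            { grep -v 'GPT (GUID Partition Table) detected' || true; } >&2
        rc=${PIPESTATUS[0]}
    } 3>&1
    return "$rc"
}

# Usage:
#   run_without_gpt_warning fdisk -l
```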

reorganize repo

I'll take input from anyone, but specifically I'm looking for @bhgraham, @siso, and @buzzboy23 opinions. What do you think about organizing the files into directories? I created a branch "organize" where you can see the direction I am thinking about. Specifically, move core files into src, and supplementary stuff into util.

Please issue with MySQL

Quick hack:

We’re seeing a somewhat recent issue with Plesk 11 servers where they will refuse to perform their automatic update because of the existence of the /root/.my.cnf file, but rs-sysmon needs this file because it simply runs mysqladmin somewhere around lines 238 and 244. A simple hack is to define -uadmin -p`cat /etc/psa/.psa.shadow`, though a simple Plesk existence check would be best. I’m thinking we could use something like this:

# print the output of "mysqladmin status" to the mysql file
print_mysql() {
        echo "MySQL status" >> "$MYSQL_FILE"
        # .psa.shadow is a regular file (it is read with cat), so test with -f, not -d
        if [ -f /etc/psa/.psa.shadow ]
        then
                mysqladmin -uadmin -p"$(cat /etc/psa/.psa.shadow)" status >> "$MYSQL_FILE"
        else
                mysqladmin status >> "$MYSQL_FILE"
        fi
}

# print the output of "mysqladmin processlist" to the mysql file
print_mysql_procs() {
        echo "MySQL processes" >> "$MYSQL_FILE"
        if [ -f /etc/psa/.psa.shadow ]
        then
                mysqladmin -uadmin -p"$(cat /etc/psa/.psa.shadow)" -v processlist >> "$MYSQL_FILE"
        else
                mysqladmin -v processlist >> "$MYSQL_FILE"
        fi
}

Config file location

Based on the discussion on #79, revisit the location of the configuration file, currently /etc/recap. The discussion in that PR suggests that it might be better to have it under /etc/recap.conf. We can discuss the pros and cons in this issue rather than on the PR.

mysql functions using --defaults-extra-file

Currently, a couple of recap's mysql functions use --defaults-extra-file. As a result, any argument defined later in other sourced locations overrides what is defined in the configuration passed, which fails as the example below demonstrates. This matters mainly when more than one configuration is passed in /etc/recap.

The order in which the config files are sourced is documented here: https://dev.mysql.com/doc/refman/5.7/en/option-files.html

File Name               Purpose
/etc/my.cnf             Global options
/etc/mysql/my.cnf       Global options
SYSCONFDIR/my.cnf       Global options
$MYSQL_HOME/my.cnf      Server-specific options (server only)
defaults-extra-file     The file specified with --defaults-extra-file, if any
~/.my.cnf               User-specific options
~/.mylogin.cnf          User-specific login path options (clients only)

This is an example of how recap would use --defaults-extra-file and fail

  • Config: /root/.my.cnf:
[client]
user=root
password=InvalidPassword
socket=/var/lib/mysql/mysql.sock
  • Config /root/.my2.cnf:
[client]
user=root
password=ValidPassword
socket=/var/lib/mysql/mysql.sock

This is what the mysql functions are doing inside recap:

[root@gnu ~]# mysqladmin --defaults-extra-file=/root/.my2.cnf ping; \
              mysqladmin --defaults-extra-file=/root/.my2.cnf --print-defaults | fold -w 89
mysqladmin: connect to server at 'localhost' failed
error: 'Access denied for user 'root'@'localhost' (using password: YES)'
mysqladmin would have been started with the following arguments:
--user=root --password=ValidPassword --socket=/var/lib/mysql/mysql.sock --user=root 
--password=InvalidPassword --socket=/var/lib/mysql/mysql.sock 

In this example it is clear that what's in /root/.my.cnf (a default config) overrides what's passed to --defaults-extra-file because of the order in which the option files are read, resulting in a failure.

The proposal is to use --defaults-file option instead of --defaults-extra-file. This is the result of that:

[root@gnu ~]# mysqladmin --defaults-file=/root/.my2.cnf ping; \
              mysqladmin --defaults-file=/root/.my2.cnf --print-defaults | fold -w 89
mysqld is alive
mysqladmin would have been started with the following arguments:
--user=root --password=ValidPassword --socket=/var/lib/mysql/mysql.sock 
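
A minimal sketch of how a recap function could build its mysqladmin arguments with --defaults-file. The names MYCNF and build_mysqladmin_args are hypothetical and not taken from recap's source. MySQL requires --defaults-file to be the first option on the command line, so the sketch always adds it before any other argument.

```shell
#!/bin/bash
# MYCNF and build_mysqladmin_args are hypothetical names for illustration.
MYCNF="${MYCNF:-/root/.my.cnf}"

# Print one mysqladmin argument per line. --defaults-file is added first,
# and only when the option file is readable.
build_mysqladmin_args() {
  local cnf="$1"
  shift
  local -a args=()
  if [[ -r "${cnf}" ]]; then
    args+=("--defaults-file=${cnf}")
  fi
  args+=("$@")
  printf '%s\n' "${args[@]}"
}

# Possible usage (not executed here):
#   mapfile -t args < <(build_mysqladmin_args "${MYCNF}" -v processlist)
#   mysqladmin "${args[@]}"
```

With this shape only the named option file is read, so a stale password in another default option file cannot override it.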

need manpage for recaptool

We have man pages for recap and recaplog, but not for recaptool. Can someone familiar with that utility write one?

man page for recaplog not used

The man page for recaplog exists in the repo, but it isn't used by the bash installer or the Makefile (or the spec file).

Convert [ ] to [[ ]]

While reviewing PR #100 I noticed that in about 3 places the old-style test operator [ ] is used; these should be upgraded to the builtin [[ ]] per modern bash4 best practices. Very low priority. Since we're using parameter expansion in variables, we already require bash4.
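
For reference, a small sketch of the difference between the two forms. The variable names are made up for illustration and do not come from recap.

```shell
#!/bin/bash
value="two words"

# Old style: the expansion must be quoted, otherwise it is split into two
# arguments and [ fails with "too many arguments".
if [ "${value}" = "two words" ]; then
  old_style="match"
fi

# New style: [[ ]] is a bash keyword, so the expansion is not word-split,
# and the right-hand side of == is treated as a glob pattern.
if [[ ${value} == two* ]]; then
  new_style="match"
fi

echo "${old_style} ${new_style}"
```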

Apache Fullstatus broken via Curl

Hi there,

It appears that changes were made to remove elinks requirements from spec/rpm:
https://src.fedoraproject.org/rpms/recap/blob/master/f/recap.spec (as not everyone needs Apache fullstatus, I understand the thought process)

And thus, the project switched to using curl instead of apachectl fullstatus (e/links). This is not adequate for reporting Apache server-status though:

Web Status report
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<html><head>
<title>Apache Status</title>
</head><body>
<h1>Apache Server Status for localhost</h1>

<dl><dt>Server Version: Apache/2.2.15 (Unix) DAV/2 PHP/5.4.43 mod_ssl/2.2.15 OpenSSL/1.0.1e-fips mod_perl/2.0.4 Perl/v5.10.1</dt>
<dt>Server Built: Jul 23 2014 07:06:18
</dt></dl><hr /><dl>
<dt>Current Time: Thursday, 12-Oct-2017 16:23:12 CEST</dt>
<dt>Restart Time: Thursday, 12-Oct-2017 14:42:27 CEST</dt>
<dt>Parent Server Generation: 0</dt>
<dt>Server uptime:  1 hour 40 minutes 45 seconds</dt>
<dt>Total accesses: 171637 - Total Traffic: 1.1 GB</dt>
<dt>CPU Usage: u396.22 s220.44 cu0 cs0 - 10.2% CPU load</dt>
<dt>28.4 requests/sec - 194.9 kB/second - 6.9 kB/request</dt>
<dt>72 requests currently being processed, 14 idle workers</dt>
</dl><pre>KKWCCKCCKCKWCKKKK_KC.K....CWKKKKK.._C._._.._.C.K..__C_..W.KK....
K....KWKK...KK..CKWK_CKCW...C.K.CK....K_CKCK.K.CKC.C.__KKW..W..K
.__WWWW.........................................................
................................................................
................................................................
................................................................
................................................................
................................................................
</pre>
<p>Scoreboard Key:<br />
"<b><code>_</code></b>" Waiting for Connection,
"<b><code>S</code></b>" Starting up,
"<b><code>R</code></b>" Reading Request,<br />
"<b><code>W</code></b>" Sending Reply,
"<b><code>K</code></b>" Keepalive (read),
"<b><code>D</code></b>" DNS Lookup,<br />
"<b><code>C</code></b>" Closing connection,
"<b><code>L</code></b>" Logging,
"<b><code>G</code></b>" Gracefully finishing,<br />
"<b><code>I</code></b>" Idle cleanup of worker,
"<b><code>.</code></b>" Open slot with no current process</p>
<p />


<table border="0"><tr><th>Srv</th><th>PID</th><th>Acc</th><th>M</th><th>CPU
</th><th>SS</th><th>Req</th><th>Conn</th><th>Child</th><th>Slot</th><th>Client</th><th>VHost</th><th>Request</th></tr>
[...]

I've highlighted this against the commit as well:
3f4b520#commitcomment-24934576

Further to this, the default status URL here is a bit strange:

#OPTS_STATUSURL="http://localhost:80/"

Traditionally this has been an Apache-specific full status option; it does not seem logical to change this default to point to the homepage of the default VHost. This should instead default to http://localhost:80/server-status.

===

I suggest maybe adding a check for when e/links is present to instead run:
/usr/bin/links -dump http://localhost:80/server-status
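
A rough sketch of such a check, assuming a hypothetical STATUSURL variable and hypothetical function names (none of these are recap's actual identifiers). It prefers a text browser when one is installed and falls back to curl otherwise.

```shell
#!/bin/bash
# STATUSURL is a hypothetical variable; the default is the one proposed above.
STATUSURL="${STATUSURL:-http://localhost:80/server-status}"

# Print the first available text browser, or "curl" as the fallback.
pick_status_tool() {
  local tool
  for tool in links elinks; do
    if command -v "${tool}" >/dev/null 2>&1; then
      echo "${tool}"
      return 0
    fi
  done
  echo "curl"
}

# Dump the status page as plain text when a text browser is available,
# otherwise fetch the raw HTML with curl.
print_status() {
  local tool
  tool="$(pick_status_tool)"
  case "${tool}" in
    links|elinks) "${tool}" -dump "${STATUSURL}" ;;
    curl)         curl -s "${STATUSURL}" ;;
  esac
}
```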

can fill logging partition and kill server

(Moved here from original rs-sysmon tracking with very few edits)

It doesn't happen often, but recap can kill a server. Going on the premise that recap should go to lengths to ensure it's part of the solution, never part of the problem, I consider this a bug.

When configured to collect huge datasets, creating a new one every 5 or 10 minutes can quickly fill the device behind /var/log/recap/.

Example: a database server with both mysql logging options turned on, mysql was coping properly w/ approx. 4,000 active connections, until recap's logfiles, at between .5 and 1 GB each, filled the device and havoc ensued.

Recommendations:
- When recap runs, it should first ensure the device has at least 2x (sum of sizes of all recap logs created last run) free space before executing/writing any logs.
- Log to /var/log/messages (and possibly 'wall' or send email to admin address as well) something like "recap: monitoring pass aborted due to space exhaustion on device. Take steps to clear space immediately!"
- Possibly run sync right before exiting.
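
The first two recommendations could be sketched roughly as follows. The function names are hypothetical, and the size of the previous run is passed in as a number rather than measured, since how recap would track that is not specified here.

```shell
#!/bin/bash
# Hypothetical helper names; LOGDIR matches the path mentioned above.
LOGDIR="${LOGDIR:-/var/log/recap}"

# Free space, in KiB, on the device that holds the given directory.
free_kb() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeeds when free space is at least twice the size of the last run.
enough_space() {
  local free="$1" last_run="$2"
  (( free >= 2 * last_run ))
}

# Abort the pass with a syslog message when space is too low.
check_space() {
  local last_run_kb="$1"
  if ! enough_space "$(free_kb "${LOGDIR}")" "${last_run_kb}"; then
    logger -t recap "monitoring pass aborted due to space exhaustion on device"
    return 1
  fi
}
```

A caller would run check_space with the previous run's total size in KiB before writing any report and skip the pass when it returns non-zero.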

recap installer vs makefile

Having two scripts that produce almost the same outcome seems a duplicated effort. Here are the differences between them:

  1. Recap-installer is more explicit with directory permissions than Makefile:
| Directory                | recap-installer | Makefile |
| /var/log/recap           |            0700 |     0750 |
| /var/log/recap/backups   |            0700 |     0750 |
| /var/log/recap/snapshots |            0700 |     0750 |
  2. The recap-installer places the docs in a version-dependent directory; the Makefile is generic.
  3. The recap-installer does not include the recaptool.8 man page.
  4. The recap-installer gzips the man pages, while the Makefile does not, but for instance the RPM spec takes care of this.
  5. The recap-installer provides support for macosx for the cronjob; the Makefile does not.
  6. The recap-installer looks for the httpd config directory; if found, the recap.httpd.conf file is copied over. This is disabled by default, i.e. commented out.
  7. The Makefile is more flexible than recap-installer on the path to be installed.

MySQL defaults-file location(s)

Per discussion here:
#78 (comment)

We should decide whether we're going to use 1x my.cnf defaults-file, or if we're going to attempt to read multiple configs (and subsequently, potentially report on multiple SQL instances).

Turn off output to stdout?

I just received our first Rackspace server having recap installed so I have no prior experience with it.
I noticed that by default I receive a nightly e-mail at 01:00 with the output from recaplog which I've pinpointed to the last line in /etc/cron.d/recap:

# Pack and clear log files
0 1 * * * root /usr/sbin/recaplog

This e-mail provides no value to me so I'd rather like to shut it off. I noticed someone touching the same subject in #35 but it was closed?

What it comes down to is that you pass logging to tee in recaplog (https://github.com/rackerlabs/recap/blob/master/src/recaplog#L62) which outputs to stdout as well as the log file.

I'm aware I could just > /dev/null in the cron file, but a future upgrade could just as well overwrite this again, so maybe it'd be better to be able to pass a configuration directive to not print log messages to stdout?
Or maybe you have another idea of how to go about this?

Many thanks!
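
One possible shape for such a directive, as a sketch only. LOG_TO_STDOUT and LOG_FILE are hypothetical names, not existing recap settings, and the log function below is not recaplog's actual implementation.

```shell
#!/bin/bash
# LOG_TO_STDOUT and LOG_FILE are hypothetical names used for illustration.
LOG_TO_STDOUT="${LOG_TO_STDOUT:-yes}"
LOG_FILE="${LOG_FILE:-/tmp/recaplog-example.log}"

# Append the message to the log file; echo it to stdout as well only
# when LOG_TO_STDOUT is "yes".
log() {
  if [[ "${LOG_TO_STDOUT}" == "yes" ]]; then
    echo "$*" | tee -a "${LOG_FILE}"
  else
    echo "$*" >> "${LOG_FILE}"
  fi
}
```

With LOG_TO_STDOUT set to anything other than "yes", cron would have no output to mail, while the log file would still be written.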

Increase default MAXLOAD

Hey there,

I see in #100 we made changes to the default MAXLOAD variable:

NEW: MAXLOAD - 10*cpu(dynamic by default, was 1000)
OLD: MAXLOAD="1000"

This had previously been set over in 4c03a4c with the logic that we'd rather have logging in the event of an issue and further crash than massive empty gaps where we are left guessing what might have been going on.

While it is agreeable that recap and the tools it triggers will contribute to the load, the "black box" style information of why we crashed is the preferred default. I've spoken with Man Chung and he still believes that this value is too low, though he might chime in with more words.
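
To make the two defaults concrete, here is a sketch of the dynamic calculation (10 * CPU count) with an override. The function names are hypothetical and the comparison is an assumption about how a load gate could work, not recap's actual code.

```shell
#!/bin/bash
# Number of online CPUs; falls back to 1 if it cannot be determined.
cpu_count() {
  local n
  n="$(getconf _NPROCESSORS_ONLN 2>/dev/null)"
  [[ "${n}" =~ ^[1-9][0-9]*$ ]] || n=1
  echo "${n}"
}

# MAXLOAD defaults to 10 * CPUs unless it is already set (e.g. MAXLOAD=1000).
MAXLOAD="${MAXLOAD:-$(( 10 * $(cpu_count) ))}"

# Succeeds when the given load average is below the given maximum.
# awk is used because load averages are not integers.
load_ok() {
  local load="$1" max="$2"
  awk -v l="${load}" -v m="${max}" 'BEGIN { exit !(l < m) }'
}

# Possible usage on Linux (not executed here):
#   load_ok "$(cut -d' ' -f1 /proc/loadavg)" "${MAXLOAD}" || exit 0
```

On a 4-CPU server the dynamic default is 40, so reports stop well before the old value of 1000 would have been reached; setting MAXLOAD=1000 restores the previous behaviour.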

recap is "touching" rs-sysmon package files - should recap obsolete rs-sysmon

Noticed the %post script makes some changes if rs-sysmon files are found on the system where recap is being installed. Based on those I submitted #58, but since there is no interaction between the recap and rs-sysmon RPMs, I think this is a good opportunity to decide whether or not recap should declare Obsoletes: rs-sysmon.

IMO this seems like a good use case for Obsoletes, as recap is a rename of rs-sysmon.

Comments?

Package recap as .rpm and .deb files

Please package recap as .rpm and .deb files for simpler install and management on the server side. It is a great tool, but it is not acceptable to clone live code down from github on production machines.

I'm happy to help with the SPEC files and package creation, but wanted to log this issue so that it is visible and we can discuss and collaborate.
