
brick's Introduction

About me

Role: Systems Administrator

Experience
  • support: troubleshooting, training, documentation
  • proxies & web servers: Squid, Apache, Nginx, HAProxy, IIS
  • mail servers: Postfix, Dovecot, Roundcube, DKIM, Postgrey
  • config/change management: Subversion, Git, Ansible
  • containers: Docker, LXD
  • virtualization: VMware, Hyper-V, VirtualBox
  • databases: MySQL/MariaDB, PostgreSQL, Microsoft SQL Server
  • monitoring: Nagios, custom tooling, Microsoft Teams, fail2ban
  • logging: rsyslog (local, central receivers), Graylog
  • ticketing: Redmine, GitHub, GitLab, Service Now

Role: Intermediate developer

Experience
  • current:
    • Go, Python, PowerShell, shell scripting
    • MySQL/MariaDB, SQLite
    • Docker, LXD
    • Markdown, Textile, MediaWiki, reStructuredText, HTML, CSS
    • Redmine, GitHub (including GitHub Actions), Gitea, GitLab
  • past: batch files (don't laugh, it gets the job done), Perl
  • academic: C, C++

brick's People

Contributors

  • atc0005
  • dependabot[bot]


Forkers

  • dli-klin

brick's Issues

Splunk Webhook request format

Per the Splunk Enterprise Alerting Manual, this is the format of the received POST request:

{
	"result": {
		"sourcetype" : "mongod",
		"count" : "8"
	},
	"sid" : "scheduler_admin_search_W2_at_14232356_132",
	"results_link" : "http://web.example.local:8000/app/search/@go?sid=scheduler_admin_search_W2_at_14232356_132",
	"search_name" : null,
	"owner" : "admin",
	"app" : "search"
}

Webhook data payload

The webhook POST request's JSON data payload includes the following details.

  • Search ID or SID for the saved search that triggered the alert
  • Link to search results
  • Search owner and app
  • First result row from the triggering search results
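For reference, here is a minimal sketch of decoding that documented payload in Go. The struct name and field types are assumptions for illustration, not the application's actual types:

package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// SplunkAlertPayload models the webhook POST body documented above.
type SplunkAlertPayload struct {
	Result      map[string]string `json:"result"`
	SID         string            `json:"sid"`
	ResultsLink string            `json:"results_link"`
	SearchName  *string           `json:"search_name"` // may be null
	Owner       string            `json:"owner"`
	App         string            `json:"app"`
}

func main() {
	body := []byte(`{"result":{"sourcetype":"mongod","count":"8"},"sid":"scheduler_admin_search_W2_at_14232356_132","search_name":null,"owner":"admin","app":"search"}`)

	var payload SplunkAlertPayload
	if err := json.Unmarshal(body, &payload); err != nil {
		log.Fatal(err)
	}
	fmt.Println(payload.SID, payload.Result["count"])
}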

Review and update Getter methods (related logic/settings) to provide appropriate defaults

As an example, here is how the Config.DisabledUsersFile() method is defined:

// DisabledUsersFile returns the user-provided path to the EZproxy include
// file where this application should write disabled user accounts or the
// default value if not provided. CLI flag values take precedence if provided.
func (c Config) DisabledUsersFile() string {

	switch {
	case c.cliConfig.DisabledUsers.File != nil:
		return *c.cliConfig.DisabledUsers.File
	case c.fileConfig.DisabledUsers.File != nil:
		return *c.fileConfig.DisabledUsers.File
	default:
		// FIXME: During development the default is set to a fixed/temporary
		// path. Before MVP deployment the defaults should be changed to empty
		// strings?
		return defaultDisabledUsersFile
	}
}

and here is how the defaultDisabledUsersFile constant is defined (spacing tweaked):

defaultDisabledUsersFile    string = "/var/cache/brick/users.brick-disabled.txt"

and here is how it is checked inside of Config.validate():

if c.DisabledUsersFile() == "" {
	return fmt.Errorf("path to disabled users file not provided")
}

If the sysadmin provides no value for this configuration setting, they end up with /var/cache/brick/users.brick-disabled.txt as the default; otherwise, whatever they specify is the value returned by the Getter method. Lastly, if they supply a blank value it will either get "caught" by the flag or config-file handling packages, or the Config.validate() method will catch it and complain.

Is this the desired behavior? If so, what about other defaultXYZ constants which supply values when omitted by the sysadmin?

Worth emphasizing: this is already the documented behavior as of v0.1.1 (https://github.com/atc0005/brick/blob/v0.1.1/docs/configure.md).

Synchronize help text between README, struct tags and struct doc comments

While preparing for an initial release I found some inconsistency between them (ignoring the type name added to doc comments). Instead of focusing on that now (trying to get this app "out the door"), I'm creating this issue as a TODO item for later refactoring/cleanup efforts.

Example: disabled user and reported user permissions field details.

README | Check build output paths

While "borrowing" from the existing README for a new project (really need to take the time to setup a template repo), I noticed that the paths referenced don't appear to be correct:

1. Copy the newly compiled binary from the applicable path below and deploy
   using the instructions provided in our [deployment doc](deploy.md).
   - if using `Makefile`: look in `/tmp/release_assets/brick/`
   - if using `go build`: look in `/tmp/brick/`

The last sub-bullet appears to be correct based on that section's guidance to build from the /tmp directory, since the brick subdirectory will be the base path after cloning this repo. The first sub-bullet, however, unintentionally omits the initial brick subdirectory.

Likely fix:

-   - if using `Makefile`: look in `/tmp/release_assets/brick/`
+   - if using `Makefile`: look in `/tmp/brick/release_assets/brick/`

Feature: Add support for including Redmine metadata in email notifications

Once support for email notifications is added, extend it to cover optional metadata/custom fields:

  • Splunk alert id
  • Tracker (e.g., "issue type", perhaps "EZproxy Abuse Report")
  • Project
  • Username (custom field for blocked user)

and whatever else would be beneficial to have automatically appended to the notifications.

Add endpoint(s) for listing disabled user accounts

Add additional endpoints that can be used to list disabled user accounts:

  • list disabled user accounts
    • user accounts disabled by this application (users.brick-disabled.txt)
    • user accounts disabled manually (e.g., users.disabled.txt)
  • list log messages associated with disabled user accounts
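A minimal sketch of what a read-only listing endpoint could look like (route, handler name, and file path are placeholders, not settled design):

package main

import (
	"fmt"
	"log"
	"net/http"
	"os"
)

// disabledUsersListHandler returns the raw contents of a disabled users
// flat-file; a real implementation would likely filter comments and merge
// multiple sources.
func disabledUsersListHandler(path string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		data, err := os.ReadFile(path)
		if err != nil {
			http.Error(w, "unable to read disabled users file", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/plain; charset=utf-8")
		fmt.Fprintf(w, "%s", data)
	}
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/v1/users/disabled",
		disabledUsersListHandler("/var/cache/brick/users.brick-disabled.txt"))
	log.Fatal(http.ListenAndServe("localhost:8000", mux))
}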

Move embedded template strings to external files

Current templates:

const disabledUsersFileTemplateText string = `
# User "{{ .User }}" from source IP "{{ .IP }}" disabled at "{{ .Time }}" per alert "{{ .AlertName }}" (SearchID: "{{ .SearchID }}")
{{ .User }}::deny
`

// This is a standard message and only indicates that a report was received,
// not that a user was blocked. This message should be followed by another
// message indicating whether the user was blocked or ignored
const reportedUserLogFileTemplateText string = `
{{ .ArrivalTime }} [REPORTED] User "{{ .User }}" from source IP "{{ .IP }}" reported via alert "{{ .AlertName }}" (SearchID: "{{ .SearchID }}")`

// NOTE: time.RFC3339 format is used for fail2ban parsing reliability
const disabledUserLogFileTemplateText string = `
{{ .ArrivalTime }} [DISABLED] User "{{ .User }}" from source IP "{{ .IP }}" disabled due to alert "{{ .AlertName }}" (SearchID: "{{ .SearchID }}")`

// NOTE: time.RFC3339 format is used for fail2ban parsing reliability
// NOTE: This same template is used for ignored users based on presence in the
// ignored users list and the ignored IP Addresses list.
const ignoredUserLogFileTemplateText string = `
{{ .ArrivalTime }} [IGNORED] User "{{ .User }}" from source IP "{{ .IP }}" ignored per entry in "{{ .IgnoredUsersFile }}" (SearchID: "{{ .SearchID }}")`

Since this application doesn't really care too much about the generated file formats (aside from an assumption that the disabled users file has one username per line), sysadmins may benefit from being able to adjust the output format without recompiling and redeploying the application. This could allow local adjustments to better match the expectations of fail2ban (or other tools).
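A rough sketch of one way this could work, assuming an optional per-template file path in the config with the embedded string kept as a fallback (function name and template path are illustrative):

package main

import (
	"os"
	"text/template"
)

// embedded default, reused from above
const disabledUsersFileTemplateText string = `
# User "{{ .User }}" from source IP "{{ .IP }}" disabled at "{{ .Time }}" per alert "{{ .AlertName }}" (SearchID: "{{ .SearchID }}")
{{ .User }}::deny
`

// loadDisabledUsersTemplate prefers a sysadmin-supplied template file and
// falls back to the embedded default when no file is provided or present.
func loadDisabledUsersTemplate(path string) (*template.Template, error) {
	if path != "" {
		if _, err := os.Stat(path); err == nil {
			return template.ParseFiles(path)
		}
	}
	return template.New("disabledUsers").Parse(disabledUsersFileTemplateText)
}

func main() {
	tmpl, err := loadDisabledUsersTemplate("/usr/local/etc/brick/templates/disabled-users.tmpl")
	if err != nil {
		panic(err)
	}
	_ = tmpl
}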

Research using "ezproxy kill" subcommand in place of fail2ban for terminating active sessions

While working on the initial v0.1.0 support for terminating user sessions (which relies upon fail2ban for terminating active sessions), I stumbled across a GitHub project that uses a set of Perl scripts and a crontab entry to parse EZproxy logs and take a set of actions.

Those actions include (there could be others, I didn't dig too deep):

  • disable user account
  • terminate active sessions for the disabled user account

The block_user.pl script has this block:

if ($block_session) {
	system("/opt/ezproxy/ezproxy kill $block_session");
}

As noted in greater detail in comment #13 (comment), the subcommand still exists and works in our test environment:

$ sudo ./ezproxy kill
Session must be specified

$ sudo  grep -E '^S ' ezproxy.hst
S SESSION_ID_HERE REDACTED

$ sudo ./ezproxy kill SESSION_ID_HERE
Session SESSION_ID_HERE terminated

If this subcommand isn't deprecated, using it could simplify the requirements for this project considerably and make it much safer to operate.
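If we go this route, the call could be as simple as shelling out to the subcommand from Go; a minimal sketch (binary path and error handling are assumptions):

package main

import (
	"fmt"
	"log"
	"os/exec"
)

// killSession invokes the EZproxy "kill" subcommand to terminate a single
// active session.
func killSession(ezproxyBin, sessionID string) error {
	output, err := exec.Command(ezproxyBin, "kill", sessionID).CombinedOutput()
	if err != nil {
		return fmt.Errorf("failed to terminate session %q: %v (%s)", sessionID, err, output)
	}
	return nil
}

func main() {
	if err := killSession("/opt/ezproxy/ezproxy", "SESSION_ID_HERE"); err != nil {
		log.Println(err)
	}
}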

References:

Decide on a project name

Currently we're using ezproxy-blockuser as the shortname with atc0005/ezproxy-blockuser as the fully-qualified project name.

Before flipping this repo public I plan to:

  1. Have something worth sharing (working prototype)
  2. Add any required and/or suggested trademark notices to make clear that this isn't an official project of the EZproxy developers (OCLC) and that there is no association between this project and their services/products (aside from complementing EZproxy server usage)

While there is definitely a strong benefit from associating this application's purpose with EZproxy, this application stands to have a much broader use than that one application. It is probably worth going with a more generic name, both to avoid any unintentional association (with its obvious legal ramifications) and to avoid the mind-share "lock-in" that could result from having the vendor's product in this project's name.

As an example of the potential generic nature of this app, let's go with the assumption that we use templates for generating the output flat-file. Using templates, we could have one template for the EZproxy format, another template for some other format, and so on. This is on top of the support planned for #9, where a user can be added to an LDAP group (which any number of apps could use for their logic/decision making).

The end result is that we achieve our original goal of automatically blocking problematic EZproxy user accounts, while also having a project name that doesn't lock us into only that one function.

Consider removing the option to ignore lookup errors (or at least change the scope of the option)

Currently the ignore-lookup-errors flag (and associated config file setting) allows ignoring lookup errors for:

  • disabled user accounts
  • ignored usernames
  • ignored IP Addresses

The latter two make some sense, but are still questionable as the resulting behavior would allow for disabling the associated account and terminating (if enabled) the associated user sessions.

The first seems to blindly assume that the user account has not been disabled yet and will attempt to disable it again.

This "feature" needs further consideration and potential removal to avoid unexpected behavior.

Feature: Add support for "blocking" users by adding them to a user group

EZproxy supports a variety of authentication sources, including flat files and LDAP. You are able to use group membership in LDAP (and presumably other sources) to determine final access, either to the system as a whole or to specific resources.

The primary goal is to add specified user accounts to a flat-file (the general use case that should work for everyone), but automatically adding user accounts to an LDAP (or Active Directory) group would provide a central mechanism for blocking/unblocking accounts. This could be useful for a variety of reasons.
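As a rough illustration only (not a settled design), adding a user to a group via the github.com/go-ldap/ldap/v3 package might look something like this; the DNs, attribute name, and credentials are placeholders:

package main

import (
	"log"

	"github.com/go-ldap/ldap/v3"
)

// addUserToBlockedGroup adds a user's DN to a "blocked" group so that
// EZproxy (or any other LDAP-aware application) can deny access based on
// group membership.
func addUserToBlockedGroup(conn *ldap.Conn, groupDN, userDN string) error {
	modify := ldap.NewModifyRequest(groupDN, nil)
	modify.Add("member", []string{userDN})
	return conn.Modify(modify)
}

func main() {
	conn, err := ldap.DialURL("ldaps://ldap.example.com:636")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	if err := conn.Bind("cn=brick,ou=service,dc=example,dc=com", "changeme"); err != nil {
		log.Fatal(err)
	}

	err = addUserToBlockedGroup(conn,
		"cn=ezproxy-blocked,ou=groups,dc=example,dc=com",
		"uid=abc0001,ou=people,dc=example,dc=com")
	if err != nil {
		log.Fatal(err)
	}
}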

Add support for specifying notification settings via flags, config file

As I'm pulling in existing notifications support from the atc0005/bounce project, I'm finding that a large chunk of constants are configured for settings that are probably going to end up being site-specific. For the early release I'm fine with them remaining constants (e.g., "let's just get it working" MVP mindset), but I'd like to loop back and change the current constants to a set of defaults and expose the settings via config flags and file. Due to the sheer number of them they may require a few new config file sections.

Tangent: I'm not yet sure whether exposing 1:1 flags and config file settings is manageable over the long haul, so some of these may end up as config file "tunables" only.

Add retry support for writing disabled user entries, reported user entries

As of this writing both targets are flat-files, but at some point they could be another medium entirely (e.g., LDAP groups in the case of disabled user entries). At present the attempt is made just once, but for reliability the attempt should probably occur at least twice before logging the failure and moving on.

As I'm drafting a stub logrotate config, there is also an unaccounted-for potential race condition between logrotate rotating the file and brick attempting to log a user disable action.
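A minimal sketch of the kind of retry wrapper I have in mind (attempt count, delay, and file path are placeholder values):

package main

import (
	"fmt"
	"log"
	"os"
	"time"
)

// appendWithRetry retries a write a fixed number of times with a short
// pause between attempts before giving up and reporting the failure.
func appendWithRetry(attempts int, delay time.Duration, write func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = write(); err == nil {
			return nil
		}
		time.Sleep(delay)
	}
	return fmt.Errorf("all %d write attempts failed: %w", attempts, err)
}

func main() {
	err := appendWithRetry(2, 500*time.Millisecond, func() error {
		f, err := os.OpenFile("/var/cache/brick/users.brick-disabled.txt",
			os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
		if err != nil {
			return err
		}
		defer f.Close()
		_, err = f.WriteString("abc0001::deny\n")
		return err
	})
	if err != nil {
		log.Println(err)
	}
}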

Deploy doc lists several of the file creation steps twice

https://github.com/atc0005/brick/blob/v0.1.1/docs/deploy.md#setup-output-directories

This section is intended to cover only setting up directories, but it seems I failed to properly clean up copy/paste work and left behind steps for pre-creating log files, which are repeated in the section below that is specific to pre-creating log files.

Here is one example (out of several):

  • sudo touch /var/log/brick/users.brick-reported.log
    • pre-creating this file satisfies fail2ban requirements that the file exist before the associated jail is activated

Add a debug endpoint

This endpoint would take no action, but instead echoes back a parsed version of a received payload. My recent thinking was that we could exclusively use the atc0005/bounce application for this purpose, but that shows everything received rather than the final data structure generated from parsing a received payload.

I'm not sure yet whether this endpoint would need to be protected by a different means than GH-18 is intended to provide.
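A minimal sketch of such an endpoint (route and handler name are placeholders, and the generic map stands in for whatever struct the application actually decodes alerts into); it parses the payload and echoes the result back without acting on it:

package main

import (
	"encoding/json"
	"log"
	"net/http"
)

// echoHandler parses the received payload and returns the resulting data
// structure to the caller without taking any action.
func echoHandler(w http.ResponseWriter, r *http.Request) {
	var payload map[string]interface{}
	if err := json.NewDecoder(r.Body).Decode(&payload); err != nil {
		http.Error(w, "unable to parse payload: "+err.Error(), http.StatusBadRequest)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	if err := json.NewEncoder(w).Encode(payload); err != nil {
		log.Printf("failed to write response: %v", err)
	}
}

func main() {
	http.HandleFunc("/api/v1/echo", echoHandler)
	log.Fatal(http.ListenAndServe("localhost:8000", nil))
}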

Add support for geolocation lookups

v0.1.x supplied fail2ban configuration files which provide support for blocking and reporting offending IPs. The report contains geolocation information to help identify where a request is coming from. The assumption is that multiple requests will occur, one per IP Address associated with potential abuse.

We should consider providing this same information directly, either using the same binary that fail2ban uses for the reports or another Go-based library.
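For example, using a MaxMind-format database via the github.com/oschwald/geoip2-golang library (library choice, database path, and example IP are assumptions, not a decision):

package main

import (
	"fmt"
	"log"
	"net"

	"github.com/oschwald/geoip2-golang"
)

func main() {
	db, err := geoip2.Open("/usr/share/GeoIP/GeoLite2-City.mmdb")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// look up the offending IP and report country/city names
	record, err := db.City(net.ParseIP("192.0.2.10"))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("country: %s, city: %s\n",
		record.Country.Names["en"], record.City.Names["en"])
}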

How can we programmatically terminate a session?

Blocking a user account can be done by adding the account to a specific flat-file referenced by EZproxy, but historically we've had to log in to the admin panel to terminate a session. Is there an API we can use? If we can't kill existing sessions, perhaps we can match against an IP Address and block that instead?

References:

config.example.toml has invalid example settings (commented out)

For example:

[ignoredusers]

# Fully-qualified path to a list of user accounts that should not be banned
# by this application. Lines beginning with a '#' character are ignored.
# Leading and trailing whitespace per line is ignored.
# ignored_users_file = "/home/ubuntu/users.brick-ignored.txt"
# file_path = "/tmp/users.brick-ignored.txt"
file_path = "/usr/local/etc/brick/users.brick-ignored.txt"

While ignored_users_file was once a valid config setting, it is invalid for the current version of the application. Because we're attempting to offer multiple example paths, the fix is to replace the invalid setting name with the correct one: file_path. There is at least one more invalid setting like this one in the config file that I spotted, though there may be others.
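Assuming only the setting name is at fault, the corrected block would look something like this:

[ignoredusers]

# Fully-qualified path to a list of user accounts that should not be banned
# by this application. Lines beginning with a '#' character are ignored.
# Leading and trailing whitespace per line is ignored.
# file_path = "/home/ubuntu/users.brick-ignored.txt"
# file_path = "/tmp/users.brick-ignored.txt"
file_path = "/usr/local/etc/brick/users.brick-ignored.txt"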

Refine (Teams) notifications template

Overview

As of v0.1.2 (and the incoming changes for v0.2.0), the Teams notification template "works", but feels a little repetitive.

As an example, I'm looking over the Teams notifications generated during a May 22nd demo that I gave, and I can see that every pair has a good deal of repetition. I believe this was by design: receiving either the initial notification ("Disable user account request received") or the follow-up would quickly tell you what you needed to know right away (top-most details), while containing the same information in the body so that you wouldn't have to go locate the first notification for additional information.

Section breakdown

Main section

  • Title
    • this is intended to allow the tech to quickly/easily glance at the incoming notification and disregard or read further if the event stands out
  • Text
    • this is the main text field for the notification; our most important point to convey needs to be here and formatted in a prominent manner

Disable User Request Annotations

  • this varies depending on the type of event
  • examples
    • Note: disable request received
    • Note: Ignoring disable request from "127.0.0.1:44958" for user "abc0001" from IP "10.10.10.10" due to presence in "/usr/local/etc/brick/ips.brick-ignored.txt" file.
    • Note: Ignoring disable request from "127.0.0.1:44954" for user "abc0009" from IP "9.9.9.9" due to presence in "/usr/local/etc/brick/users.brick-ignored.txt" file.
    • Note: Disabled user "abc0009" from IP "4.4.4.4"
    • Note: Received disable request from "127.0.0.1:44948" for user "abc0009" from IP "9.9.9.9"; username is already disabled

Disable User Request Details

  • Username
  • User IP
  • Alert/Search Name
  • Alert/Search ID

Alert Request Summary

  • Received at
  • Endpoint path
  • HTTP Method
  • Alert Sender IP

Alert Request Headers

  • varies
  • Accept
  • Content-Type
  • Content-Length
  • Expect
  • User-Agent

message trailer/branding

Example:

Message generated by brick (x.y.z) at 2020-05-22T09:25:19-05:00

Problem/pain points

The item which appears most poorly labeled is the Text field of the Disable User Request Annotations section. The content is good, but the leading 'Note:' bit seems out of place.

The main Text field for the message card is repetitive. For example, notifications 1/2 and 2/2 both have the same line:

Disable request received for user abc0001 at IP 192.168.2.3 by alert TEST - Webhook - Echo

This summary is useful, but it may be worth rethinking how it can be made more dynamic based on the event decision/outcome. The second notification of that pair contains that line, and just below it, in the Disable User Request Annotations section, this line is present:

Note: Disabled user "abc0001" from IP "192.168.2.3"

That seems more useful to have in the main message card Text field.

Implement support for Graylog JSON payloads

https://docs.graylog.org/en/3.0/pages/streams/alerts.html#notifications

The first thought I have is using a new endpoint for Graylog alert payloads in order to keep them separate from Splunk alert payloads. Not sure whether this is the "best" approach, but it seems like it would be faster to implement.

The other approach that comes to mind is some sort of automatic payload detection process. Perhaps concurrently run an Unmarshal attempt against the received payload using all known payload formats, and whichever attempt succeeds first is what we use.

That sounds like it would work fine for a system designed to take non-destructive action only (e.g., trigger a warning alert and nothing else), but where we are looking to take automatic action on an alert it seems like taking action on an assumption would be a very bad idea.

Separate endpoints are probably best for now, with refactoring done later to unify them if a safe way to do so is found/proven.
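For completeness, here is a rough sketch of the detection idea (the payload shapes are illustrative stand-ins, not the products' full documented formats). Strict decoding is what keeps one format from silently matching the other, and the concerns above about acting on a guess still apply:

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// splunkPayload and graylogPayload are illustrative stand-ins for structs
// mirroring each product's documented webhook format.
type splunkPayload struct {
	SID         string            `json:"sid"`
	ResultsLink string            `json:"results_link"`
	Result      map[string]string `json:"result"`
}

type graylogPayload struct {
	EventDefinitionID string `json:"event_definition_id"`
}

// detectPayload attempts a strict decode against each known format and
// reports which one matched.
func detectPayload(body []byte) (string, error) {
	var s splunkPayload
	dec := json.NewDecoder(bytes.NewReader(body))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&s); err == nil && s.SID != "" {
		return "splunk", nil
	}

	var g graylogPayload
	dec = json.NewDecoder(bytes.NewReader(body))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&g); err == nil && g.EventDefinitionID != "" {
		return "graylog", nil
	}

	return "", fmt.Errorf("payload does not match any known alert format")
}

func main() {
	body := []byte(`{"sid":"abc123","results_link":"http://splunk.example.local","result":{"count":"8"}}`)
	format, err := detectPayload(body)
	fmt.Println(format, err)
}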

Add support for ignoring a list of IP Addresses

See also GH-15. The idea here is the same, but based on source IP Addresses instead of usernames. Depending on library support, the initial implementation might be based solely on individual IP Addresses and not ranges.
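A minimal sketch of the flat-file check under that assumption (the file layout mirrors the other ignore files: one IP per line, '#' comments and blank lines skipped; the path is an example):

package main

import (
	"bufio"
	"fmt"
	"log"
	"net"
	"os"
	"strings"
)

// isIgnoredIP reports whether the candidate IP appears in the flat-file
// list of individual (non-range) IP Addresses. CIDR range support could be
// layered on later via net.ParseCIDR.
func isIgnoredIP(path string, candidate net.IP) (bool, error) {
	f, err := os.Open(path)
	if err != nil {
		return false, err
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		if ip := net.ParseIP(line); ip != nil && ip.Equal(candidate) {
			return true, nil
		}
	}
	return false, scanner.Err()
}

func main() {
	ignored, err := isIgnoredIP("/usr/local/etc/brick/ips.brick-ignored.txt",
		net.ParseIP("192.0.2.10"))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("ignored:", ignored)
}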

Can Splunk provide multiple users per alert, or just one?

According to the Splunk Enterprise Alerting Manual:

Webhook data payload

The webhook POST request's JSON data payload includes the following details.

  • Search ID or SID for the saved search that triggered the alert
  • Link to search results
  • Search owner and app
  • First result row from the triggering search results

If multiple users are eligible to be blocked we'll only have one row and likely only one username to go off of. Alternatively, we'd have to use some sort of custom search results (I'm pretty ignorant of Splunk, so not sure if it's possible) that return multiple usernames.

If we can only receive one user per alert, then I guess we would have to just have the alerts trigger more often (say every 10-15 minutes?) in order to work around that. In the end, this might not be a bad thing as it could help control the damage of erroneous actions on the part of the web app.

Update disabled user entries templating process to force case-folding

Currently the case is preserved for the submitted username. This likely does not affect the "is this username already disabled" logic (case-folding is forced during that comparison), but preserving the case makes for a non-uniform look (less important) and introduces potential confusion for any tech trying to troubleshoot login issues (more important).
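The change itself would be small; a sketch of the normalization step (the function name is illustrative, not the application's actual helper):

package main

import (
	"fmt"
	"strings"
)

// normalizeUsername forces case-folding (and trims whitespace) before the
// value is handed to the disabled users template, so generated entries
// have a uniform appearance regardless of how the username was submitted.
func normalizeUsername(user string) string {
	return strings.ToLower(strings.TrimSpace(user))
}

func main() {
	fmt.Println(normalizeUsername("  ABC0001 ")) // "abc0001"
}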

Design decisions that should be well documented

What follows is a miscellaneous list of design decisions that should be emphasized in a meaningful way. Many of these points are items I found myself repeating out loud when I struggled to decide how to handle a specific logic path.


  • Errors encountered while checking the ignored Usernames or IP Addresses files result in the disable request being ignored

  • placeholder

Review endpoint response handling, ensure that specific/appropriate content type is set

The application may be unintentionally relying upon automatic detection behavior by the built-in http server. Based on what I've recently heard/read this is not best practice and can lead to various problems.

In our case it's probably not a critical issue as we're only returning status codes and some error text strings (which the built-in http server sets to a text/plain content type), but it's worth handling in a proper manner.
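A sketch of the intended pattern (the helper name is illustrative): set an explicit Content-Type before writing the status code, rather than relying on the server's content sniffing:

package main

import (
	"fmt"
	"net/http"
)

// writeError sets an explicit Content-Type header before writing the
// status code and error text.
func writeError(w http.ResponseWriter, msg string, statusCode int) {
	w.Header().Set("Content-Type", "text/plain; charset=utf-8")
	w.WriteHeader(statusCode)
	fmt.Fprintln(w, msg)
}

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		writeError(w, "404 page not found", http.StatusNotFound)
	})
	_ = http.ListenAndServe("localhost:8000", nil)
}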

Document test Splunk alert recipe

Not sure if Splunk or Graylog support remote queries. Having this as a primary feature might not be within the final scope of this project, but while actively developing the initial version of this application it could be useful to trigger these queries remotely.

Update X of Y notification step labeling to calculate total steps based on enabled features

As of this writing this only applies to Teams notifications, but it's worth thinking about in a larger scope.

Snippet from a switch statement responsible for setting a message card Title field:

case events.ActionSuccessDisableRequestReceived, events.ActionFailureDisableRequestReceived:
	msgCardTitle = msgCardTitlePrefix + "[step 1 of 3] " + record.Action

case events.ActionSuccessDisabledUsername, events.ActionFailureDisabledUsername:
	msgCardTitle = msgCardTitlePrefix + "[step 2 of 3] " + record.Action

case events.ActionSuccessDuplicatedUsername, events.ActionFailureDuplicatedUsername:
	msgCardTitle = msgCardTitlePrefix + "[step 2 of 3] " + record.Action

case events.ActionSuccessIgnoredUsername, events.ActionFailureIgnoredUsername:
	msgCardTitle = msgCardTitlePrefix + "[step 2 of 3] " + record.Action

case events.ActionSuccessIgnoredIPAddress, events.ActionFailureIgnoredIPAddress:
	msgCardTitle = msgCardTitlePrefix + "[step 2 of 3] " + record.Action

case events.ActionSuccessTerminatedUserSession, events.ActionFailureTerminatedUserSession:
	msgCardTitle = msgCardTitlePrefix + "[step 3 of 3] " + record.Action

Ideally the "of 3" bit wouldn't be static and would instead be determined solely by the list of enabled features. As of this issue, this mostly pertains to session termination support, which can be either enabled or disabled. Additionally, this section is never reached if the reported username (or associated IP Address) is in an "ignored" file (and that file is referenced by the config settings).
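A rough sketch of deriving the total from enabled features (function and parameter names are placeholders, not the application's actual config methods):

package main

import "fmt"

// totalSteps derives the step count from which features are enabled
// instead of hard-coding "of 3".
func totalSteps(sessionTerminationEnabled bool) int {
	steps := 2 // request received + disable/ignore/duplicate decision
	if sessionTerminationEnabled {
		steps++
	}
	return steps
}

func stepLabel(step int, sessionTerminationEnabled bool) string {
	return fmt.Sprintf("[step %d of %d] ", step, totalSteps(sessionTerminationEnabled))
}

func main() {
	fmt.Println(stepLabel(2, false)) // "[step 2 of 2] "
	fmt.Println(stepLabel(2, true))  // "[step 2 of 3] "
}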

Is it even possible to submit an empty string as a config file/CLI-provided value?

While preparing to duplicate an existing validation check like this one, the thought hit me: maybe it's not necessary to guard against empty strings as an input value from a user? It feels "wrong", but perhaps the config packages already have this responsibility?

// Verify that the user did not opt to set an empty string as the value,
// otherwise we fail the config validation by returning an error.
if c.IsSetIgnoredIPAddressesFile() && c.IgnoredIPAddressesFile() == "" {
    return fmt.Errorf("empty path to ignored ip addresses file provided")
}

For now I'm going ahead with duplicating the validation check, but it would be good to learn what the best practice for this is and apply it here (and elsewhere).

Invalid debug text claims case folding has occurred when it hasn't

brick/files/process.go

Lines 302 to 304 in c38c2e5

currentLine = strings.TrimSpace(currentLine)
log.Debugf("Line %d from %q after lowercasing and whitespace removal: %q\n",
	lineno, haystack, currentLine)

This text came from an earlier version of the code which did fold the case, likely for later comparison purposes. Need to update or remove that statement so it doesn't (further) confuse readers (particularly a later iteration of myself).

Add terms doc

Goals:

  • List out common phrases/names used throughout the codebase
  • Have one location that serves as a common reference/look-up point

Ex:

  • ActiveFile vs ActiveUsersFile
  • AuditFile vs AuditLogFile
  • ActionLog vs EventLog vs ReportedUsersLog

Add support for ignoring a list of user accounts

Add support for ignoring a list of user accounts. This would provide a "safe" list so that listed accounts are not subject to block behavior by this application. This could include support team members (e.g., EZproxy stanza maintainers), organizational leadership, etc.

This would initially take the form of a flat-file, but later modifications could include reading from an LDAP/AD group, etc.

This feature will likely be a requirement for the initial launch.
