nicwaller / loglang
Build your own event processor in Go using this module.
Lumberjack (implemented in elastic/go-lumber) is a protocol developed by Elastic.
It is the preferred protocol for moving events between Logstash instances, and apparently Elastic Beats use Lumberjack v1 or v2.
Spec (v1; 2016): https://github.com/elastic/logstash-forwarder/blob/master/PROTOCOL.md
What kind of I/O?
Performance measurements:
Capacity:
Testing with E2E acknowledgement both enabled and disabled.
It may be possible to expose some of these performance numbers through the /metrics endpoint.
How to decide how many events should be written in a batch: to a file, to an S3 object, to Elasticsearch, in a single HTTP POST request, etc.
It would be great to be interoperable with these AWS systems:
But this probably deserves a separate repo entirely.
Compare the current event with a previous matching event.
Identify the matching event by matching on a field. If matching on multiple fields is desired, use the fingerprint filter first.
Copy the previous event into @previous.
Could this be used to calculate a "high water mark" and send alerts when that is exceeded?
Accept GET and POST requests.
Use the Accept header to enable compression.
Use Content-Type hinting to automatically select framing and codec.
GitHub API → https://docs.github.com/en/rest/overview/authenticating-to-the-rest-api?apiVersion=2022-11-28
It would be really cool to "tune in" and get an immediate live firehose of all events flowing out of the pipeline.
Doing this with a WebSocket would also be cool.
There's the primary filter chain, and a filter chain for each input.
Each output should also have a filter chain so it's possible to customize for that output.
Would be really cool to support WebSockets so that a browser can "tune in" to a realtime firehose of events. The browser should be able to provide a filter that is executed on the server.
Listen for HTTP requests, and reply with recent events.
Two modes:
Should probably allow the client to provide a filter. If filtering, then more modes are useful:
This can be useful for constructing a simple web interface that shows a not-quite-realtime view of logs.
If paired with a pipeline that does no filtering, this could also be useful as a way to peek at the recent past and see what the raw events looked like.
Some inputs don't run continuously; they run on a schedule or on demand or once at startup.
For example, an input that periodically polls an HTTP API. Probably want to use a channel to trigger the input. Then the channel could be fed by a recurring cron schedule.
Note: it is impossible to run a task every 14 days using cron, so other types of recurrence schedules should be supportable.
Or even more interesting, loglang could provide an API that allows on-demand triggering of scheduled inputs. For example, a pipeline that reads from a dead letter queue would only be triggered on-demand.
Heartbeat should use this approach too.
Why not turn this into an RSS reader?
A fingerprint filter combined with a unique filter (perhaps using Bloom filters?) would allow identifying new events, which could be very useful for alerting.
DSV (Delimiter-Separated Values) is mostly known as CSV and TSV for commas and tabs respectively.
Rows of tabular data can be interpreted as events by combining the header with the values. But because the header is stored outside the values, this complicates the framing pattern.
The Prometheus metrics format looks like this:
http_request_duration_seconds_bucket{le="1"} 133988
http_request_duration_seconds_bucket{le="+Inf"} 144320
http_request_duration_seconds_sum 53423
http_request_duration_seconds_count 144320
In combination with an HTTP fetch input type, this could be used to generate events from Prometheus-capable endpoints.
It should be possible to read events from process standard input,
and loglang should exit with status 0 when standard input is closed.
No E2E acknowledgement is needed.
Remember to populate ECS schema fields like host.name.
This should be easy.
Try to respect a batching strategy, while respecting that the maximum UDP payload over IPv4 is 65,507 bytes (65,535 minus 20 bytes of IP header and 8 bytes of UDP header).
This should be easy to implement.
exec() a local process.
Use cases:
Maybe events should have a separate store of metadata that doesn't get sent by outputs.
Logstash uses the @metadata field, but it would be fine to have a separate field in the Event struct.
If doing RELP input #24 then should do output as well.
Git is super interesting! New branches, new commits, new tags can all be interpreted as events. The reflog (reference log) will probably be important here.
This could be very interesting:
RELP (the Reliable Event Logging Protocol) was proposed by Rainer Gerhards, the lead developer of rsyslog, in 2008.
Compared to plain syslog, RELP allows the receiver to send acknowledgements confirming that each message was received.
Specification: https://github.com/rsyslog/librelp/blob/master/doc/relp.html
Mailing List: https://lists.adiscon.net/mailman/listinfo/relp (requires membership)
Implementation: https://github.com/rsyslog/librelp
Pipelining is a key feature (the client can send multiple requests without waiting for the first response). Responses must be sent by the server in the exact same order as the commands were received.
Version 1.1 adds support for TLS using STARTTLS.
If the kv style can be customized with other delimiters, then it should be compatible with LTSV (labelled tab-separated values).
STOMP (Streaming Text Orientated Messaging Protocol) provides an interoperable wire format so that STOMP clients can communicate with any STOMP message broker to provide easy and widespread messaging interoperability among many languages, platforms and brokers.
https://archivedocs.graylog.org/en/latest/pages/gelf.html
probably needs to be a dedicated input type
Elasticsearch is strict about field types within a given index. If you try to add two documents to the same index, like this:
{"status": 200}
{"status": "OK"}
Elasticsearch will refuse to index the second document, and if you're using Logstash that failure is silent. 😱
There are several things that Loglang could do to prepare output for Elasticsearch:
Redis is very cool and it would be great to support it.
But not as part of the core suite; we should use an existing Redis module for this.
Unix domain sockets cannot be read like regular files, so a special input plugin is needed.
Populate host.name, file.path (even though it's not really a file), and network.transport = uds (unix domain socket).
Unix sockets are reliable. If the reader doesn't read, the writer blocks. If the socket is a datagram socket, each write is paired with a read. If the socket is a stream socket, the kernel may buffer some bytes between the writer and the reader, but when the buffer is full, the writer will block. Data is never discarded, except for buffered data if the reader closes the connection before reading the buffer.
Unix domain sockets are used by traditional syslog and systemd. Supporting socket input would enable direct replacement of rsyslogd.
The GNU C Library provides functions to submit messages to Syslog. They do it by writing to the /dev/log socket. See Submitting Syslog Messages.
Source: https://www.gnu.org/software/libc/manual/html_node/Overview-of-Syslog.html
~ $ ls -lac /dev/log /run/systemd/journal/dev-log
lrwxrwxrwx 1 root root 28 Dec 22 2022 /dev/log -> /run/systemd/journal/dev-log
srw-rw-rw- 1 root root 0 Dec 22 2022 /run/systemd/journal/dev-log
Apparently Docker also uses unix sockets.
Run netstat -a -p --unix to see all unix sockets on the local system.
To support the ECS schema field event.original, it might be worth storing the original bytes (after framing, before codec) and providing an option to automatically include that on each output.