
flodgatt's Introduction

Flóðgátt


A blazingly fast drop-in replacement for the Mastodon streaming API server.

Current status: This server is currently a work in progress. However, it is now testable and, if configured properly, would theoretically be usable in production—though production use is not advisable until we have completed further testing. I would greatly appreciate any testing, bug reports, or other feedback you could provide.

Installation

Starting from version 0.3, Flóðgátt can be installed on Linux using the pre-built binaries released on GitHub. Simply download the binary (extracting it if necessary), make it executable (chmod +x), and run it. Note that you will likely need to configure the Postgres connection before Flodgatt can successfully connect.

Configuration Examples

If you are running Mastodon with its standard Development settings, then you should be able to run flodgatt without any configuration. You will, of course, need to ensure that the Node streaming server is not running at the same time as Flodgatt: if you normally run the development servers with foreman start, edit the Procfile.dev file to remove the line that starts the Node server. To run flodgatt with a production instance of Mastodon, ensure that the mastodon-streaming systemd service is not running.

You will likely wish to set the environment variable RUST_LOG=warn to enable warning-level log output.

If you are running Mastodon with its standard Production settings and connect to Postgres with the Ident authentication method, then you can use the following procedure to launch Flodgatt.

  • Change to the user that satisfies the Ident requirement (typically "mastodon" with default settings). For example: su mastodon
  • Use environmental variables to set the user, database, and host names. For example: DB_NAME="mastodon_production" DB_USER="mastodon" DB_HOST="/var/run/postgresql" RUST_LOG=warn flodgatt

If you have any difficulty connecting, note that, when run with RUST_LOG=warn, Flodgatt will print both the environment variables it received and the parsed configuration variables it generated from them. You can use this information to debug the connection.

Flóðgátt is tested against the default Mastodon nginx config and treats that as the known-good configuration.

Advanced Configuration

The streaming server will eventually use the same environment variables as the rest of Mastodon, and currently uses a subset of those variables. Supported variables are listed in /src/config.rs. You can provide any supported environment variable to Flóðgátt at runtime or through a .env file.

Note that the default values for the Postgres connection do not correspond to those typically used in production. Thus, you will need to configure the connection with either environment variables or a .env file if you intend to connect Flóðgátt to a production database.

If you set the SOCKET environment variable, you must set the nginx proxy_pass directive to the same socket (prefixing the file path with http://unix:).

Additionally, note that connecting Flóðgátt to Postgres with the ident method requires running Flóðgátt as the user who owns the mastodon database (typically mastodon).

Building from source

Building from source requires the Rust toolchain. Clone this repository and run cargo build (to build the server) or cargo build --release (to build the server with release optimizations).

Running the built server

You can run the server with cargo run. Alternatively, if you built the server using cargo build or cargo build --release, you can run the executable produced in the target/debug or target/release folder.

Building documentation

Build documentation with cargo doc --open, which will generate the HTML docs and open them in your browser. Please consult those docs for a detailed description of the code structure and organization. The documentation also contains additional notes about data flow and configuration options.

Testing

You can run basic unit tests with cargo test.

Manual testing

Once the streaming server is running, you can also test it manually. You can test it using a browser connected to the relevant Mastodon development server. Or you can test the SSE endpoints with curl, Postman, or any other HTTP client. Similarly, you can test the WebSocket endpoints with websocat or any other WebSocket client.

Memory/CPU usage

Note that memory usage is higher when running the development version of the streaming server (the one generated with cargo run or cargo build). If you are interested in measuring RAM or CPU usage, you should likely run cargo build --release and test the release version of the executable.

Load testing

I have not yet found a good way to test the streaming server under load. I have experimented with using artillery or other load-testing utilities. However, every utility I am familiar with or have found is built around either HTTP requests or WebSocket connections in which the client sends messages. I have not found a good solution to test receiving SSEs or WebSocket connections where the client does not transmit data after establishing the connection. If you are aware of a good way to do load testing, please let me know.

Contributing

Issues and pull requests are welcome. Flóðgátt is governed by the same Code of Conduct as Mastodon as a whole.

flodgatt's People

Contributors

codesections · gargron · sphinxc0re


flodgatt's Issues

Improve error handling

Flodgatt currently panics on certain fatal errors (e.g., when Postgres isn't running). Since these errors are fatal, ending the process with an error message is appropriate. However, we should display the error message with slightly better formatting than panic provides and clarify to the user that this failure condition was expected (rather than an unanticipated bug).
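A minimal sketch of the direction (the helper name and messages are hypothetical): print a formatted message to stderr and exit with a non-zero status instead of panicking.

use std::fmt::Display;
use std::process;

// Hypothetical helper: report an expected fatal error (e.g., Postgres is
// unreachable) in a readable format rather than a panic backtrace.
fn die(msg: impl Display) -> ! {
    eprintln!("FATAL: {}", msg);
    eprintln!("(This failure condition was expected; it is not a bug in Flodgatt.)");
    process::exit(1);
}

// Usage sketch: connect_to_postgres().unwrap_or_else(|e| die(e))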

Stream multiplexing (API change)

A planned development in upstream Mastodon is to multiplex multiple streams through a single connection. For example, if a user has 10 columns with different hashtags in their web UI, that's currently 10 separate websocket connections. That's a waste of resources. Instead, the dynamic nature of websockets could be used to subscribe and unsubscribe from specific streams. The return format would then need to be enhanced to say which stream a given message belongs to.
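As a sketch of what the enhanced return format might look like (field names illustrative, not a settled API), each frame could carry the identifier of the stream it belongs to:

use serde::Serialize;

// Illustrative only: tagging each event with its stream lets one websocket
// connection carry all of a user's subscriptions.
#[derive(Serialize)]
struct MultiplexedEvent {
    stream: String,              // e.g., "hashtag:rust"
    event: String,               // e.g., "update"
    payload: serde_json::Value,  // unchanged Mastodon payload
}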

Out-of-order posts on the federated timeline

Currently, boosts are displaying on the federated timeline, which is incorrect behavior. This is leading to the appearance of out-of-order toots on the federated timeline (because the UI does not indicate that the post is a boost, the post looks like a toot from hours ago just now showing up).

(I have not yet determined whether this impacts the local timeline as well.)

Postgres connection incorrectly using prepared statements

The Postgres connection is not currently coded to use prepared statements (that is, it doesn't use the prepare method). However, it apparently still uses them under the hood at some point. Prepared statements do not work with PgBouncer, which leads to errors such as:

thread 'tokio-runtime-worker-6' panicked at
'Hard-coded query will return Some([0 or more rows]): Error {
    kind: Db, cause: Some(DbError {
        severity: "ERROR",
        parsed_severity: Some(Error),
        code: SqlState("26000"),
        message: "prepared statement \"s145\" does not exist",
        detail: None,
        hint: None,
        position: None,
        where_: None,
        schema: None,
        table: None,
        column: None,
        datatype: None,
        constraint: None,
        file: Some("prepare.c"),
        line: Some(505),
        routine: Some("FetchPreparedStatement") }) }', 
src/parse_client_request/postgres.rs:33:26
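One possible workaround (a sketch, assuming a version of the postgres crate that exposes simple_query; not yet tested against PgBouncer): route the hard-coded queries through the simple query protocol, which never creates a named server-side prepared statement.

use postgres::{Client, Error};

// The simple query protocol sends plain text, so there is no "s145"-style
// statement for PgBouncer to lose between pooled connections. Parameters
// must be inlined by hand, which is tolerable for hard-coded queries.
fn ping_db(client: &mut Client) -> Result<(), Error> {
    client.simple_query("SELECT 1")?;
    Ok(())
}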

Time-series logging

Neither Flodgatt nor the JavaScript server it's replacing currently has a great story for per-message logging on websockets. We can log that information, and do in certain test scenarios; however, the number of concurrent websocket connections in production quickly makes that sort of logging impractical. Plus, what we would really like to track are averages across many websocket connections, not the numbers for one particular connection.

Fortunately, solving both of these problems is exactly what statsd and time-series databases are designed for. Post-1.0.0, we should consider adding support for logging in a compatible format.

Add auto ping to websockets implementation

The Node streaming server sends a ping via WebSockets every 30 seconds, which keeps the connection alive. Flodgatt currently does not—it listens for pings from the client and would respond to any pings with pongs. However, the clients do not send pings (they also wait and respond with pongs).

Accordingly, we should update Flodgatt to send pings. However, warp does not currently support sending pings through its built-in websocket implementation. I have opened an issue and will see what I can do to get that feature added.

If we can't add it to Warp, there is probably a work-around (most obviously, we could send an empty binary message as a ping; we could also re-implement the WS support if it comes to that). But it seems worth seeing what the response from upstream is first.
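As a sketch of the empty-binary-message workaround (assuming the tokio 0.1 timer and the warp 0.1 ws::Message API; the wiring is illustrative), we could merge a 30-second tick stream into the outgoing message stream:

use futures::Stream;
use std::time::{Duration, Instant};
use tokio::timer::Interval;

// Each tick becomes an empty Binary frame, which clients should treat as a
// no-op but which keeps intermediate proxies from closing the connection.
fn with_pings(
    msgs: impl Stream<Item = warp::ws::Message, Error = ()>,
) -> impl Stream<Item = warp::ws::Message, Error = ()> {
    let pings = Interval::new(Instant::now(), Duration::from_secs(30))
        .map(|_| warp::ws::Message::binary(Vec::new()))
        .map_err(|_| ());
    msgs.select(pings)
}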

Roadmap to v1.0

After #12, the streaming server's Server-Sent Events functionality is feature-complete. This means that we're getting fairly close to full functionality. I still need to implement the following features (in rough order):

  • Add unit tests. Warp has strong support for unit testing, so this should not take that long.
  • Read development/production configuration variables in place of current hard-coded defaults. This also shouldn't take all that long.
  • Add WebSocket support. This is the point where I'm most uncertain. Warp also has good support for WebSockets, so I think/hope that adding WS support might be as easy as adding several filters and then hooking them into the existing streams. On the other hand, it is possible that it will require setting up separate streams; I'll need to dig into the docs/examples a bit more to be sure. So this could be fairly fast or could be a bit more time consuming.

Once I've added these features, we should be feature-complete and ready for integration testing with Mastodon proper. I am very much hoping to get to this point at or before May 10 because I will be traveling May 10–22 and will have limited availability. My goal is to have the streaming server testable before I leave.

After the server is feature complete, we'll have the following tasks:

  • Perform integration tests with Mastodon
  • Benchmark the server's performance both in terms of RAM usage and ability to handle concurrent connections. @Gargron, I would be interested to hear how you benchmarked the existing streaming server. Most of the load-testing tools I'm familiar with test time to complete a connection and thus, since SSE connections don't complete, don't show very useful data for either the Node or Rust streaming server. I believe there's better support for load-testing WebSocket connections, though I haven't looked into that as much.
  • Make any changes necessary to improve performance or address issues found during testing.
  • Finalize documentation for public release

Given all of the above and my travel in mid-May, it seems like a release window of late May/early June is a realistic target.

Health check API is missing

localhost:4000/api/v1/streaming/health

This endpoint is used by docker-compose and my external monitoring tool.
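For reference, a sketch of what the missing route might look like with Warp filters (the handler body is illustrative):

use warp::Filter;

fn main() {
    // GET /api/v1/streaming/health -> 200 with a trivial body, which is all
    // that docker-compose health checks and external monitors need to probe.
    let health = warp::path!("api" / "v1" / "streaming" / "health").map(|| "OK");
    warp::serve(health).run(([127, 0, 0, 1], 4000));
}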

Add benchmarks

In support of #32, it would be helpful to add in some benchmarks to determine where our highest CPU usage is coming from.

Handle `send_error`s gracefully

Flodgatt currently panics on send errors, which was appropriate during early development (to fail fast and discover errors quickly). However, we should now add more robust error handling to gracefully handle send errors without crashing.
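A sketch of the direction (names hypothetical): treat a failed send as a normal client disconnect, log it quietly, and let the caller drop the subscription.

use futures::sync::mpsc::UnboundedSender;

// Forward a message; a send error almost always means the client went away,
// so report that to the caller instead of panicking.
fn forward(tx: &UnboundedSender<String>, msg: String) -> bool {
    match tx.unbounded_send(msg) {
        Ok(()) => true,
        Err(_) => {
            log::debug!("client disconnected; dropping its subscription");
            false // caller removes this client from its list
        }
    }
}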

Language is not in list `{"en"}` on 0.6.0

INFO flodgatt::redis_to_client_stream::message > Language no is not in list {"en"}
INFO flodgatt::redis_to_client_stream::message > Language ja is not in list {"en"}

Add additional configuration variables

We should add configuration options to match the configuration options in tootsuite/mastodon, as described in the documentation.

Specifically, we should add the following variables:

PostgreSQL

  • DB_HOST
  • DB_USER
  • DB_NAME
  • DB_PASS
  • DB_PORT
  • DATABASE_URL
  • DB_SSLMODE

Redis

  • REDIS_HOST
  • REDIS_PORT
  • REDIS_URL
  • REDIS_DB
  • REDIS_NAMESPACE
  • REDIS_PASSWORD

Deployment

  • PORT
  • SOCKET
  • BIND
  • RUST_ENV / NODE_ENV
  • RUST_LOG_LEVEL
  • TRUSTED_PROXY_IP

Are there any others that we should also add?

[EDIT: We should also remove .env from version control (though we could keep an example config). When doing so, it might also make sense to switch from pretty_env_logger to env_logger, which has essentially the same API and would let us set the default log level without an env variable.]
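For illustration, a sketch of how variables like these might be read (the defaults shown are illustrative, not Mastodon's authoritative values):

use std::env;

// Read a variable, falling back to a default when it is unset.
fn env_or(key: &str, default: &str) -> String {
    env::var(key).unwrap_or_else(|_| default.to_string())
}

fn main() {
    let db_host = env_or("DB_HOST", "localhost");
    let db_port: u16 = env_or("DB_PORT", "5432").parse().expect("DB_PORT must be a number");
    let redis_host = env_or("REDIS_HOST", "127.0.0.1");
    println!("postgres at {}:{}; redis at {}", db_host, db_port, redis_host);
}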

Fix `{}` keepalives

Since #78, Flodgatt has sent a keepalive websocket message consisting of {} to inform the client that the connection is still alive. This is perfectly functional, but it is slightly ugly when debugging: when looking at the message view of a websocket connection, the actual messages are mixed in with many pings, as shown below.

[screenshot: websocket message pane with {} keepalives interleaved among the real messages]

This is not much of an issue, even when debugging: it only shows up when you've selected the message pane of the relevant websocket request. This means it doesn't clutter the console or interfere with debugging any non-flodgatt features. On the other hand, it will be nice to get rid of the clutter when we can.

Fortunately, the issue has been fixed upstream in Warp 0.2.2. Unfortunately, we're still on the last 0.1.x build of Warp. Once we upgrade and deal with the breaking changes (#75), we should be able to solve this issue.

Add SSL support for Postgres

After #36, Flodgatt now supports an env variable that specifies whether to connect to Postgres via SSL. However, it does not yet support connecting to Postgres with SSL without manual configuration. We should add that support.

Upgrade dependencies

Flodgatt is currently using some older versions of its dependencies, as specified in its Cargo.lock file, and does not currently build with more recent versions of those dependencies.

This is not a huge issue—since Flodgatt is a binary application rather than a library, it is appropriate to commit the Cargo.lock file to git and for everyone (including TravisCI) to build Flodgatt with the exact versions specified in the lockfile.

Nevertheless, incompatibilities with more up-to-date versions of dependencies prevent us from updating those dependencies and mean that we can't benefit from future upstream performance or security fixes.

Refactor tests as pure unit tests

To support continuous integration testing with TravisCI, we should have pure unit tests that do not expect a functioning Postgres database.

This means adding in some mocks for the database. I believe double is probably the best mocking library to use with flodgatt, but it might be worth investigating others, and I'd be open to other views.
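Whichever library we pick, the precondition is the same: put the Postgres calls behind a trait so tests can substitute a canned implementation. A hand-rolled sketch (trait and method names hypothetical; a mocking library would generate something similar):

// Hypothetical trait extracted from the Postgres code so unit tests can run
// without a live database.
trait UserRepo {
    fn user_id_for_token(&self, token: &str) -> Option<i64>;
}

// Canned double for tests.
struct FakeRepo;
impl UserRepo for FakeRepo {
    fn user_id_for_token(&self, token: &str) -> Option<i64> {
        if token == "valid" { Some(1) } else { None }
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn rejects_unknown_token() {
        assert_eq!(FakeRepo.user_id_for_token("bogus"), None);
    }
}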

Handle messages with non-conformant payloads

Flodgatt currently (since #108) type-checks the payloads of incoming messages and drops any that do not conform to the Mastodon API as described in its documentation. This provides several advantages, but it also means that changes to the API would require patching Flodgatt and that any undocumented changes to the API (or messages that do not conform to the documented API) would be dropped even if the client could handle them.

After discussing the issue in the developer Discord, we concluded that it would be better to send these messages on if they contain enough data for us to handle them. We may need to do so by not type-checking much of the payload at all, or we may be able to type-check it and log a warning for non-conformant messages. Logging a warning would be nice – at least we'd know that there's an issue. But I want to ensure that there's no significant runtime cost to doing so.

If we do have the ability to log warnings for non-conformant messages, we can provide the ability to turn that feature off at compile time for users who are not interested in that sort of logging – a compile-time option would avoid even tiny runtime costs.
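A sketch of the lenient direction (the feature name and the strict type are hypothetical): keep the payload as raw JSON so it is forwarded verbatim, and only attempt the strict parse when the warning feature is compiled in.

use serde::Deserialize;
use serde_json::value::RawValue;

// Only `event` is type-checked; the payload passes through untouched, so
// undocumented or non-conformant payloads still reach clients that can
// handle them.
#[derive(Deserialize)]
struct LenientMsg<'a> {
    event: String,
    #[serde(borrow)]
    payload: &'a RawValue,
}

// Hypothetical compile-time switch: strict checking only for users who want
// the warnings, so the common path pays no runtime cost.
#[cfg(feature = "warn-on-nonconformant")]
#[derive(Deserialize)]
struct StrictStatus {
    id: String,
    content: String,
    // ...remaining documented fields
}

#[cfg(feature = "warn-on-nonconformant")]
fn warn_if_nonconformant(payload: &RawValue) {
    if serde_json::from_str::<StrictStatus>(payload.get()).is_err() {
        log::warn!("payload does not match the documented API: {}", payload.get());
    }
}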

Improve end-user documentation/deployment

Before releasing version 1.0.0, we should ensure that we have good documentation on how sysadmins can deploy and configure Flodgatt. I believe we'll need:

  • A revised README
  • cargo doc generated documentation on all runtime options (this could live in the README, but keeping the docs with the code is best practice and will help ensure that there isn't any doc drift).
  • Updates to the Mastodon docs on the streaming API and on installing Mastodon.
  • A docker-compose file (maybe we can just use the one in mastodon/mastodon#13286?)
  • A systemd unit file.
  • An announcement blog post

For the blog post, I'm not sure if it makes sense to post something to the Mastodon blog; if not, I'll at least post something to my personal blog. Either way, it would be great to have some before-and-after numbers to show a nice comparison. The Internet loves a good benchmark and a good rewrite-it-in-Rust story. Since this is both, it might be able to get Mastodon another burst of attention.

Address high CPU usage

Flodgatt is currently using far more CPU than it should. It seems like most of the high CPU usage is coming from a single process, which indicates that it's the Receiver that's responsible for the slowdown.

I have a few ideas that might help:

  • Reduce the number of reads/writes in each poll. Most notably, we're currently tracking the time of each poll, which requires checking the current time, and then writing it to the data structure. We could do that less frequently or, maybe, not at all.
  • Eliminate logging of the MsgQueue's length. This is an O(n) operation and happens pretty frequently (and isn't even displayed by default). It was mostly in for debugging purposes, but I think it has served its function. (Other logging might be in a similar category.)
  • Eliminate the use of regex when polling Redis. Regexes are convenient, but we know enough about the shape of the data we get that we can probably do the same thing with less computational expense (see the sketch below).
  • Consider not parsing the JSON at all, just passing it along as text. This gives up some type safety and convenience, but would also lower computational expense.

Or, of course, there's also the option of reducing polling times until we reduce the CPU used sufficiently, but that's something of a brute-force last resort.

(Adding in a connection pool (#31) might also help reduce CPU usage, but that would have more impact on the sub-processes, which currently seem like less of an issue.)
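For the regex point above, a sketch of the idea (the message framing here is simplified for illustration): because the pubsub payloads have a fixed shape, plain string operations can replace the regex engine.

// Split "timeline:<name><json>" into the timeline name and the JSON body
// using str::find instead of a regex. The framing is simplified; the real
// messages require slightly more bookkeeping.
fn extract(msg: &str) -> Option<(&str, &str)> {
    if !msg.starts_with("timeline:") {
        return None;
    }
    let rest = &msg["timeline:".len()..];
    let json_start = rest.find('{')?;
    Some((&rest[..json_start], &rest[json_start..]))
}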

[Blocker] Async pub/sub support for redis-rs

Regardless of how #2 is resolved, we will need to be able to asynchronously interact with redis pub-sub channels. Unfortunately, the main crate implementing a Rust API for Redis (redis-rs) doesn't support async interactions with pub-sub channels.

One potential solution—and my first instinct—was to also use the redis-async crate. (I noticed that @sphinxc0re added redis-async to our dependencies, so it's possible he was thinking along the same lines). However, after looking into the matter more closely, I noticed that redis-async isn't that widely used or actively maintained, so I'm reluctant to rely on it for such a crucial piece of the streaming server.

Given that, I'm inclined to at least look into adding support for async pub-sub channels to redis-rs. It looks like several other people are also interested in that support, and at least two have volunteered to help with a PR adding support. I intend to take a look at the code and see how large a lift adding that support would be and, potentially, to send a PR upstream.

[Architecture decision] change from actix to warp

I've been giving a lot of thought to the project's overall architecture over the past week and am strongly considering changing from the Actix-web framework to the Warp framework. There are three main reasons I think this switch makes sense:

  1. Actix uses an actor model for managing concurrency (similar to Erlang or Elixir). That's a very powerful model, but it doesn't mix well with other models for concurrency—it wants everything to be an actor. And, since we're using Redis, we'll need to interact with Redis's streams. Given that, it seems like a framework that plays a bit better with streams would be helpful.
  2. Related to the above, the change to Warp will likely make the codebase more accessible. The actor model has many fans, but I think it's fair to say that it's less widely used than other models for managing concurrency. Further, Warp's syntax tends to be less verbose and (IMO) more readable for people who don't code in Rust as often, which should help make the codebase more maintainable. (See proof of concept below.)
  3. The Warp documentation is more complete. Actix has the start of some very good documentation, but many of the crucial pages—especially those relating to the details we'd need to manage to merge the concurrency styles—are very much a work in progress (as in, their current text is just "[WIP]")

There are a few downsides to switching frameworks. Most obviously, it means giving up some of the code that's already been written. But, if we're going to switch, it's hard to find a better time than when the codebase only has 229 lines of code. Second, Warp isn't as old or as widely used as Actix (for example, it has 862 GitHub stars compared to 3,418 for Actix-web). On the other hand, Warp is built on top of Hyper (a lower-level Rust HTTP framework with over 4,400 GitHub stars) and is maintained by Sean McArthur, the same developer who built Hyper. So I'm not too worried about Warp's quality even if it is a bit newer.

Based on the above, I think it makes sense to change to Warp. I put together a proof-of-concept that shows how a Server Sent Events stream from Redis could work in Warp; as you can see, it's fairly concise:

use futures::{task, Async, Poll};
use redis;
use std::time::Duration;
use warp::{path, Filter, Stream};

// Proof of concept: a Stream of messages from a Redis pub/sub channel.
// The connection parameters and the channel name are hard-coded.
struct OutputStream {
    con: redis::Connection,
}

impl OutputStream {
    fn new() -> Self {
        let client = redis::Client::open("redis://127.0.0.1:6379").unwrap();
        let con = client.get_connection().unwrap();
        OutputStream { con }
    }
}

impl Stream for OutputStream {
    type Item = String;

    type Error = std::io::Error;

    fn poll(&mut self) -> Poll<Option<String>, std::io::Error> {
        // `as_pubsub` borrows the connection, so we re-enter pub/sub mode on
        // each poll; re-subscribing to the same channel is harmless.
        let mut pubsub = self.con.as_pubsub();
        pubsub.subscribe("timeline:1").unwrap();
        // Time out so a quiet channel cannot block the worker thread forever.
        pubsub.set_read_timeout(Some(Duration::new(1, 0))).unwrap();
        match pubsub.get_message() {
            Ok(msg) => Ok(Async::Ready(Some(msg.get_payload().unwrap()))),
            Err(_) => {
                // Nothing arrived before the timeout; schedule another poll
                // ourselves, since no external waker will re-run this task.
                task::current().notify();
                Ok(Async::NotReady)
            }
        }
    }
}

fn main() {
    let routes = warp::path!("api" / "v1" / "streaming" / "public")
        .and(warp::sse())
        .map(|sse: warp::sse::Sse| {
            let stream = OutputStream::new();
            // Send each Redis payload as an SSE `data` line, with a 1-second
            // keep-alive so proxies don't drop the idle connection.
            sse.reply(warp::sse::keep(
                stream.map(|s| warp::sse::data(s)),
                Some(Duration::new(1, 0)),
            ))
        });

    warp::serve(routes).run(([127, 0, 0, 1], 3030));
}

Unless anyone raises strong objections, I plan to start converting the code to use Warp next week.

Outage after 48 hours on 0.6.5

This service was still running, but it was not serving any content. I just restarted the service and it came back online.

Highlights:

flodgatt_1  | thread 'tokio-runtime-worker-0' panicked at 'called `Result::unwrap()` on an `Err` value: Error("invalid type: map, expected a string", line: 0, column: 0)', src/redis_to_client_stream/redis/redis_msg.rs:53:36
flodgatt_1  | note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
flodgatt_1  | thread 'tokio-runtime-worker-0' panicked at 'ClientAgent: No other thread panic: "PoisonError { inner: .. }"', src/redis_to_client_stream/client_agent.rs:93:32
flodgatt_1  | thread 'tokio-runtime-worker-1' panicked at 'ClientAgent: No other thread panic: "PoisonError { inner: .. }"', src/redis_to_client_stream/client_agent.rs:93:32
flodgatt_1  | thread 'tokio-runtime-worker-2' panicked at 'ClientAgent: No other thread panic: "PoisonError { inner: .. }"', src/redis_to_client_stream/client_agent.rs:93:32
flodgatt_1  | thread 'tokio-runtime-worker-0' panicked at 'ClientAgent: No other thread panic: "PoisonError { inner: .. }"', src/redis_to_client_stream/client_agent.rs:93:32
flodgatt_1  | thread 'tokio-runtime-worker-0' panicked at 'ClientAgent: No other thread panic: "PoisonError { inner: .. }"', src/redis_to_client_stream/client_agent.rs:93:32
flodgatt_1  | thread 'tokio-runtime-worker-0' panicked at 'ClientAgent: No other thread panic: "PoisonError { inner: .. }"', src/redis_to_client_stream/client_agent.rs:93:32
flodgatt_1  | thread 'tokio-runtime-worker-0' panicked at 'ClientAgent: No other thread panic: "PoisonError { inner: .. }"', src/redis_to_client_stream/client_agent.rs:93:32
INFO  flodgatt         > Incoming websocket request for Timeline(User(76603), Federated, All)
flodgatt_1  | thread 'tokio-runtime-worker-0' panicked at 'No thread panic (stream.rs): "PoisonError { inner: .. }"', src/redis_to_client_stream/client_agent.rs:72:28
INFO  flodgatt         > Incoming websocket request for Timeline(Hashtag(4499), Federated, All)
flodgatt_1  | thread 'tokio-runtime-worker-0' panicked at 'No thread panic (stream.rs): "PoisonError { inner: .. }"', src/redis_to_client_stream/client_agent.rs:72:28
INFO  flodgatt         > Incoming websocket request for Timeline(User(6351), Federated, All)
flodgatt_1  | thread 'tokio-runtime-worker-0' panicked at 'No thread panic (stream.rs): "PoisonError { inner: .. }"', src/redis_to_client_stream/client_agent.rs:72:28
INFO  flodgatt         > Incoming websocket request for Timeline(Public, Local, All)
flodgatt_1  | thread 'tokio-runtime-worker-0' panicked at 'No thread panic (stream.rs): "PoisonError { inner: .. }"', src/redis_to_client_stream/client_agent.rs:72:28
<Loop above forever>

I can run with RUST_BACKTRACE=1 if you need.

Status of tasks for initial release

This is a tracking issue to sort current tasks into those that must be accomplished before the initial release, those that can be completed after it, and those that I am not sure are required for the initial release.

Pre-release tasks

  • Complete refactor for type safety in config (follow-up to #63)
  • Unix socket support (#60)
  • Add postgres connection pool (#31)
  • Add healthcheck API (#67)
  • Fix boosts-on-federated-timeline bug (#83)

Post-release tasks

  • Update dependencies to unblock future performance, stability, and security fixes
  • Improve test coverage (#61, #42)
  • Improve error handling (#55)
  • Create statically-linked binary (#39)
  • Consider using async-await once it has stabilized

Undetermined

  • Add SSL support for Postgres (#41)
  • Add WHITELIST_MODE support (#40)
  • Add configuration variable for TRUSTED_PROXY_IP (#23)

@Gargron, any thoughts on whether any of the items in "Undetermined" need to be completed before release?

Improve API/contributor documentation

Once we hit 1.0.0 and have updated our dependencies to take advantage of Rust's new async/await syntax, we should make sure the internal API documentation is in good enough shape that new contributors can understand the code and quickly get up to speed. We already have pretty extensive documentation, but I think taking another pass to make sure everything is up-to-date will make sense at that point.

Panic on Following locked account

Easily reproducible: try to follow a locked account and boom

flodgatt_1 | thread 'tokio-runtime-worker-0' panicked at 'called `Result::unwrap()` on an `Err` value: Error("unknown variant `follow_request`, expected one of `follow`, `mention`, `reblog`, `favourite`, `poll`", line: 0, column: 0)', src/redis_to_client_stream/redis/redis_msg.rs:53:36

[screenshot: backtrace]

lol.. pointless backtrace

Type safety for hostnames

As of #71, Flodgatt allows the user to supply an arbitrary string as the REDIS_HOST or DB_HOST and will then attempt to parse it as a hostname or IP address. However, for the BIND env variable (its own address), Flodgatt only accepts a valid IPv4 address or the string "localhost".

We should probably remove this inconsistency, perhaps by type-checking all three of them via ToSocketAddrs.
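A sketch of the uniform check via the standard library:

use std::net::ToSocketAddrs;

// Validate any host string (hostname, IPv4, or IPv6) by asking the resolver;
// port 0 is a placeholder, since only the host part matters here.
fn validate_host(host: &str) -> Result<(), String> {
    (host, 0)
        .to_socket_addrs()
        .map(|_| ())
        .map_err(|e| format!("{} is not a valid host: {}", host, e))
}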

filters_changed event not handled correctly

Flodgatt does not correctly parse the filters_changed event when Redis emits it (Flodgatt currently expects all events coming from Redis to have a payload, but the filters_changed event does not).
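A sketch of the fix (type name illustrative): make the payload optional in the deserialized type, so payload-less events like filters_changed parse cleanly.

use serde::Deserialize;
use serde_json::Value;

#[derive(Deserialize)]
struct RedisEvent {
    event: String,
    payload: Option<Value>, // None for filters_changed
}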

Add full unit tests

Before releasing version 1.0, we should add unit tests back in. At a minimum, we should re-add the tests that I recently temporarily disabled (which will mean updating the relevant type signatures).

In an ideal world, we should add some additional tests for the various run-time options/configuration (and maybe for a few other areas as well).

Tracking issue for rc1

As noted on the issues page, there are several auxiliary issues (mostly docs and tests) that I would like to get done before we actually launch 1.0.0. However, I am not currently aware of any technical issues that would prevent us from announcing a release candidate while we finalize those auxiliary matters.

It makes sense to give the current level of testing at least a bit more time before we do so, and this issue can track any blockers that come up. But, if the current testing doesn't expose any significant issues, it may make sense to publish an rc1 release soon.

Project name change

Candidates: tootstrom or námskeiðið

The latter sounds a bit more serious.

High iowait time/CPU use

Flodgatt currently has spikes in CPU time attributable to iowait that occur roughly 30-45 seconds apart. These need to be resolved.

Publish to crates.io (if we want to)

If we're planning to publish Flodgatt to crates.io (as is typical for FOSS written in Rust), we should probably do so before 1.0.0 so that we can mention cargo install as an installation option. Publishing to crates.io isn't essential for a binary application – it's not like Flodgatt is a library crate that other Rust programs will have as a dependency.

On the other hand, it does have a few minor advantages: it lets developers who already have the Rust toolchain installed easily install Flodgatt with cargo install; it provides automatic hosting for the API docs in the format Rust developers are used to; and it will generally make it easier for new contributors to quickly get started. (Also, though probably less importantly, it will make sure that new versions of Rust are tested against Flodgatt, which protects us against the very unlikely eventuality of breaking changes in Rust).

Consider migrating to Rocket.rs

As @sphinxc0re mentioned in #4, we could consider migrating the codebase to SergioBenitez/Rocket. This would have several advantages:

  • Popular Rust web framework
  • Potentially more familiar syntax for developers familiar with other languages/frameworks
  • On path to running on stable Rust
  • Uses latest libraries (e.g., futures 0.3.x)
  • Will use async/await once they land

It would also have several cons:

  • Not yet running on stable Rust
  • Slower (at least the last time I looked into the matter)
  • No built-in support for Server Sent Events (but see Support for Server Sent Events · Issue #33 · SergioBenitez/Rocket)
  • Less functional code (this is a personal preference, but I just really like the way Warp handles its code flow: the idea of using filters is very functional and strikes me as very clean)
  • Requires refactoring/rewriting existing code.

Overall, I have a strong preference for staying on stable Rust for an app that will run in production. So I currently plan to stick with Warp. But I'm certainly open to changing my mind if anyone convincingly argues in favor of Rocket (or a different framework). I'm also open to revisiting the issue if/when Rocket lands on stable Rust.

JSON deserialization uses two passes when one would be better

We currently parse the JSON we get from Redis in two passes: when we first get it from Redis, we deserialize it into a struct that looks like

struct RawMessage {
    event: String,
    payload: serde_json::Value,
}

Later, we process the payload into a Status/Notification/etc.

My thinking in splitting up the processing like this was to do less work in the first step. That step takes place while we are polling Redis, and thus prevents other threads from polling – so anything we can do to speed up that first step improves performance (even if it requires more work later, in a non-blocking context).

However, after benchmarking with criterion, I have determined that splitting up the processing like this actually makes the first step slower rather than faster. (This makes a certain amount of sense after thinking about it: when deserializing into a Value, serde has less information about the shape of our result and thus must attempt to deserialize into more types. The extra information we provide serde makes up for any extra work it does.)

Thus, we can deserialize our type only once, which both improves performance and gives us more type safety/better semantics for the first stages of our code. This will require a change similar in spirit to #93, but will involve much less of a change to Flodgatt's overall structure.
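A sketch of the one-pass shape (type and variant names illustrative): serde's adjacently tagged enums let the event field select the payload type directly, with no intermediate Value.

use serde::Deserialize;

#[derive(Deserialize)]
struct Status { /* fields elided */ }

#[derive(Deserialize)]
struct Notification { /* fields elided */ }

// serde reads `event`, picks the variant, and deserializes `payload`
// straight into the concrete type in a single pass.
#[derive(Deserialize)]
#[serde(tag = "event", content = "payload", rename_all = "snake_case")]
enum Event {
    Update(Status),
    Notification(Notification),
    Delete(String),
}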

Technically, this could be done post-rc1. It's not adding any features (just changing the internal structure). That said, it's a large enough refactor that I'd like to get it in before rc1 just to avoid any risk of regressions.

Add tests for ClientAgent and Receiver

After #38, Flodgatt has comprehensive tests for parsing incoming requests. However, it does not yet have tests for correctly handling those requests. This is a bit more complicated to test—the functions involved are less pure, for one thing—but we should also add tests for that half of the functionality.

document known good nginx config

I've tried to just swap out the streaming Node server for flodgatt, but nginx seems to have a problem accessing flodgatt with the same reverse proxy config.

I need to confirm why regular streaming works but flodgatt is broken as below.

[screenshot: nginx error]

Looks fine inside the Docker container. Can you please add additional debug information about which port is being bound?

[screenshot: port bindings inside the container]

Postgres host not parsed correctly

When using DATABASE_URL with a host that is a private IP, e.g. 10.10.10.10, it doesn't seem to parse it correctly, leading to Connection refused errors.
