
services-flake's Issues

Postgres: use `psql` instead of `postgres` CLI to load the `initialScript`

SQL dumps can have multi-line queries, for instance:

CREATE TABLE users (
  id SERIAL PRIMARY KEY,
  name VARCHAR(50) NOT NULL,
  email VARCHAR(50) NOT NULL UNIQUE
);

In such a case, using postgres -E (as done here) will error out because the command expects single-line statements.

One fix is postgres -j -E, but I think it is better to load the dumps with the psql command instead, as suggested in the official PostgreSQL documentation.

Proposed solution

Context: The reason for using postgres is that we won't have the postgres server running while the init script is running.

We can run initialScript.before using postgres (as init script, before starting the server) and have initialScript.after run with psql as a process dependent on the postgres server.

Ideally, to keep things uniform, I would prefer running all init SQL commands via psql.
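A minimal sketch of the psql-based approach, assuming process-compose-flake's settings.processes interface. The process name pg1-init-after, the server process name pg1, and the connection parameters are hypothetical and not taken from the services-flake source:

```nix
# Hypothetical process-compose-flake module fragment, not the actual services-flake code.
{ pkgs, ... }:
let
  # Multi-line statements are fine here because psql parses the file itself.
  initSql = pkgs.writeText "init.sql" ''
    CREATE TABLE users (
      id SERIAL PRIMARY KEY,
      name VARCHAR(50) NOT NULL,
      email VARCHAR(50) NOT NULL UNIQUE
    );
  '';
in
{
  settings.processes."pg1-init-after" = {
    # ON_ERROR_STOP makes psql exit non-zero on the first failing statement.
    command = "${pkgs.postgresql}/bin/psql -v ON_ERROR_STOP=1 -h 127.0.0.1 -p 5432 -d postgres -f ${initSql}";
    # Run only once the server process (assumed to be named "pg1") reports healthy.
    depends_on."pg1".condition = "process_healthy";
  };
}
```

Because the script runs as a separate process that depends on the server, it needs a readiness check on the server process to be meaningful.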

postgres: setting superuser to value different than current user causes setupInitialDatabases to fail

to reproduce

from the root of this repo:

  1. check out https://github.com/rhinofi/services-flake/tree/postgres-superuser-issue (see diff from current main)
  2. cd example
  3. PC_DISABLE_TUI=true nix run

the output will be something like:

warning: Git tree '/code/services-flake' is dirty
[pg1-init	] The files belonging to this database system will be owned by user "adrian".
[pg1-init	] This user must also own the server process.
[pg1-init	] 
[pg1-init	] The database cluster will be initialized with locale "C".
[pg1-init	] The default text search configuration will be set to "english".
[pg1-init	] 
[pg1-init	] Data page checksums are disabled.
[pg1-init	] 
[pg1-init	] creating directory ./data/pg1 ... ok
[pg1-init	] creating subdirectories ... ok
[pg1-init	] selecting dynamic shared memory implementation ... posix
[pg1-init	] selecting default max_connections ... 100
[pg1-init	] selecting default shared_buffers ... 128MB
[pg1-init	] selecting default time zone ... Europe/London
[pg1-init	] creating configuration files ... ok
[pg1-init	] running bootstrap script ... ok
[pg1-init	] performing post-bootstrap initialization ... ok
[pg1-init	] syncing data to disk ... ok
[pg1-init	] 
[pg1-init	] 
[pg1-init	] Success. You can now start the database server using:
[pg1-init	] 
[pg1-init	]     /nix/store/ki3srrjjzqalvh0hd9lmqavp5v9wr9jp-postgresql-14.9/bin/pg_ctl -D ./data/pg1 -l logfile start
[pg1-init	] 
[pg1-init	] initdb: warning: enabling "trust" authentication for local connections
[pg1-init	] You can change this by editing pg_hba.conf or using the option -A, or
[pg1-init	] --auth-local and --auth-host, the next time you run initdb.
[pg1-init	] 
[pg1-init	] PostgreSQL initdb process complete.
[pg1-init	] 
[pg1-init	] Setting up postgresql.conf
[pg1-init	] 
[pg1-init	] PostgreSQL is setting up the initial database.
[pg1-init	] 
[pg1-init	] waiting for server to start....2024-02-21 16:15:28.951 GMT [668791] LOG:  starting PostgreSQL 14.9 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 12.3.0, 64-bit
[pg1-init	] 2024-02-21 16:15:28.953 GMT [668791] LOG:  listening on Unix socket "/code/services-flake/example/data/pg1/pg-init-k2kvQC/.s.PGSQL.5432"
[pg1-init	] 2024-02-21 16:15:28.960 GMT [668792] LOG:  database system was shut down at 2024-02-21 16:15:28 GMT
[pg1-init	] 2024-02-21 16:15:28.963 GMT [668791] LOG:  database system is ready to accept connections
[pg1-init	]  done
[pg1-init	] server started
[pg1-init	] Checking presence of database: sample
[pg1-init	] psql: error: connection to server on socket "/code/services-flake/example/data/pg1/pg-init-k2kvQC/.s.PGSQL.5432" failed: FATAL:  role "adrian" does not exist
[pg1-init	] 2024-02-21 16:15:29.048 GMT [668803] FATAL:  role "adrian" does not exist
[pg1-init	] 0
[pg1-init	] Creating database: sample
[pg1-init	] 2024-02-21 16:15:29.056 GMT [668806] FATAL:  role "adrian" does not exist
[pg1-init	] psql: error: connection to server on socket "/code/services-flake/example/data/pg1/pg-init-k2kvQC/.s.PGSQL.5432" failed: FATAL:  role "adrian" does not exist

Initial release

We are on our way to making an initial release and a formal announcement.

Checklist:

  • Finish documentation (sufficiently)
  • Create release tag, with empty CHANGELOG.md
  • Blog post
  • Announce.

After this release, we will record all changes in the CHANGELOG.md

Add Weaviate service

For AI projects in Juspay, we are using a vector database called Weaviate. Need to add support for this in services-flake.

Nix package

  • Expose options to configure the Weaviate server
  • Enable starting weaviate-server in a process-compose window upon nix run

Postgres: add `depends_on` option

This option will be useful to append extra depends_on configuration to the <name>-init process.

An example scenario is when a replica DB has to depend on another process that runs pg_basebackup.
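As an illustration of that scenario, usage could look roughly like the sketch below. The depends_on option is the one proposed here and does not exist yet; the names replica and pg-basebackup are hypothetical:

```nix
# Hypothetical usage of the proposed option; it is not part of the postgres service today.
{
  services.postgres."replica" = {
    enable = true;
    # Would be appended to the dependencies of the generated "replica-init" process,
    # so initialization waits until the (assumed) pg-basebackup process has succeeded.
    depends_on."pg-basebackup".condition = "process_completed_successfully";
  };
}
```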

Why are we overriding grafana?

There should be a comment in this file saying why we must maintain our own grafana package:

https://github.com/juspay/services-flake/blob/main/test/overlays/grafana.nix


Such a comment already exists on pgAdmin:

# Because tests are failing on darwin: https://github.com/juspay/services-flake/pull/115#issuecomment-1970467684
pgadmin4 = prev.pgadmin4.overrideAttrs (_: { doInstallCheck = false; });

(Though here, this overlay should only apply on darwin).
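One way to restrict the override to darwin, sketched as a standalone overlay (an assumption about how the file could be structured, not the current contents of the repository):

```nix
# Sketch of a darwin-only override; on other platforms the package is passed through unchanged.
final: prev: {
  pgadmin4 =
    if prev.stdenv.isDarwin
    # Because tests are failing on darwin (see the PR comment linked above).
    then prev.pgadmin4.overrideAttrs (_: { doInstallCheck = false; })
    else prev.pgadmin4;
}
```

Keeping the condition inside the attribute value, rather than around the whole attribute set, means the overlay's attribute names do not depend on the platform.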

Add `testScript` to individual services

For example, both the pg1 and pg2 submodules can provide their own testScript:

services.postgres."pg1" = {
  enable = true;
  listen_addresses = "127.0.0.1";
  initialScript.before = "CREATE USER bar;";
  initialScript.after = "CREATE DATABASE foo OWNER bar;";
};
services.postgres."pg2" = {
  enable = true;
  listen_addresses = "127.0.0.1";
  port = 5433;
};

Do after #32

Write docs for all services

The following services are missing docs:

  • Apache Kafka
  • [[clickhouse]]#
  • Elasticsearch
  • MySQL
  • Nginx
  • PostgreSQL
  • Redis
  • Redis Cluster
  • Zookeeper
  • [[grafana]]#
  • [[prometheus]]#
  • [[pgadmin]]#
  • [[cassandra]]#

Multiple services

cf. "This also solves the problem of only being able to run one instance at a time." of cachix/devenv#75

Consider this scenario

passetto adds a postgresql service. But the user (nammayatri) may add their own postgresql service. Can the two DB coexist?

What we can do

Should we adjust the module API to accommodate this? For example, instead of:

services.postgresql.enable = true;

do we instead do:

services.postgresql."some-name".enable = true;

And consistently use this API for all services?

Concrete goal

This should work in nammayatri, which has its own postgresql as well as a postgres db used for passetto.

clickhouse: `extraConfig` should be an attrset

extraConfig = lib.mkOption {
  type = types.lines;
  description = "Additional configuration to be appended to `clickhouse-config.yaml`.";
  default = "";
};

This type makes the user specify raw config (an error-prone process as well):

      services.clickhouse."clickhouse-db" = {
        extraConfig = ''
          http_port: 8123
        '';
      };

But ideally it should use a freeformType (pkgs.formats.yaml); see example here. Then the user can just write:

      services.clickhouse."clickhouse-db" = {
        extraConfig.http_port = 8123;
      };
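A minimal sketch of such an option, using the YAML format's type directly on the option instead of a freeformType on a submodule, which is a simpler variant of the same idea. The surrounding module layout is an assumption:

```nix
# Sketch only: a structured replacement for the `types.lines` option shown above.
{ pkgs, lib, ... }:
let
  # pkgs.formats.yaml provides both an option type and a generator for YAML files.
  yamlFormat = pkgs.formats.yaml { };
in
{
  options.extraConfig = lib.mkOption {
    type = yamlFormat.type;
    default = { };
    description = "Additional configuration to be merged into `clickhouse-config.yaml`.";
  };
}
```

The final file could then be produced with yamlFormat.generate "clickhouse-config.yaml" applied to the merged attribute set, so malformed YAML is ruled out at evaluation time.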

Formatting checks in CI are limited to `./dev`

...treefmt-check.drv logs from nixci:

treefmt-check> traversed 2 files
treefmt-check> matched 1 files to formatters
treefmt-check> left with 1 files after cache
treefmt-check> of whom 0 files were re-formatted
treefmt-check> all of this in 500ms

treefmt is only traversing the two files present in the dev sub dir.

Replace vm tests with native tests

Unlike NixOS services, process-compose programs created by services-flake do not mutate outside of dataDir, so we gain very little from the isolation provided by VMs.

Let's switch to running these tests natively, which gets us macOS support (resolving #14), in addition to allowing us to use GitHub Actions for CI (#11).

Implementation idea

Use process-compose itself to run the tests: an outer process-compose runs the inner one, with readiness checks triggering the test process. We can write the tests in bash or, better, nushell, because it can easily process the JSON from the process-compose API.

Kafka service doesn’t start on non-default port

How to reproduce?

nix run github:shivaraj-bh/nammayatri/2bf9276b157f92f0b898852128b9632c95a97bb1#services
Screenshot 2024-01-13 at 4 23 22 PM

As can be seen in the screenshot above, the kafka process is still trying to connect to port 9092, even though the configuration sets the port to 29092.

Possible solution

I am not entirely sure of the cause but I will try to rebase the kafka service on this: https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/services/misc/apache-kafka.nix

And additionally include tests for this scenario.

Template to add a new service

A just command that takes the service name as an argument and creates nix/<service-name>.nix and nix/<service-name>_test.nix with a hello-world service example.

  • Maybe also add the files in the respective lists of nix/default.nix and test/flake.nix?

  • Update README to add the service

  • Comments specifying to take inspiration from either the nixos module or devenv module if either of them already implement the service.

postgres: provide a way to add to default settings instead of simply overriding all of them

the following defaults:

default = {
  listen_addresses = config.listen_addresses;
  port = config.port;
  unix_socket_directories = lib.mkDefault config.dataDir;
  hba_file = "${config.hbaConfFile}";

are needed for the setup to function correctly and would rarely need to be overridden

a user however would often want to add more settings

in my case I'd like to add shared_preload_libraries = 'timescaledb'

we could add extraSettings which would be merged with settings or move current defaults to defaultSettings and leave settings empty, for the user to set
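a third possibility, sketched below under the assumption that settings is an attribute-set option, is to define the defaults as ordinary mkDefault definitions rather than as the option's default, so that user-supplied keys merge with them instead of replacing them (the option type here is a guess, not the one used in the repo):

```nix
# Sketch only: assumes listen_addresses, port, dataDir and hbaConfFile are declared elsewhere.
{ config, lib, ... }:
{
  options.settings = lib.mkOption {
    type = with lib.types; attrsOf (oneOf [ bool float int str ]);
    default = { };
  };

  # Definitions (unlike `default`) are merged with whatever the user sets;
  # mkDefault lets the user still override any single key.
  config.settings = {
    listen_addresses = lib.mkDefault config.listen_addresses;
    port = lib.mkDefault config.port;
    unix_socket_directories = lib.mkDefault config.dataDir;
    hba_file = lib.mkDefault "${config.hbaConfFile}";
  };
}
```

with this, settings.shared_preload_libraries = "timescaledb"; would be added alongside the defaults rather than discarding them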

Dealing with non-root $PWD

cd ./example
mkdir tmp
cd ./tmp
nix run

This creates the data directory in ./example/tmp/data (instead of ./example/data as one would expect, i.e. in the flake root). What is the best way to address this? In nammayatri we use flake-root + mission-control module to side-step the problem.

One possible solution here is to have services-flake require the use of flake-root module, and have the data directory use the flake root directory determined by the latter module.

CI: binary cache requests by github-runners are blocked

Due to the large number of requests to fetch the cache stored in magic-nix-cache, GitHub responds with:

response body:

       GitHub API error: API error (429 Too Many Requests): StructuredApiError { message: "Request was blocked due to exceeding usage of resource 'Count' in namespace ''." }; retrying in 257 ms

triggering rebuild of several packages.

Possible solution

Can we use garnix cache instead?

add baseDataDir which could be shared between services

I'd like to override the default base dataDir for all services, without having to re-state the service name in each override

I'd like to do this:

let
  baseDataDir = "./.data";
in {
  services.redis."redis-1" = {
    enable = true;
    inherit baseDataDir;
  };
  services.postgres."pg-1" = {
    enable = true;
    inherit baseDataDir;
  };
}

instead of:

let
  baseDataDir = "./.data";
in {
  services.redis."redis-1" = {
    enable = true;
    dataDir = "${baseDataDir}/redis-1";
  };
  services.postgres."pg-1" = {
    enable = true;
    dataDir = "${baseDataDir}/pg-1";
  };
}

while preserving the original data dir structure

see possible implementation here
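independently of the linked implementation, the idea can be sketched as a per-service submodule in which dataDir defaults to baseDataDir plus the instance name (the option names follow this proposal; the "./data" default is an assumption based on the current directory layout):

```nix
# Sketch only: `name` is the attribute name of the service instance, e.g. "pg-1".
{ name, config, lib, ... }:
{
  options = {
    baseDataDir = lib.mkOption {
      type = lib.types.str;
      default = "./data";
      description = "Directory under which this service's data directory is created.";
    };
    dataDir = lib.mkOption {
      type = lib.types.str;
      # Preserves the existing <base>/<service-name> structure.
      default = "${config.baseDataDir}/${name}";
      description = "Data directory for this service instance.";
    };
  };
}
```

setting dataDir explicitly would still take precedence, so existing configurations keep working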
