jackc / tern Goto Github PK

View Code? Open in Web Editor NEW

827.0 9.0 65.0 1.1 MB

The SQL Fan's Migrator

License: MIT License

Go 100.00%

tern's People

Contributors

Stargazers

Watchers

tern's Issues

orthogonal migration support (multiple migrations).

This one is connected to #6. We are building a JS framework ontop of plv8, one of the things we need to do is manage migrations for parts of the JS separately from the schema elements. The JS would be upserted into the tables when a version bump occurs.

For this I need to manage versions -- tern's version table does not contain a migration name.

If we can add a name field to the table it would enable the following method:

func (m*Migrator) UserMigration(name string, currentVersion uint8, um function(*pgx.Conn) error) (previous uint8, err error) {
   // The user function is reached if currentVersion > version in db table.
   // if this returns no error than the db version table row with name is updated to currentVersion
}

The version table could be migrated to contain a name field and the legacy nameless migrations could be given the name "default" ?

I know this is probably crossing the use-case boundaries. But, since pgx does so much more than an average pg driver -- perhaps it makes sense.

Run Migration not in a transaction

I can see in package migrate that there is an option to DisableTx,
However I can't figure out how to disable them when simply calling the tern command line.
If I have 5 migration files, and 1 of them should not be run inside of a transaction, how do I do this?

Confirmation step for production

It might be nice to optionally have some sort of guard rail or required confirmation step for production migrations. I typically use ctrl+r to get my migrate command from my shell history. It is too easy to accidentally get and run the production migrate command when wanting to migrate development.

It would be nice to have a confirmation step that required typing the name of the environment.

This could be toggled in the config files locally or with a special table remotely. Local config might be simpler, but it could be more difficult when deploy target is determined by environment variable or program argument. Also, if it was a remote table it could not be bypassed as easily.

Allow override of config file

It would be great if it was possible to override values in the config file via environment variables or flags passed to tern during invocation.

For example, given a standard tern config, override host via:

PGHOST=foo tern migrate

tern --pg-host=foo migrate

This is very useful for certain environments such as CI where the database could be run in a linux container that is linked into the container currently running the test suite. For example, in Docker when you link two containers it exports certain environment variables, such as $PG_PORT_5432_TCP_ADDR. It would be great to invoke tern like so:

PGHOST=$PG_PORT_5432_TCP_ADDR tern migrate or via the flag approach above.

Wrapping migration files in transactions

Hello,

since tern use transactions for each migration, is it safe to wrap each migration file between BEGIN/COMMIT statements? We are migrating from golang-migrate to tern and we have these, so I wonder if I need to remove them. It appears to work just fine, but would love to hear an official word :)

Thanks!

Another approach to conflicting migrations

Would be awesome if tern has the option to choose how prefix number is generated when create a migration file.
To avoid migration conflicts, "rails like" approach is more simple and effective.
For example, instead generate a sequential number, tern would generate a timestamp numbered prefix, e.g. 20220512190701_create_foo

What is best strategy to create trigger?

Hello,
What is best strategy to create trigger?
I'm trying to create trigger in same migration file where create table is:

-- This is a sample migration.
create table people(
  id serial primary key,
  first_name varchar not null,
  last_name varchar not null,
  created_at timestamp without time zone NOT NULL DEFAULT CURRENT_TIMESTAMP,
  updated_at timestamp without time zone,
);

CREATE TRIGGER set_update_timestamp
    BEFORE UPDATE 
    ON people
    FOR EACH ROW
    EXECUTE PROCEDURE trigger_set_timestamp();

---- create above / drop below ----

drop table people;

I got error:
ERROR: syntax error at or near ")" (SQLSTATE 42601)
LINE 9: )
^

Archiving old migrations

After hundreds of migrations the migrations directory can get a little awkward to scroll through. It would be nice to have some way to move old migrations to a subdirectory to get them out of the way.

session-level only advisory lock

tern uses session level advisory lock that may be a problem for systems with pgbouncer, cause pgbouncer reuses session.
It would be great to have ability to use transaction level advisory locks.

I've forked the project and tried to change advisory lock type, but it can't be done without serious changes of the project structure.
It may be worth to do locking at higher level (not inside MigrateTo & ensureSchemaVersionTableExists), to allow user to select correct locking strategy for their infrastructure.

Thanks.

Small bug with file suffixes (duplicate sql suffix)

This probably is not considered a bug but it is a small inconvinience nevertheless and it is at the end the user's fault anyway.

when creating a new migration with the tern new command if you accidentally write the .sql suffix yourself then the file name will have the .sql suffix twice

Example

when running the following command:

tern new create_user.sql

the file it would create would be named: 001_create_user.sql.sql

Another proposal for selectively disabling transactional migrations

See #36 and #17 (comment) for more of the origin of this request.

Some commands cannot be run in a migration. e.g. create index concurrently. Even if DisableTx is used only one statement can be given per migration because the entire migration is sent as a single simple protocol query which is implicitly transactional. See #43 for a failed idea of removing explicit transactions.

There are two major problems.

DisableTx is awkward to use and is a a blunt instrument. i.e. it is not controllable per migration. I propose replacing that with a magic comment per migration. Something like ---- disable-tx ---- at the top.
The only way I can see to have multiple statements in a single migration that is not implicitly transactional is to parse the migration and send each SQL statement as its own simple protocol query message. I'm not super eager to require a SQL parser, but in this case all it needs to do is split each statement. It doesn't need to actually understand the statement. I've already written a parser for pgx to sanitize SQL parameters which detects comments and quoted strings. It might be possible to adapt that to split on any ; that is not in a comment or quote without too much effort.

SSH tunnel fails: "ssh: must specify HostKeyCallback"

This is what I did:

Installed go and tern
Configured a tern.conf file with [database] and [ssh-tunnel] entries
Added SSH private key to ssh-agent
Created a dummy migration file
Tried to execute tern migrate but got the following error:

Unable to connect to PostgreSQL:
  ssh: must specify HostKeyCallback

Seems that from version 1.8.3, Go made some braking changes on SSL connections: golang/go#19767

port ignored in tern.conf

System: macOS Monterey version 12.4
tern version: 1.13.0

I'm currently facing a weird issue where port is being ignored in tern.conf. I specified it to be 9200 and tern still uses 5432. I have a docker-compose file that looks like this:

version: "3"
services:
  db:
    container_name: postgres
    restart: always
    image: postgres
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: tern_test
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
    ports:
      - "9200:5432"
volumes:
  postgres_data:

I started this up in a separate terminal with docker-compose up, I can confirm the pg server is up and running and listening for connections. I then have the following tern.conf:

[database]
host = 127.0.0.1
port = 9200
database = tern_test
user = postgres
password = {{env "MIGRATOR_PASSWORD"}}
sslmode = prefer

When I run tern with MIGRATOR_PASSWORD=postgres tern migrate, I get the following error:

Unable to connect to PostgreSQL:
  failed to connect to `host=127.0.0.1 user=postgres database=tern_test`: dial error (dial tcp 127.0.0.1:5432: connect: connection refused)

If I change my docker-compose.yml to expose it on local port 5432, it works of course. I've also tried other ports and it only uses 5432. I'm using macOS Monterey version 12.4 (I saw another issue here specific to Windows, not sure if the OS is even relevant here).

Running migrations in a schema

I'd like to run all migrations in a specific schema; the "obvious" way to do this is something like:

db, err := pgx.Connect(ctx, "dbname=test")
_, err := db.Exec(ctx, "set search_path to myschema")

m, err := migrate.NewMigratorEx(...)

The problem with this is that MigrateTo() calls reset all over here: https://github.com/jackc/tern/blob/master/migrate/migrate.go#L372-L373

This resets the search_path and only the first migration will work (quite confusing!) The reason that reset all is in there is probably because people set the search_path in their migrations, in which case you don't want it to carry over to the next migration.

As a workaround, doing it in OnStart works:

m.OnStart = func(int32, string, string, string) {
    _, err = db.Exec(context.Background(), "set search_path to myschema")
    if err != nil {
        panic(fmt.Errorf("Migrate: %w", err))
    }
}

The downside of this is that there's no good way to signal errors from there, except a panic.

Not sure what the best way to fix this would be; adding an error return to OnStart isn't compatible, but could maybe add a SearchPath option? Or get the current search_path before running migrations and resetting it to that after every migration?

Purpose of MigratorOptions.FileSystem (v2-dev)

Hey,

I am looking into the v2-dev branch a bit and I have some concerns about the FileSystem value from the MigratorOptions type. What is it used for? It looks like both LoadMigrations and FindMigrations function do take fs.FS argument already. From the history it looks like it was part of the type, so it is perhaps a leftover? Is the intention to keep using the config type, or to pass the filesystem as arguments?

Thanks!

Migrations not actually run in a transaction?

Reading the code for migration, it looks like a transaction is started here:

tern/migrate/migrate.go

Line 339 in ebefbe4

tx, err = m.conn.Begin(ctx)

but tx is not actually used to execute the SQL, so the migration is not actually run within a transaction.

tern/migrate/migrate.go

Line 352 in ebefbe4

_, err = m.conn.Exec(ctx, sql)

@jackc: Am I reading that wrong or is there a bug here?

Can we specify migration metadata table name?

Hi,
we have 3 microservices and I would like to use tern with them. They share same database but different schemas and tables.

Can we specify the tablename used by tern to record migrations?

port in config file not respected

I have a non-standard port for a postgres server and specifying it in the tern.conf file did not pass it through to the connection properly (it still tried to use 5432). Specifying the custom port with --port on the command line did work.

Feature Request: Parse entire DSN from environment variable

A useful feature might be to allow parsing the DSN fully from the environment, as a decent number of services expose something like DATABASE_URL which is often in a url format. In those cases, it's more difficult to manually parse out the pieces into a config file.

Add support for fs.FS

Since go1.16 is released now I just added this to a project to integrate with embed.FS.

It would probably be handy to have built in and io.fs probably won't be going anywhere soon.

Something like this maybe with an exported constructor function. Maybe with build in support to specify a subdirectory NewIOFSMigratorFS(fs.FS) would make it clear which migratorfs it is but that name is kind of a mouthful.

type iofsMigratorFS struct{ fsys fs.FS }

func (m iofsMigratorFS) ReadDir(dirname string) ([]fs.FileInfo, error) {
	de, err := fs.ReadDir(m.fsys, dirname)
	if err != nil {
		return nil, err
	}
	var res []os.FileInfo
	for _, v := range de {
		fi, err := v.Info()
		if err != nil {
			return nil, err
		}
		res = append(res, fi)
	}
	return res, nil
}

func (m iofsMigratorFS) ReadFile(filename string) ([]byte, error) {
	return fs.ReadFile(m.fsys, filename)
}

func (m iofsMigratorFS) Glob(pattern string) (matches []string, err error) {
	return fs.Glob(m.fsys, pattern)
}

Feature: Docker image

For my projects I'm usually building a Docker image with the migration scripts included. Now I'm building images from scratch in various projects, but in order to safe some CI minutes I'm looking to build a single "tern" base image for all of my projects.

If you are interested in a Docker image for this repository, I can commit a Dockerfile here and (help) setup a CI environment. This can be either Github actions, or directly building it on Docker hub.

Configurable amount of leading 0s for migration files

I've noticed that the amount of leading zeros is hard-coded to 3.

While 999 is a lot, for a long-lived project, it's quite possible that it won't be enough. I did a test, and after going over that limit, tern new a creates 1000_a.sql, but then running tern new a again fails with the following message:

Error loading migrations:
  Missing migration 100

Configuration could be done via cli flag or .conf file. It could also be inferred from the other files in the migration folder (would require manual editing of the first migration to set the wanted number of leading zeros, but would work automatically from then on).

If nothing else, at least a better error message for the 1001st migration would be nice, as it now is it's a bit misleading.

I could try my hand at implementing this feature, if need be!

Roll up migrations

Sometimes it is convenient to roll up or flatten a range of migrations to a single migration. See #50 (comment) for initial request. I have also personally flattened migrations by hand multiple times.

There are two advantages to this feature.

Initial database setup may be quicker.
If the original migration files are removed it would declutter the migration directory. However, #50 could also solve this.

There are also a few downsides or tricky issues.

There is no way to perfectly roll up a set of migrations. When I have done this by hand I have used pg_dump to generate the flattened migration. However, migrations can depend on the database settings (default collations, permission grants, etc.), tern config variables, and environment variables. pg_dump freezes the results. This means the roll up migration could do something different than the original migrations.

The other is what to do with down migrations. If migrations 1 - 50 are rolled up into one migration what happens when we migrate down to 49. Or for that matter try to migrate up from 0 to a version less than 50. That would require keeping the old migrations around and deciding whether it is possible to use a roll up migration or fall back to the original migrations. Possible, but messy. Hmm... maybe down migrations always use the original migrations and up migrations only use a roll up if starting from 0 and going at least up to the end of the roll up.

Definitely a nice to have, but I'm not sure how much complexity it would introduce and if that complexity would be worth it.

Always use a schema for expressing the schema version table

With the currently unprefixed schema_version table the following sequence of events is possible.

First tern run creates schema_version in public
You run a migration that creates a schema with the same name as the user
Next run of tern now creates schema_version in $user
tern now attempts to re-apply migrations from 0, potentially causing havoc

Simply specifying versionTable: "public.schema_version" solves this. Using a pgx.Identifier for versionTable in type Migrator will encourage people to do this.

Add support for letting client manage the transaction for a migration.

Thanks for a great job on the driver. I am currently using the migration package directly. I need to manage the transaction a migration runs in as I am creating tenant schemas and stuff has to occur before and after the transactions. I am wondering if there is some way to detect if the connection already has a transaction associated with it -- does the TxStatus byte provide any clues ?.

If not could we get a new constructor NewMigratorNoTX that makes MigrateTo not use a Tx ? Happy to provide a PR.

support jackc/pgx/v4

support github.com/jackc/pgx/v4 in plans?

Tern installation doesn't seem to work as described

Not sure why, but I tried: go get -u github.com/jackc/tern in my project root, which has go.mod, etc.. and in my go.mod I see tern is added. However, running tern tells me its not installed. Does this not work when running it in a go mod based project? I also tried that in an empty directory, still no go. Is the above supposed to install it somehow so that it runs anywhere I am at?

It would be fantastic to see more details around how it may be used in a CI/CD setup, how would a developer typically use it in say a microservice or what not.. and how it would be incorporated in a new empty env that has a fresh install of the DB but nothing yet created.

More error info on migration error

When something goes wrong in a migration, display all the information possible to get back from the server.

Windows support

How can I use tern command in the windows?
Maybe, how can I create alias in windows?

Broken tests

Since the dep bump to pgx/v4 most of the tests are broken.

Functions like NewMigrator have a dependency on V3 of pgx, where in other parts of the tests we seem to have a rogue context injected into function calls m.MigrateTo(context.Background(), 3)

Remove explicit transactions from executing a migration

tern runs each migration within a transaction to ensure that the version table is also updated with the migration.

This can be inconvenient when running a migrations in an existing Tx context (#4) or when a migration needs to manage its own transactions (such as large data updates that are executed in multiple steps to avoid a long lock). #5 added the ability to disable transactions from code but it was never exposed to the CLI (#36).

I think I may have come up with an improvement that removes transactions and thereby any ability or interface to disable them.

The entire text of a migration is sent at once via the simple query protocol. The simple query protocol implicitly transactional unless the SQL contains explicit transaction control statements.

The only reason tern used transactions to begin with is to ensure that the migration and the update of the version table is atomic. But what if tern appended the update version table statement to the SQL of the migration before sending it all via the simple query protocol. The whole thing would be implicitly transactional, but migrations could still use begin and commit.

The only concern I have is if there are any SQL migration that somehow could be broken by appending this to it:

; -- in case migration SQL did not terminate last statement (legal in simple query protocol)
reset all;
update schema_version set version=42; -- this line would be generated to put in the table and version

I can't think of any way it could break, but I'd like to get some feedback on the idea.

SQL error should include detail

ERROR:  new row for relation "sometable" violates check constraint "check_something"
DETAIL:  Failing row contains (1151, 0, 2019-07-18 11:36:26.853391-05, 2019-07-18 11:36:26.853391-05).

instead of

ERROR:  new row for relation "sometable" violates check constraint "check_something"

[Question] pg_advisory_lock not supported by CockroachDB

Greetings,

This is my favorite migration tool. However, I just started a new project with CockroachDB, and on migration I get a warning

Error initializing migrator:
  ERROR: unknown function: pg_advisory_lock() (SQLSTATE 42883)

It looks like there are a couple places where the function is used in migrate.go. Is it your opinion that this lock function is critical to the library? It seems like cockroach always uses SERIALIZABLE isolation.

pre-/post-migration hooks from Go package

We'd like the ability to run arbitrary Go code as part of (before or after) a migration. One key point is that we want to be able to use the same Tx as the rest of the sql, so that the entire change is atomic.

I'd be happy to speculate about API and potentially contribute code, but I wanted to check in about the idea first.

Note that this could be a building block for #47, particularly if combined with https://github.com/mvdan/sh (which has the downside of being quite a large dependency).

Thanks!

Add timestamp column to version_table

Adding such column can be useful to track when the migration was run.

Race condition with initial migration

tern/migrate/migrate.go

Line 83 in 1ae9188

err = m.ensureSchemaVersionTableExists(ctx)

is run without locking, which means that

tern/migrate/migrate.go

Lines 372 to 377 in 1ae9188

 _, err = m.conn.Exec(ctx, fmt.Sprintf(` 

  create table if not exists %s(version int4 not null); 

  insert into %s(version) 

  select 0 

  where 0=(select count(*) from %s);

can fail if multiple migrations are running at the same time.

Here's an example log where three Go services are all running migration for an empty database:

2020-03-03 12:33:43.974 UTC [72] ERROR:  relation "version" does not exist at character 21
2020-03-03 12:33:43.974 UTC [72] STATEMENT:  select version from version
2020-03-03 12:33:43.980 UTC [74] ERROR:  relation "version" does not exist at character 21
2020-03-03 12:33:43.980 UTC [74] STATEMENT:  select version from version
2020-03-03 12:33:43.984 UTC [74] ERROR:  duplicate key value violates unique constraint "pg_type_typname_nsp_index"
2020-03-03 12:33:43.984 UTC [74] DETAIL:  Key (typname, typnamespace)=(version, 2200) already exists.
2020-03-03 12:33:43.984 UTC [74] STATEMENT:
create table if not exists version(version int4 not null);
|  
| insert into version(version)
| select 0
| where 0=(select count(*) from version);

for the caller it seems like the migration is failing, even though one of the migrations is running and can succeed.

I would expect that this part is also under the migration advisory lock. Or that the error can be ignored by the caller, e.g. using a specific error value.

Tern uses out of date version of pgx

Tern is currently using v3 of pgx. This is mostly a problem because it makes it impossible to use the migrate library in conjunction with an up to date version of pgx. Is there any reason not to quickly update this to v4?

Merging a branch with duplicate migration numbers

It is inconvenient to merge a branch when there are migrations with duplicate numbers. Rails resolved this by moving to timestamp migrations. I dislike this approach because it makes the order migrations are applied non-deterministic. However, it would be nice to have a simpler way to resolve this than manually renaming the offending files.

Broadly speaking there are two directions to go with this.

Integrate with version control to determine which files come from which branch. On the plus side it could probably resolve a whole set of migrations with minimal human involvement. On the down side this seems complicated, may add dependencies, and needs to be built for each VCS. It also wouldn't help if duplicate numbers were introduced somewhere other than in a merge.
An interactive renumber mode. Something like git add --patch where it shows each conflict and asks which one comes first.

Support TLS connections

Shell script migrations

It might be interesting for tern to have migrations that are simply shell scripts that can do arbitrary tasks. tern could act as a proxy to the PostgreSQL server and expose a pseudo connection via the PG* or DATABASE_URL environment variables. Then a migration could have any logic written in any language (even just using psql could be really valuable).

Security vulnerability detected in current stable version

Hello!

Our security scanners ring because of outdated golang.org/x/text/language in the binary. Would it be possible to bump golang.org/x/text to 0.3.7?

Thanks for this great tool and have a good day.

Expose CLI as an importable package

I want to embed migration logic into my application so I don't need to install tern into the container separately.

I can do that by importing github.com/jackc/tern/migrate and creating a new migrator like is done in the tern CLI:

tern/main.go

Line 411 in 7208d82

migrator, err := migrate.NewMigrator(ctx, conn, config.VersionTable)

However, I don't really need to customize that process in any way. For my purposes I'd be just copying most of func Migrate from that file.

It would be a nicer alternative if I could do this:

args := [0]string{}
tern.Migrate(cobra.Command{}, args)

In fact, my use case would be to just pass the args from my own cobra command which is powering my CLI:

func RunMigrate(cmd *cobra.Command, args []string) {
	tern.Migrate(cmd, args)
}

So I could run my-command migrate ... which would just run tern.

Would you consider supporting this and/or approving a PR?

HOW use default param in template term.conf

How to use {{env "DB_PASSWORD" "MY_DEFAULT_PASSWORD"}}

Ignored `port` from config in Windows

Windows 11 Pro N, 21H2
Tern: 1.12.5
PostgreSQL 13.4 in Docker for Windows.
Docker uses WSL2 as a backend.
WSL OS: Fedora Server 34.
Port mapping in docker enabled: 5432 container -> 5431 host.

I built two different binaries: for unix:amd/64, windows:amd/64.
If I run an unix's binary it's OK. It can connect, do necessary work, etc.
If I run window's binary I getting an error "Failed to connect" and I see a default pg port: 5432, not those which is specified in config.

Output:

iyuryevich@ElevenStation ~/Development/___/migrations/ru (master+*?) $ tern.exe status
Unable to connect to PostgreSQL:
  failed to connect to `host=127.0.0.1 user=test1 database=test_dev_01`: dial error (dial tcp 127.0.0.1:5432: connectex: No connection could be made because the target machine actively refused it.)
iyuryevich@ElevenStation ~/Development/___/migrations/ru (master+*?) $ tern status
status:   up to date
version:  4 of 4
host:     127.0.0.1
database: test_dev_01
iyuryevich@ElevenStation ~/Development/___/migrations/ru (master+*?) $ tern.exe version
tern v1.12.5
iyuryevich@ElevenStation ~/Development/___/migrations/ru (master+*?) $ tern version
tern v1.12.5
iyuryevich@ElevenStation ~/Development/___/migrations/ru (master+*?) $

tern.conf

[database]
host = 127.0.0.1
port = 5431
database = test_dev_01
user = test1
password = root
version_table = schema_version

Add migrations path to config

My use case: I always run tern <command> --migrations ./migrations ... so the config is in the root of the project and migrations are in ./migrations.
So I would like to omit --migrations ./migrations when calling commands. What if we can have it in the config?

New release with version using pgx v5?

Hi @jackc,

Any plans on releasing a new version soon? Thanks!

btw, I've tested it on my project (using the API rather than the CLI), and it worked fine!

go-bindata support

I am embedding migrations into the product I am working on. I could do with a go-bindata support in the Migrator. go-bindata generated files have no dependencies -- which fits with pgx / tern.

Use tern/migrate with pgx v4 and pgxpool?

Is it possible to use tern/migrate to embed migration functionality into an application that uses pgx/v4 and pgxpool?

Pre-built binaries

Right now tern is installable via the go toolkit (go install). It would be good to have trusted pre-built binaries for a few platforms e.g. amd64, arm64 hosted on this repo via Github. This will allow people to fetch the binaries into their workflows or deployment/release cycle e.g. Docker images or Kubernetes init container.

I can send in a PR for this. Starting with amd64 with the following build options:

flags: -s -w
env: CGO_ENABLED=0

Let me know if this is within the scope of this repo.

Related: #56

V2 Release?

Are there any plans to release the v2-experiment branch as a new version of tern? Opening up this issue just to get some information.

https://github.com/jackc/tern/tree/v2-experiment

If there's any blocking work for release, I can probably make some time to help as well.

#29 (comment)

	_, err = m.conn.Exec(ctx, fmt.Sprintf(`
	create table if not exists %s(version int4 not null);

	insert into %s(version)
	select 0
	where 0=(select count(*) from %s);

jackc / tern Goto Github PK

tern's People

Contributors

Stargazers

Watchers

Forkers

tern's Issues

Example

System: macOS Monterey version 12.4 tern version: 1.13.0

Recommend Projects

Recommend Topics

Recommend Org

System: macOS Monterey version 12.4
tern version: 1.13.0