Sequel PostgreSQL Triggers

Sequel PostgreSQL Triggers is a small enhancement to Sequel allowing a user to easily handle the following types of columns:

  • Timestamp Columns (Created At/Updated At)

  • Counter/Sum Caches

  • Immutable Columns

  • Touch Propagation

  • Foreign Key Arrays (Referential Integrity Checks)

It handles these internally to the database via triggers, so even if other applications access the database (without using Sequel), things will still work (unless the database superuser disables triggers).

To use this, load the pg_triggers extension into the Sequel::Database object:

DB.extension :pg_triggers

Then you can call the pgt_* methods it adds on your Sequel::Database object:

DB.pgt_created_at(:table_name, :created_at)

Most commonly, this is used in migrations, with a structure similar to:

Sequel.migration do
  up do
    extension :pg_triggers

    pgt_created_at(:table_name,
                   :created_at,
                   :function_name=>:table_name_set_created_at,
                   :trigger_name=>:set_created_at)
  end

  down do
    drop_trigger(:table_name, :set_created_at)
    drop_function(:table_name_set_created_at)
  end
end

Note that you only need to load this extension when defining the triggers; you don’t need to load it while your application is running.

To use any of these methods before PostgreSQL 9.0, you have to add the plpgsql procedural language to PostgreSQL, which you can do with:

DB.create_language(:plpgsql)

If you want to load this extension globally for all PostgreSQL databases, you can do:

require 'sequel_postgresql_triggers'

However, global modification is discouraged and only remains for backwards compatibility.

Triggers

All of the public methods this extension adds take the following options in their opts hash:

:function_name

The name of the function to use. This is important to specify if you want an easy way to drop the function.

:trigger_name

The name of the trigger to use. This is important to specify if you want an easy way to drop the trigger.

Created At Columns - pgt_created_at

pgt_created_at takes the given table and column and makes it so that upon insertion, the column is set to CURRENT_TIMESTAMP, and that upon update, the column’s value is always reset to its previous value. This is similar to an immutable column, but it doesn’t raise an error if you try to change the value; it just ignores the change.

Arguments:

table

name of table

column

column in table that should be a created at timestamp column

opts

option hash

Updated At Columns - pgt_updated_at

Similar to pgt_created_at, this takes a table and column and makes it so that upon insertion, the column is set to CURRENT_TIMESTAMP. It differs in that upon update, the column is also set to CURRENT_TIMESTAMP.

Arguments:

table

name of table

column

column in table that should be an updated at timestamp column

opts

options hash
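
For example, to maintain an updated_at column on a hypothetical albums table (table, column, and trigger names here are illustrative):

DB.pgt_updated_at(:albums,
                  :updated_at,
                  :function_name=>:albums_set_updated_at,
                  :trigger_name=>:set_updated_at)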

Counter Cache - pgt_counter_cache

This takes many arguments and sets up a counter cache so that when the counted table is inserted to or deleted from, records in the main table are updated with the count of the corresponding records in the counted table. The counter cache column must have a default of 0 for this to work correctly.

Use pgt_sum_cache with a Sequel expression in summed_column to handle any custom logic such as a counter cache that only counts certain rows.

Arguments:

main_table

name of table holding counter cache column

main_table_id_column

column in main table matching counted_table_id_column in counted_table

counter_column

column in main table containing the counter cache

counted_table

name of table being counted

counted_table_id_column

column in counted_table matching main_table_id_column in main_table

opts

options hash
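
For example, to keep an albums_count column on a hypothetical artists table in sync with the number of matching rows in an albums table:

DB.pgt_counter_cache(:artists,
                     :id,
                     :albums_count,
                     :albums,
                     :artist_id)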

Sum Cache - pgt_sum_cache

Similar to pgt_counter_cache, except instead of storing a count of records in the main table, it stores the sum of one of the columns in the summed table. The sum cache column must have a default of 0 for this to work correctly.

Use a Sequel expression in summed_column to handle any custom logic such as a counter cache that only counts certain rows, or a sum cache that sums the length of a string column.

Arguments:

main_table

name of table holding sum cache column

main_table_id_column

column in main table matching summed_table_id_column in summed_table

sum_column

column in main table containing the sum cache

summed_table

name of table being summed

summed_table_id_column

column in summed_table matching main_table_id_column in main_table

summed_column

column in summed_table being summed or a Sequel expression to be evaluated in the context of summed_table

opts

options hash
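
For example, to keep a total_length column on a hypothetical artists table in sync with the sum of the length column in an albums table:

DB.pgt_sum_cache(:artists,
                 :id,
                 :total_length,
                 :albums,
                 :artist_id,
                 :length)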

Sum Through Many Cache - pgt_sum_through_many_cache

Similar to pgt_sum_cache, except instead of a one-to-many relationship, it supports a many-to-many relationship with a single join table. The sum cache column must have a default of 0 for this to work correctly. Use a Sequel expression in summed_column to handle any custom logic. See pgt_sum_cache for details.

This takes a single options hash argument, supporting the following options in addition to the standard options:

:main_table

name of table holding sum cache column

:main_table_id_column

primary key column in main table referenced by main_table_fk_column (default: :id)

:sum_column

column in main table containing the sum cache, must be NOT NULL and default to 0

:summed_table

name of table being summed

:summed_table_id_column

primary key column in summed_table referenced by summed_table_fk_column (default: :id)

:summed_column

column in summed_table being summed or a Sequel expression to be evaluated in the context of summed_table, must be NOT NULL

:join_table

name of table which joins main_table with summed_table

:join_trigger_name

name of trigger for join table

:join_function_name

name of trigger function for join table

:main_table_fk_column

column in join_table referencing main_table_id_column, must be NOT NULL

:summed_table_fk_column

column in join_table referencing summed_table_id_column, must be NOT NULL
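
For example, assuming hypothetical artists and albums tables joined by an albums_artists join table (all names here are illustrative):

DB.pgt_sum_through_many_cache(
  :main_table=>:artists,
  :sum_column=>:total_length,
  :summed_table=>:albums,
  :summed_column=>:length,
  :join_table=>:albums_artists,
  :main_table_fk_column=>:artist_id,
  :summed_table_fk_column=>:album_id)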

Immutable Columns - pgt_immutable

This takes a table name and one or more column names, and adds an update trigger that raises an exception if you try to modify the value of any of the columns.

Arguments:

table

name of table

*columns

All columns in the table that should be immutable. Can end with options hash.
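
For example, to prevent updates to hypothetical created_at and original_name columns on an albums table:

DB.pgt_immutable(:albums, :created_at, :original_name)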

Touch Propagation - pgt_touch

This takes several arguments and sets up a trigger that watches one table for changes, and touches timestamps of related rows in a separate table.

Arguments:

main_table

name of table that is being watched for changes

touch_table

name of table that needs to be touched

column

name of timestamp column to be touched

expr

hash or array that represents the columns that define the relationship

opts

options hash
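
For example, to touch the updated_at column of the matching row in a hypothetical artists table whenever a row in an albums table changes, where artists.id matches albums.artist_id:

DB.pgt_touch(:albums, :artists, :updated_at, :id=>:artist_id)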

Foreign Key Arrays - pgt_foreign_key_array

This takes a single options hash, and sets up triggers on both tables involved. The table with the foreign key array has insert/update triggers to make sure newly inserted/updated rows reference valid rows in the referenced table. The table being referenced has update/delete triggers to make sure the value before update or delete is not still being referenced.

Note that this will not catch all referential integrity violations, but it should catch the most common ones.

Options:

:table

table with foreign key array

:column

foreign key array column

:referenced_table

table referenced by foreign key array

:referenced_column

column referenced by foreign key array (generally primary key)

:referenced_function_name

function name for trigger function on referenced table

:referenced_trigger_name

trigger name for referenced table
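
For example, assuming a hypothetical albums table with a tag_ids integer array column referencing the id column of a tags table:

DB.pgt_foreign_key_array(
  :table=>:albums,
  :column=>:tag_ids,
  :referenced_table=>:tags,
  :referenced_column=>:id)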

Force Defaults - pgt_force_defaults

This takes two arguments, a table and a hash of column default values, and sets up an insert trigger that will override user-submitted or database default values and use the values given when setting up the trigger. This is mostly useful in situations where multiple database accounts are used, one account has insert permissions but not update permissions, and you want to ensure that inserted rows have specific column values to enforce security requirements.

Arguments:

table

The name of the table

defaults

A hash of default values to enforce, where keys are column names and values are the default values to enforce
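
For example, to force every row inserted into a hypothetical albums table to have account_id set to 1, regardless of the value submitted:

DB.pgt_force_defaults(:albums, :account_id=>1)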

JSON Audit Logging - pgt_json_audit_log_setup and pgt_json_audit_log

These methods set up audit logging, where updates and deletes log the previous row values to a central auditing table in JSON format.

pgt_json_audit_log_setup

This creates an audit table and a trigger function that will log previous values to the audit table. This returns the name of the trigger function created, which should be passed to pgt_json_audit_log.

Arguments:

table

The name of the table storing the audit logs.

Options:

function_opts

Options to pass to create_function when creating the trigger function.

The audit log table will store the following columns:

txid

The 64-bit transaction ID for the transaction that made the modification (txid_current())

at

The timestamp of the transaction that made the modification (CURRENT_TIMESTAMP)

user

The database user name that made the modification (CURRENT_USER)

schema

The schema containing the table that was modified (TG_TABLE_SCHEMA)

table

The table that was modified (TG_TABLE_NAME)

action

The type of modification, either DELETE or UPDATE (TG_OP)

prior

A jsonb column with the contents of the row before the modification (to_jsonb(OLD))

pgt_json_audit_log

This adds a trigger to the table that will log previous values to the auditing table for updates and deletes.

Arguments:

table

The name of the table to audit

function

The name of the trigger function to call to log changes
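
For example, to log changes to a hypothetical albums table to an audit_logs table:

function_name = DB.pgt_json_audit_log_setup(:audit_logs)
DB.pgt_json_audit_log(:albums, function_name)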

Note that it is probably a bad idea to use the same table argument to both pgt_json_audit_log_setup and pgt_json_audit_log.

Caveats

If you have defined counter or sum cache triggers using this library before version 1.6.0, you should drop them and regenerate them if you want the triggers to work correctly with queries that use INSERT ... ON CONFLICT DO NOTHING.

When restoring a data-only dump with pg_dump, you may need to use --disable-triggers for it to restore correctly, and you will need to manually enforce data integrity if you are doing partial restores rather than full restores.
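
For example, a data-only restore that skips trigger execution might look like this (database names here are illustrative):

pg_dump -U postgres -a --disable-triggers db_name | psql -U postgres other_db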

License

This library is released under the MIT License. See the MIT-LICENSE file for details.

Author

Jeremy Evans <[email protected]>

Issues

Feature: filtering the (counted|summed)_table dataset

Here's a situation we'd love to be able to handle:

Counter cache of all unread notifications belonging to a user.

The first DSL that comes to mind is options named :where and :exclude which accept the same arguments as the dataset methods. Open to better ideas.

pgt_counter_cache(:users,
                  :id,
                  :unread_notifications_count,
                  :notifications,
                  :user_id,
                  :where => { :read_at => nil })
pgt_counter_cache(:users,
                  :id,
                  :unread_notifications_count,
                  :notifications,
                  :user_id,
                  :where => ["status = ?", "unread"])
pgt_counter_cache(:users,
                  :id,
                  :unread_notifications_count,
                  :notifications,
                  :user_id,
                  :exclude => proc { view_count > 0 })
pgt_counter_cache(:users,
                  :id,
                  :unread_notifications_count,
                  :notifications,
                  :user_id,
                  :where => Sequel.like(:status, "unread"))

Trigger implementation might look something like this, although populating those variables may be tricky:

# :invert option in the spirit of `add_filter`
# https://github.com/jeremyevans/sequel/blob/8852a359f1f20d4c3d6f9c62c150b96ea01a4511/lib/sequel/dataset/query.rb#L1218

if where = opts[:where]
  new_match_filters = qualify_for_new_table(where)
  old_match_filters = qualify_for_old_table(where)
  not_new_match_filters = qualify_for_new_table(where, :invert => true)
  not_old_match_filters = qualify_for_old_table(where, :invert => true)
elsif exclude = opts[:exclude]
  new_match_filters = qualify_for_new_table(exclude, :invert => true)
  old_match_filters = qualify_for_old_table(exclude, :invert => true)
  not_new_match_filters = qualify_for_new_table(exclude)
  not_old_match_filters = qualify_for_old_table(exclude)
end
BEGIN
  IF (TG_OP = 'UPDATE' AND NEW.#{id_column} = OLD.#{id_column}) THEN
    IF ((#{new_match_filters} AND #{old_match_filters}) OR (#{not_new_match_filters} AND #{not_old_match_filters})) THEN
      RETURN NEW;
    ELSIF (#{old_match_filters} AND #{not_new_match_filters}) THEN
      UPDATE #{table} SET #{count_column} = #{count_column} - 1 WHERE #{main_column} = OLD.#{id_column};
    ELSIF (#{new_match_filters} AND #{not_old_match_filters}) THEN
      UPDATE #{table} SET #{count_column} = #{count_column} + 1 WHERE #{main_column} = NEW.#{id_column};
    END IF;
  ELSE
    IF (OLD.#{id_column} IS NULL AND NEW.#{id_column} IS NULL) THEN
      RETURN NEW;
    END IF;
    IF ((TG_OP = 'INSERT' OR TG_OP = 'UPDATE') AND NEW.#{id_column} IS NOT NULL AND #{new_match_filters}) THEN
      UPDATE #{table} SET #{count_column} = #{count_column} + 1 WHERE #{main_column} = NEW.#{id_column};
    END IF;
    IF ((TG_OP = 'DELETE' OR TG_OP = 'UPDATE') AND OLD.#{id_column} IS NOT NULL AND #{old_match_filters}) THEN
      UPDATE #{table} SET #{count_column} = #{count_column} - 1 WHERE #{main_column} = OLD.#{id_column};
    END IF;
  END IF;
  IF (TG_OP = 'DELETE') THEN
    RETURN OLD;
  END IF;
  RETURN NEW;
END;

What do you think? I have no problem taking a stab at it if you think it's valuable and would work.

The triggers are not pg_dump-friendly

docker-compose.yml:

services:
  app:
    build: .
    command: sleep infinity
    init: true
    volumes:
      - .:/app
  db:
    image: postgres:15.3-alpine3.18
    environment:
      POSTGRES_HOST_AUTH_METHOD: trust
  db2:
    image: postgres:15.3-alpine3.18
    environment:
      POSTGRES_HOST_AUTH_METHOD: trust

Dockerfile:

FROM ruby:3.2.2-alpine3.18
RUN apk add build-base postgresql15-dev
WORKDIR /app
COPY Gemfile .
RUN bundle install

Gemfile:

source "https://rubygems.org"
gem 'sequel', '5.70.0'
gem 'pg'
gem 'sequel_postgresql_triggers', '1.5.0'

migrations/1_create_tables.rb:

Sequel.migration do
  change do
    create_table(:users) do
      primary_key :id
      String :name
    end
    create_table(:images) do
      primary_key :id
      foreign_key :user_id, :users
      String :filename
      TrueClass :deleted
    end
  end
end

migrations/2_add_trigger.rb:

Sequel.migration do
  up do
    extension :pg_triggers
    add_column :users, :image_count, :Integer, default: 0
    pgt_sum_cache(:users,
                  :id,
                  :image_count,
                  :images,
                  :user_id,
                  Sequel.case({ deleted: 0 }, 1),
                  function_name: :users_set_image_count,
                  trigger_name: :set_image_count)
  end
  down do
    drop_trigger(:users, :set_image_count)
    drop_function(:users_set_image_count)
    drop_column :users, :image_count
  end
end
$ docker-compose up -d
$ docker-compose exec app sequel -m migrations postgres://postgres@db
$ docker-compose exec db psql -U postgres \
    -c "insert into users (name) values ('John')"
$ docker-compose exec db psql -U postgres \
    -c "insert into images (user_id, filename) values (1, 'a.png')"
$ docker-compose exec -T db pg_dump -U postgres -s \
    | docker-compose exec -T db2 psql -U postgres
$ docker-compose exec -T db pg_dump -U postgres -a \
    | docker-compose exec -T db2 psql -U postgres
SET
SET
SET
SET
SET
 set_config 
------------
 
(1 row)

SET
SET
SET
SET
COPY 1
ERROR:  relation "users" does not exist
LINE 1: UPDATE "users" SET "image_count" = "image_count" + (CASE WHE...
               ^
QUERY:  UPDATE "users" SET "image_count" = "image_count" + (CASE WHEN NEW."deleted" THEN 0 ELSE 1 END) WHERE "id" = NEW."user_id"
CONTEXT:  PL/pgSQL function public.users_set_image_count() line 6 at SQL statement
COPY images, line 1: "1	1	a.png	\N"
COPY 1
 setval 
--------
      1
(1 row)

 setval 
--------
      1
(1 row)

If I migrate schema and data in one go, it succeeds. Why do I migrate the schema first? Because often it's already there, since migrations have been applied, and what's left is to copy the data.

Why does it fail? Apparently because pg_dump sets search_path to '':

$ docker-compose exec -T db pg_dump -U postgres -s
--
-- PostgreSQL database dump
--

-- Dumped from database version 15.3
-- Dumped by pg_dump version 15.3

SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);

Should it be fixed? I'm not sure. On one hand, you probably don't want to execute triggers when copying data (and as such want to use --disable-triggers, with which it succeeds). But if you copy partial data, you do. And the fact that it fails can remind you about this, although the message is not very revealing. But then, the best you can achieve in terms of partial imports with pg_dump is probably to skip importing some tables, and then triggers won't help.

On a side note, I wonder if drop_(trigger|function) can be avoided (i.e., a change migration). Also, with an expression (Sequel.case({ deleted: 0 }, 1)), it fails to autogenerate the function name.

Improve documentation with a list of steps to get started

This project could use a "Quick Start" guide in the readme. At this point, it's not very obvious how to use it - I can guess that you include the gem in your Gemfile, but then what? Is pgt_created_at a schema modification method that you can call during migrations?

I can probably figure out how to use it by reading through the source code and seeing what methods are defined, but if you make it more clear what to do with this gem in the documentation, you would probably get more people using and contributing to it. Just my two cents :)

pgt_created_at overriding created_at timestamp for tests

I'm running into an issue where I'm trying to create a test that ignores all models created more than a day ago. The issue that I am having is that I cannot override the created_at timestamp to be anything other than the current time.

For example:

ModelName.create(
      user_id: 1,
      created_at: 1.year.ago
    )

will contain a created_at timestamp of the current time. And time freezers like Timecop don't work, since the timestamp is set at the database level.

Is there anything I can do to be able to pass in a created_at timestamp?
