cdlsoftware / pg-mv-fast-refresh Goto Github PK
View Code? Open in Web Editor NEWpostgres MV fast refresh module
License: MIT License
postgres MV fast refresh module
License: MIT License
We’ve notice some problems with the drops of the MV’s using the functions and also dropping the MV logs did you test this out? We’re making these all test points for our pipeline so in future all this will be tested for any code changes.
So with the MV’s they drop ok but the numbers in the array field for the pg$mview_logs table look very wrong, check the output below I create 66 MVs and then drop them see the output from the pg$mview_logs.
-bash-4.1$ psql --host=$HOSTNAME --port=$PORT --username=$MVUSERNAME --dbname=$DBNAME
psql (9.5.12, server 11.5)
WARNING: psql major version 9.5, server major version 11.
Some psql features might not work.
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
Type "help" for help.
postgres=> select * from testpoc."pg$mview_logs"
;
owner | pglog$_name | table_name | trigger_name | pg_mview_bitmap
-------------+-------------+------------+--------------+-----------------
testpocdata | log$_test1 | test1 | trig$_test1 | {0}
testpocdata | log$_test2 | test2 | trig$_test2 | {0}
testpocdata | log$_test3 | test3 | trig$_test3 | {0}
testpocdata | log$_test4 | test4 | trig$_test4 | {0}
testpocdata | log$_test5 | test5 | trig$_test5 | {0}
testpocdata | log$_test6 | test6 | trig$_test6 | {0}
(6 rows)
postgres=> DO
postgres->
postgres$> DECLARE
postgres$> tStartTime TIMESTAMP := clock_timestamp();
postgres$> cResult CHAR(1) := NULL;
postgres$> iTableCounter SMALLINT := 10;
postgres$> pSqlStatement TEXT;
postgres$> BEGIN
postgres$> pSqlStatement := 'SELECT test1.id test1_id,
postgres$> test1.lookup_id test1_lookup_id,
postgres$> test1.code test1_code,
postgres$> test2.id test2_id,
postgres$> test2.description test2_desc,
postgres$> test2.metavals_id test2_metavals_id,
postgres$> test2.age test2_age,
postgres$> test3.lookup_id test3_lookup_id,
postgres$> test3.lookup_code test3_lookup_code,
postgres$> test3.lookup_description test3_lookup_desc,
postgres$> test4.metavals_id test4_metavals_id,
postgres$> test4.code test4_code,
postgres$> test4.description test4_desc,
postgres$> test5.id test5_id,
postgres$> test5.rep_ind test5_rep_ind,
postgres$> test5.trans_id test5_trans_id,
postgres$> test6.trans_id test6_trans_id,
postgres$> test6.payment_reference test6_payment_ref
postgres$> FROM
postgres$> test1
postgres$> INNER JOIN test2 ON test1.id = test2.id
postgres$> LEFT JOIN test3 ON test1.lookup_id = test3.lookup_id
postgres$> LEFT JOIN test4 ON test2.metavals_id = test4.metavals_id
postgres$> INNER JOIN test5 ON test1.id = test5.id
postgres$> LEFT JOIN test6 ON test5.trans_id = test6.trans_id';
postgres$>
postgres$> FOR iTableCounter IN 10 .. 75
postgres$> LOOP
postgres$> cResult := mv$createMaterializedView
postgres$> (
postgres$> pViewName => 'mvtesting' || iTableCounter,
postgres$> pSelectStatement => pSqlStatement,
postgres$> pOwner => 'testpocview',
postgres$> pFastRefresh => TRUE
postgres$> );
postgres$> END LOOP;
postgres$>
postgres$> RAISE NOTICE 'Simple Snapshot Creation took %', clock_timestamp() - tStartTime;
postgres$> END
postgres$>
NOTICE: Simple Snapshot Creation took 00:00:04.714461
CONTEXT: PL/pgSQL function inline_code_block line 45 at RAISE
DO
postgres=> select * from testpoc."pg$mview_logs"
;
owner | pglog$_name | table_name | trigger_name | pg_mview_bitmap
-------------+-------------+------------+--------------+-------------------------
testpocdata | log$_test1 | test1 | trig$_test1 | {9223372036854775807,7}
testpocdata | log$_test2 | test2 | trig$_test2 | {9223372036854775807,7}
testpocdata | log$_test3 | test3 | trig$_test3 | {9223372036854775807,7}
testpocdata | log$_test4 | test4 | trig$_test4 | {9223372036854775807,7}
testpocdata | log$_test5 | test5 | trig$_test5 | {9223372036854775807,7}
testpocdata | log$_test6 | test6 | trig$_test6 | {9223372036854775807,7}
(6 rows)
postgres=> DO
postgres->
postgres$> DECLARE
postgres$> cResult CHAR(1) := NULL;
postgres$> iTableCounter SMALLINT := 10;
postgres$> BEGIN
postgres$> FOR iTableCounter IN 10 .. 75
postgres$> LOOP
postgres$> cResult := mv$removeMaterializedView( 'mvtesting' || iTableCounter, 'testpocview' );
postgres$> END LOOP;
postgres$> END
postgres$>
DO
postgres=> select * from testpoc."pg$mview_logs"
;
owner | pglog$_name | table_name | trigger_name | pg_mview_bitmap
-------------+-------------+------------+--------------+---------------------------
testpocdata | log$_test1 | test1 | trig$_test1 | {6917529027641081856,-10}
testpocdata | log$_test2 | test2 | trig$_test2 | {6917529027641081856,-10}
testpocdata | log$_test3 | test3 | trig$_test3 | {6917529027641081856,-10}
testpocdata | log$_test4 | test4 | trig$_test4 | {6917529027641081856,-10}
testpocdata | log$_test5 | test5 | trig$_test5 | {6917529027641081856,-10}
testpocdata | log$_test6 | test6 | trig$_test6 | {6917529027641081856,-10}
(6 rows)
postgres=>
Now dropping the logs I would not expect them to drop when we have these wild numbers but I’ve just tried creating the logs with no MV’s and then tried dropping the logs and we get errors see below.
postgres=> select * from testpoc."pg$mview_logs"
postgres-> ;
owner | pglog$_name | table_name | trigger_name | pg_mview_bitmap
-------------+-------------+------------+--------------+-----------------
testpocdata | log$_test1 | test1 | trig$_test1 | {0}
testpocdata | log$_test2 | test2 | trig$_test2 | {0}
testpocdata | log$_test3 | test3 | trig$_test3 | {0}
testpocdata | log$_test4 | test4 | trig$_test4 | {0}
testpocdata | log$_test5 | test5 | trig$_test5 | {0}
testpocdata | log$_test6 | test6 | trig$_test6 | {0}
(6 rows)
postgres=> DO
DECLARE
cResult CHAR(1) := NULL;
BEGIN
cResult := mv$removeMaterializedViewLog( 'test1', 'testpocdata' );
cResult := mv$removeMaterializedViewLog( 'test2', 'testpocdata' );
cResult := mv$removeMaterializedViewLog( 'test3', 'testpocdata' );
cResult := mv$removeMaterializedViewLog( 'test4', 'testpocdata' );
cResult := mv$removeMaterializedViewLog( 'test5', 'testpocdata' );
cResult := mv$removeMaterializedViewLog( 'test6', 'testpocdata' );
END
INFO: Exception in function mv$removeMaterializedViewLog
CONTEXT: PL/pgSQL function "mv$removematerializedviewlog"(text,text) line 60 at RAISE
PL/pgSQL function inline_code_block line 5 at assignment
INFO: Error 42883:- operator does not exist: bigint[] = smallint:
CONTEXT: PL/pgSQL function "mv$removematerializedviewlog"(text,text) line 61 at RAISE
PL/pgSQL function inline_code_block line 5 at assignment
ERROR: 42883
CONTEXT: PL/pgSQL function "mv$removematerializedviewlog"(text,text) line 62 at RAISE
PL/pgSQL function inline_code_block line 5 at assignment
Hi,
I'm not able to see creatematerialisedview under functions after running runCreateFastRefreshModule.sh.
Log file is attached:
fast_refresh_module_install_20240124-1608.log
Following are the functions that are created:
Hello,
I have encountered two bugs when using this module.
If the underlying tables are partitioned, or the MV is a group by query, the proccess return an error and VM will be no create
Thanks&Regards
INFO: Exception in function mv$insertOuterJoinRows
INFO: Error 22004:- query string argument of EXECUTE is null:
INFO: Error Context:
INFO: Exception in function mv$executeMVFastRefresh
INFO: Error P0001:- 22004:
INFO: Exception in function mv$refreshMaterializedViewFast
INFO: Error P0001:- P0001:
INFO: Error Context:
SELECT m_row$,sequence$,dmltype$ FROM comfort0_live.log$_test_table WHERE bitmap$[1] & 32768 = 32768 ORDER BY sequence$
INFO: Exception in function mv$refreshMaterializedViewFull
INFO: Error P0001:- P0001:
ERROR: P0001
Root Cause:-
The defect is down to the function mv$checkIfOuterJoinedTable - this checks the table array value whether it exists in the outer join array. This does not consider if the table is part of an inner join condition and outer join condition. So, what's happening is when we've got both in a query it's treating the inner join table as an outer join.
Solution:
Changed the function to pass the outer join array value not full array and confirm if the table value matches the outer join position value which should always be the case based on the the outer join array only being populated with a value if it's an outer join table if not there is a value of null which means this is an inner join table.
During the fast refresh process i am seeing the following error.
The below is based on a simplified scenario example of what i'm seeing on my database.
CONTEXT: PL/pgSQL function inline_code_block line 12 at RAISE
ERROR: duplicate key value violates unique constraint "pk_test_17"
DETAIL: Key (top_id)=(1234567) already exists.
After adding additional debugging into the fast refresh module. The mv$executeMVFastRefresh function which is used to apply the log changes to the mview table is failing when executing the following routine.
Type of log change: DML TYPE = INSERT for outer join rows
Calls function mv$insertouterjoinrows
DELETE FROM dd_test.mv_test WHERE c_m_row$ IN ( SELECT c.m_row$ FROM a_tab a
LEFT JOIN b_tab b ON a.top_id = b.top_id
LEFT JOIN c_tab c ON a.medium_id = c.medium_id
LEFT JOIN d_tab d ON a.low_id= d.low_id
WHERE
d.m_row$ IN ( SELECT UNNEST($1)))
This considers the previous join INSERT and remove the rows based on the previous c_m_row$ column.
INSERT INTO dd_test.mv_test ( cols..... )
SELECT cols ...
FROM a_tab a
LEFT JOIN b_tab b ON a.top_id = b.top_id
LEFT JOIN c_tab c ON a.medium_id = c.medium_id
LEFT JOIN d_tab d ON a.low_id= d.low_id
WHERE
d.m_row$ IN ( SELECT UNNEST($1))
When the INSERT runs we get this error:-
CONTEXT: PL/pgSQL function inline_code_block line 12 at RAISE
ERROR: duplicate key value violates unique constraint "pk_mv_test_17"
DETAIL: Key (top_id)=(1234567) already exists.
Conclusion:
In the function description it states
When inserting data into a complex materialized view, it is possible that a previous insert has already inserted the row that we are about to insert if that row is the subject of an outer join or is a parent of multiple new rows. When applying updates to the materialized view it is possible that the row being updated has subsequently been deleted, so before we can apply an update we have to ensure that the base row still exists. So to remove the possibility of duplicate rows we have to look to see if this situation has occurred.
So, at the moment we only consider the previous join condition m_row$ uuid's and do not consider other inserts may have already inserted the same row or multiple rows doing this process.
If your materialized view is linked to a composite primary key - this potentially could hit some refresh issues. Logic specific to our business needs has been added to handle this scenario for certain materialized views. This has been labelled CDL Specific in the comments if you'd like to review what has been done.
This will needs reviewing in full for a permanent solution. 95% of our materialized reviews don't use composite primary keys.
Hello,
I'm trying to create different mv which uses a common base table.
DO
INFO: Exception in function mv$addRow$ToMv$Table
INFO: Error 42701:- column "on_m_row$" of relation "mv_merchants_by_bd" already exists:
INFO: Error Context:
ALTER TABLE spcdsview.mv_merchants_by_bd
ADD COLUMN on_m_row$ UUID
ERROR: 42701
CONTEXT: PL/pgSQL function "mv$addrow$tomv$table"("mv$allconstants",text,text,text[],text[],text,text) line 66 at RAISE
SQL statement "SELECT
pViewColumns,
pSelectColumns
FROM
mv$addRow$ToMv$Table
(
rConst,
pOwner,
pViewName,
tAliasArray,
tRowidArray,
tViewColumns,
tSelectColumns
)"
PL/pgSQL function "mv$creatematerializedview"(text,text,text,text,text,boolean) line 111 at SQL statement
PL/pgSQL function inline_code_block line 25 at assignment
Is it possible to have more than one view referencing to the same table in current implementation?
Thanks
Deadlock error occurs when multiple mview refreshes against the same log is trying to clear the same mview log table bits. When there is a larger backlog of data to clear we've seen this deadlock issue on small occasions.
This is the error:-
INFO: Exception in function mv$clearPgMvLogTableBits
INFO: Error 40P01:- deadlock detected:
INFO: Error Context:
UPDATE test_schema.log$_test SET bitmap$[1] = bitmap$[1] - 2 WHERE sequence$ IN ( SELECT sequence$ FROM test_schema.log$_test WHERE bitmap$[1] & 2 = 2 AND sequence$ <= 2293)
INFO: Exception in function mv$refreshMaterializedViewFast
INFO: Error P0001:- 40P01:
INFO: Error Context:
SELECT m_row$,sequence$,dmltype$ FROM comfort0_live.log$_prtyinst WHERE bitmap$[1] & 2 = 2 ORDER BY sequence$
INFO: Exception in function mv$refreshMaterializedViewFull
INFO: Error P0001:- P0001:
ERROR: P0001
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.