Comments (9)
Interesting.
What pg_sample does is copy a subset of data from each table into temp tables under the "sample_schema". If then continues to recursively add data to them to satisfy any FK constraints. Then the result is written to stdout as a dump under the original table names.
How are the permissions set? Is it possible to only allow writing to a specific schema?
from pg_sample.
It seems I only have SELECT permissions on this particular DB unfortunately.
Do you have any alternative to this by any chance? The script is more or less fantastic for my use case but just have this one hurdle.
from pg_sample.
I'm not sure.
There are ways to do cross-database queries, apparently. I've never tried any of them:
https://stackoverflow.com/questions/46324/possible-to-perform-cross-database-queries-with-postgresql
So if we could run all the queries on the read-only but write them to a separate writable database then that might work, but I'm sure would require some changes. Would need to experiment with it. I'll try to read up a bit on the options if you think that's a possible route?
from pg_sample.
That seems like an awesome solution!
I think it would be really helpful for not only me but anyone else who comes across a similar issue with restrictions.
I'm also going to look into this on my end. 👍
Thank you
from pg_sample.
Ok, I think I have something kinda sorta working.
I created the foreign db connection with postgres_fdw. I connected the "public" remote schema to a local schema. The way the existing code iterates over the tables doesn't bring back the foreign tables, so you need this patch:
diff --git a/pg_sample b/pg_sample
index 07cb37d..02ca042 100755
--- a/pg_sample
+++ b/pg_sample
@@ -506,16 +506,23 @@ notice "[limit] $_->[0] = $_->[1]\n" foreach @limits;
# more than --limit rows initially.
my @tables;
my %sample_tables; # real table name -> sample table name
-my $sth = $dbh->table_info(undef, undef, undef, 'TABLE');
+my $sth = $dbh->prepare(q{
+ SELECT *
+ FROM information_schema.tables
+ WHERE table_type NOT IN ('VIEW')
+});
+$sth->execute;
while (my $row = lower_keys($sth->fetchrow_hashref)) {
- next unless uc $row->{table_type} eq 'TABLE'; # skip SYSTEM TABLE values
- next if $row->{table_schem} eq 'information_schema'; # special pg schema
- next if $opt{schema} && $row->{table_schem} ne $opt{schema};
+ #use Data::Dumper;
+ #warn Dumper($row), "\n";
+ # next unless uc $row->{table_type} eq 'TABLE'; # skip SYSTEM TABLE values
+ next if $row->{table_schema} eq 'pg_catalog'; # special pg schema
+ next if $opt{schema} && $row->{table_schema} ne $opt{schema};
- my $sname = $row->{pg_schema} || unquote_identifier($row->{TABLE_SCHEM})
+ my $sname = unquote_identifier($row->{table_schema})
or die "no pg_schema or TABLE_SCHEM value?!";
- my $tname = $row->{pg_table} || unquote_identifier($row->{TABLE_NAME})
+ my $tname = unquote_identifier($row->{table_name})
or die "no pg_table or TABLE_NAME value?!";
In terms of connecting the foreign db, this is what I did (in my case the dbs are running on the same instance):
try=# CREATE EXTENSION postgres_fdw;
CREATE EXTENSION
try=# CREATE SERVER ro_server FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '127.0.0.1', port '5433', dbname 'real_db');
CREATE SERVER
try=# CREATE USER MAPPING FOR mla SERVER ro_server;
CREATE USER MAPPING
try=# CREATE SCHEMA foreign_schema;
CREATE SCHEMA
# XXX load foreign schema locally
try=# IMPORT FOREIGN SCHEMA public FROM SERVER ro_server INTO foreign_schema;
IMPORT FOREIGN SCHEMA
Note that I imported the schema (but not the data) to the temp database because otherwise I was getting undefined type errors when I tried to do the foreign connection.
With that, I'm now able to run pg_sample against the test database. The output however has the foreign connection schema, so it would require some post-processing to get it to match the original db.
from pg_sample.
Ah, I tried this out so far and it works for me so far (like you mentioned I had to do some post processing to add data types)
Past this point I think it's a matter of modifying the code that's there to select and insert rows into the temporary table into selecting the sample rows but then using a separate connection to insert them into this schema that's been copied over.
Not sure what that would take though.
from pg_sample.
Hm, just tried this again and I noticed that when I try to run it, it errors out saying something about a view? I'd like to ignore the views
DBD::Pg::db do failed: ERROR: relation "public.test" does not exist CONTEXT: column "table_name" of foreign table "test_view" at pg_sample line 300. main::__ANON__("DBD::Pg::db do failed: ERROR: relation \"public.test\" does "..., DBI::db=HASH(0x556be6d11f40), undef) called at pg_sample line 58
from pg_sample.
@mehulp118 You applied the patch ^^^? Not sure why it would be returning views, since we explicitly ask for WHERE table_type NOT IN ('VIEW')
But yes, it seems like it was close to working, except for the naming issue. I'll try to take another look this week.
from pg_sample.
Has there been any progress on this?
from pg_sample.
Related Issues (20)
- Feature Request: Wildcards in relations in limit HOT 1
- Docker: Can't exec "pg_dump" HOT 3
- Docker workflow HOT 7
- Make row ordering deterministic HOT 9
- Is it possibe to use the --limit query based on the contents of one of the sample tables HOT 3
- Could not identify an equality operator for type json HOT 1
- Error using --random parameter HOT 5
- wildcards don't seem to be working in "--limit" HOT 1
- Export data with inserts statements HOT 4
- Version 1.13 missing from github releases HOT 2
- Use of uninitialized value $sample_fk_table in concatenation (.) or string at pg_sample-master/pg_sample line 685 HOT 1
- psql:mini.sql:254159: ERROR: permission denied: "RI_ConstraintTrigger_c_11890996" is a system trigger HOT 3
- Is there a way to export a sample by a table row? HOT 3
- sample_schema table naming greater than 63 characters HOT 10
- ERROR: operator does not exist: json = json at character 35 HOT 1
- psql invalid command \n or \N HOT 5
- docker image can't sample databases newer than postgres 14
- pg_sample doesn't correctly handle generated columns HOT 6
- Query with a list can`t be parsed
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from pg_sample.