Comments (12)
Thanks for the quick response - I kicked off another dump + restore to fix some other issues (related to my own transformers), but I will post more later in the day.
from greenmask.
So fast, thanks a bunch! I can confirm that with the latest version, memory usage stays at a constant ~200 MB, just like it did with pg_restore.
from greenmask.
It might be a memory leak. Greenmask does not load the whole table dump into memory; it writes the data in batches. So I suspect there is only one likely cause: uncontrolled buffer growth.
Could you please provide:
- Greenmask version
- PostgreSQL version
- Table definition example (I will try to emulate your case)
- Do large columns contain a kind of json/html/xml?
from greenmask.
Were there any important log lines before the greenmask process was killed?
from greenmask.
To rule out a pg_restore issue: if you are using directory storage, try restoring the data with pg_restore directly:
pg_restore -U postgres -h localhost --jobs 10 /storage_path/1708598401362987000
Substitute your own connection parameters and your actual backup ID (1708598401362987000 in the example above).
If the data restores correctly, then greenmask has a memory allocation problem somewhere.
from greenmask.
Were there any important log lines before the greenmask process was killed?
Nope, just "restoring table", and then after a minute or so killed
from greenmask.
It might be a memory leak. Greenmask does not load the whole table dump into memory; it writes the data in batches. So I suspect there is only one likely cause: uncontrolled buffer growth.
Could you please provide:
Greenmask version
0.1.5
PostgreSQL version
15.5
Table definition example (I will try to emulate your case)
postgres@127:app> \d "mealPlanTemplatesI18nOwned"
 Column          | Type                     | Modifiers
-----------------+--------------------------+---------------------------------------
 id              | uuid                     | not null
 locale          | locale                   | not null
 content         | text                     | not null
 frontPageFileId | uuid                     |
 createdAt       | timestamp with time zone | not null default statement_timestamp()
 updatedAt       | timestamp with time zone |
 name            | character varying(255)   |
Indexes:
"mealPlanTemplatesI18nOwned_pkey" PRIMARY KEY, btree (id, locale)
Foreign-key constraints:
"mealplantemplatesi18nowned_id_foreign" FOREIGN KEY (id) REFERENCES "mealPlanTemplatesOwned"(id) ON UPDATE CASCADE ON DELETE CASCADE
Inherits: "mealPlanTemplatesI18n"
Do large columns contain a kind of json/html/xml?
Yes, they contain HTML
from greenmask.
I ran another restore, and the issue happened again, although on a different table this time. This table doesn't have any big text columns like the previous one, but it is 8 GB in size.
I was able to pg_restore the data without issue, so there seems to be a bug somewhere in greenmask.
This does mean that for now I can let greenmask handle the dump + anonymization, and then restore with pg_restore.
from greenmask.
Were there any important log lines before the greenmask process was killed?
Nope, just "restoring table", and then after a minute or so "Killed":
2024-02-23T06:29:40Z DBG ../home/runner/work/greenmask/greenmask/internal/db/postgres/restore.go:377 > restoring objectName="table \"public\".\"meal_plan_meal\"" pid=37084 workerId=1
2024-02-23T06:29:40Z DBG ../home/runner/work/greenmask/greenmask/internal/db/postgres/restorers/table.go:64 > performing pgcopy statement copyStmt="COPY \"public\".\"meal_plan_meal\" (\"id\", \"meal_id\", \"meal_plan_id\", \"meal_group_id\", \"is_locked\", \"sort_weight\", \"created_at\", \"ignore_restrictions\", \"profileId\", \"scale\", \"ignore_macro_splits\", \"ignore_calories\", \"updatedAt\") FROM stdin" pid=37084
Killed
from greenmask.
Thank you!
I've started to investigate this problem. I will try to fix this bug as soon as possible.
from greenmask.
I have found the problem. I will let you know once the fix is released.
from greenmask.
@janmeier Thank you once again for reporting this bug.
It is fixed in v0.1.6 now.
from greenmask.