htmfilho / csvsource
Converts a CSV file to SQL Insert Statements.
Home Page: https://www.hildeberto.com/csvsource/
License: Apache License 2.0
Is your feature request related to a problem? Please describe.
In some cases, not all rows of a CSV file are useful to import into a database. Filters would make it possible to delimit the subset of data to import.
Describe the solution you'd like
Filters could be logical expressions with AND, OR, and NOT, comparing columns and values. Filters are strings passed as arguments and interpreted as logical expressions. Each expression is evaluated while reading a row to decide whether to include or exclude it.
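As a minimal sketch of how such a filter could be evaluated against a row, the code below models an expression as a small tree. The `Filter` type, its variants, and the `matches` method are hypothetical names chosen for illustration; the issue does not define a filter syntax, and parsing the argument string into this tree is left out.

```rust
use std::collections::HashMap;

/// A hypothetical filter expression tree (not part of the project).
#[derive(Debug, Clone)]
pub enum Filter {
    /// True when the named column equals the given value.
    Eq(String, String),
    /// True when both sub-filters match.
    And(Box<Filter>, Box<Filter>),
    /// True when at least one sub-filter matches.
    Or(Box<Filter>, Box<Filter>),
    /// True when the sub-filter does not match.
    Not(Box<Filter>),
}

impl Filter {
    /// Evaluates the filter against one row, given as column name -> value.
    /// A column that is missing from the row never satisfies `Eq`.
    pub fn matches(&self, row: &HashMap<&str, &str>) -> bool {
        match self {
            Filter::Eq(column, value) => row
                .get(column.as_str())
                .map_or(false, |found| *found == value.as_str()),
            Filter::And(left, right) => left.matches(row) && right.matches(row),
            Filter::Or(left, right) => left.matches(row) || right.matches(row),
            Filter::Not(inner) => !inner.matches(row),
        }
    }
}
```

A row would be written as an insert statement only when `matches` returns true for it.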
Describe alternatives you've considered
When using a staging table, the application can do the filtering there before updating its core tables. But when inserting into a core table directly, we may have issues with unwanted data.
Is your feature request related to a problem? Please describe.
Yes. In many cases, the content of the prefix and suffix files is related to the insert statements. For example, the prefix file may have a script to create a table, but the name of the table should be the same one used in the insert statements. Unfortunately, the content of the prefix and suffix files is fixed.
Describe the solution you'd like
It would be interesting if the prefix and suffix files were actually template files, so we could use variables that are replaced at run time by the values of arguments such as --table and --column.
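A minimal sketch of the substitution step, assuming a `{{name}}` placeholder syntax. Both the placeholder syntax and the `render_template` function are assumptions made for illustration; the issue does not specify either.

```rust
/// Replaces every `{{name}}` placeholder in `template` with its value.
/// The `{{...}}` syntax is an assumption of this sketch.
/// Placeholders without a matching variable are left untouched.
pub fn render_template(template: &str, variables: &[(&str, &str)]) -> String {
    let mut output = template.to_string();
    for (name, value) in variables {
        // `{{{{` and `}}}}` are escaped braces, producing `{{name}}`.
        let placeholder = format!("{{{{{}}}}}", name);
        output = output.replace(&placeholder, value);
    }
    output
}
```

With this approach, a prefix file containing `create table {{table}} ...` would pick up whatever table name was passed on the command line.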
Describe alternatives you've considered
An alternative would be to edit the files, put current values, and execute the command, but it would reduce automation opportunities.
Is your feature request related to a problem? Please describe.
Today, the entire code is in a single file (main.rs). As the code grows, it becomes harder and harder to navigate and understand what it does. We want to separate the logic to generate the SQL from the logic to handle the command line.
Describe the solution you'd like
The logic to generate the SQL would be in lib.rs, making it a library usable by other applications that do not necessarily use the command line. After this change, we can publish Roma to Cargo, allowing it to be a dependency in other projects.
Is your feature request related to a problem? Please describe.
Some users may need to use the exported SQL file in a database migration tool, like Liquibase. Those tools already deal with transactions, so it wouldn't be necessary to include begin transaction and commit in the file.
Describe the solution you'd like
We want to make transactions optional. The default behaviour would be to not have transactions when the argument --chunk is not present. We would add a new argument --transactional to explicitly indicate we want a single transaction wrapping all statements. This new argument would have no effect when the argument --chunk is present, since the whole point of --chunk is to create transaction chunks.
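The proposed rules can be sketched as one function. `wrap_statements` and its parameters are hypothetical names standing in for the `--chunk` and `--transactional` arguments, and the exact text of the emitted `begin transaction;` and `commit;` lines is an assumption.

```rust
/// Wraps insert statements in transactions according to the proposed rules.
/// `chunk` models `--chunk`; `transactional` models `--transactional`.
pub fn wrap_statements(
    statements: &[String],
    chunk: Option<usize>,
    transactional: bool,
) -> Vec<String> {
    let mut output = Vec::new();
    match chunk {
        // With a positive chunk size, each group gets its own transaction,
        // and `transactional` is ignored.
        Some(size) if size > 0 => {
            for group in statements.chunks(size) {
                output.push("begin transaction;".to_string());
                output.extend(group.iter().cloned());
                output.push("commit;".to_string());
            }
        }
        // Without a usable chunk size, one transaction wraps everything,
        // but only when explicitly requested.
        _ if transactional && !statements.is_empty() => {
            output.push("begin transaction;".to_string());
            output.extend(statements.iter().cloned());
            output.push("commit;".to_string());
        }
        // Default: plain statements, suitable for tools such as Liquibase.
        _ => output.extend(statements.iter().cloned()),
    }
    output
}
```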
Describe alternatives you've considered
The only alternative to this would be to manually remove the begin transaction and commit from the beginning and end of the file.
Is your feature request related to a problem? Please describe.
Yes, when the insert statements target a table that doesn't support auto-incremented primary keys.
Describe the solution you'd like
Support the option to include an auto-incremented or auto-generated identifier in the insert statements. The user would specify the column of the identifier and either the initial value from which the auto-increment starts or that the identifier is a UUID.
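A minimal sketch of the auto-increment case only. `IdSequence` and `insert_with_id` are hypothetical names, values are assumed to arrive already formatted as SQL literals, and UUID generation is left out because it would need an external crate.

```rust
/// A hypothetical counter that hands out consecutive identifiers.
pub struct IdSequence {
    next: u64,
}

impl IdSequence {
    /// Starts the sequence at the initial value chosen by the user.
    pub fn starting_at(initial: u64) -> Self {
        IdSequence { next: initial }
    }

    /// Returns the current identifier and advances the sequence by one.
    pub fn next_id(&mut self) -> u64 {
        let id = self.next;
        self.next += 1;
        id
    }
}

/// Builds an insert statement with the generated identifier as first column.
/// `values` must already be valid SQL literals (quoted where needed).
pub fn insert_with_id(
    table: &str,
    id_column: &str,
    id: u64,
    columns: &[&str],
    values: &[&str],
) -> String {
    let mut all_columns = vec![id_column.to_string()];
    all_columns.extend(columns.iter().map(|c| c.to_string()));
    let mut all_values = vec![id.to_string()];
    all_values.extend(values.iter().map(|v| v.to_string()));
    format!(
        "insert into {} ({}) values ({});",
        table,
        all_columns.join(", "),
        all_values.join(", ")
    )
}
```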
Describe alternatives you've considered
The target table would need to support an auto-incremented or auto-generated identifier. That is not always possible, either because the database doesn't support it or because the table structure can't be changed to support it.
Is your feature request related to a problem? Please describe.
This is a basic feature that is not related to a problem.
Describe the solution you'd like
Read a large CSV file.
Describe alternatives you've considered
If it is too hard to process the file manually, consider a library.
Additional context
Consider loading large files found on the internet to increase the robustness of the solution.
Is your feature request related to a problem? Please describe.
The feature is new and covers cases where the date/time format in the CSV file is not compatible with the database date/time format.
Describe the solution you'd like
Describe alternatives you've considered
There is no alternative to solve this problem in the current version.
Additional context
The date/time format syntax will be inherited from the time library in use.
Is your feature request related to a problem? Please describe.
According to feedback from the community, the function generate_sql is complex and can be simplified by using an existing library to generate SQL.
Describe the solution you'd like
The crate sql_builder seems to be a good option: https://docs.rs/sql-builder/latest/sql_builder/
Describe alternatives you've considered
The current implementation fulfills the needs of the project. Using sql_builder could simplify the code, but it can also increase its footprint. A lot of features in that library are not applicable in this project.
Is your feature request related to a problem? Please describe.
The problem arises when the table I want to import the data into doesn't exist yet. I need to spend time creating it before I can run the insert statements.
Describe the solution you'd like
We can deduce the database structure needed to store the CSV data from the CSV itself. The solution would be to figure out the table name, columns, and column types, then generate the DDL script to create the table. The code would also check whether every row has a value in a given column to decide whether that column is nullable or not null.
The resulting DDL script would be added to the beginning of the script.
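A minimal sketch of the inference step, assuming only three column types (integer, real, text) and taking the table name as a parameter rather than deriving it. The `create_table` function and the type names it emits are assumptions for illustration, not the project's behaviour.

```rust
/// Generates a `create table` statement from CSV headers and rows.
/// A column is `integer` if every non-empty value parses as i64, `real` if
/// every non-empty value parses as f64, and `text` otherwise. A column is
/// `not null` only when every row has a non-empty value for it.
pub fn create_table(table: &str, headers: &[&str], rows: &[Vec<&str>]) -> String {
    let mut definitions = Vec::new();
    for (index, header) in headers.iter().enumerate() {
        // Cells missing from a short row are treated like empty cells.
        let cells: Vec<&str> = rows
            .iter()
            .map(|row| row.get(index).copied().unwrap_or(""))
            .collect();
        let present: Vec<&str> = cells
            .iter()
            .copied()
            .filter(|cell| !cell.trim().is_empty())
            .collect();

        let sql_type = if present.is_empty() {
            // No values to inspect, so fall back to the most general type.
            "text"
        } else if present.iter().all(|c| c.trim().parse::<i64>().is_ok()) {
            "integer"
        } else if present.iter().all(|c| c.trim().parse::<f64>().is_ok()) {
            "real"
        } else {
            "text"
        };

        let all_filled = !cells.is_empty() && present.len() == cells.len();
        let nullability = if all_filled { " not null" } else { "" };
        definitions.push(format!("{} {}{}", header, sql_type, nullability));
    }
    format!("create table {} ({});", table, definitions.join(", "))
}
```

Because the decision depends on every row, this approach needs a full pass over the file before the first insert statement can be written.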
Describe alternatives you've considered
The alternative is to create the DDL script ourselves and add it as the prefix of the file. But this job can be automated.
Is your feature request related to a problem? Please describe.
This is just to check if Roma supports a file format similar to CSV.