jvalue / jayvee
Jayvee is a domain-specific language and runtime for automated processing of data pipelines
Home Page: https://jvalue.github.io/jayvee/
Executing the GasReserves Model leads to an issue with db queries: Could not write to postgres database. syntax error at or near ".".
As described in https://github.com/jvalue/hub/issues/290, this shows up as a successful run in the hub but does not create a database. Any error should make the interpreter return exit code 1 (or crash).
block GasReserveExtractor oftype CSVFileExtractor {
  url: "https://www.bundesnetzagentur.de/_tools/SVG/js2/_functions/csv_export.html?view=renderCSV&id=1089590";
}
layout GasReserveLayout {
  header row 1: text;
  column A: text;
  column B: decimal;
  column C: decimal;
  column D: decimal;
  column E: decimal;
  column F: decimal;
}
block GasReserveValidator oftype LayoutValidator {
  layout: GasReserveLayout;
}
block GasReserveLoader oftype PostgresLoader {
  host: requires HUB_DB_HOST;
  port: requires HUB_DB_PORT;
  username: requires HUB_DB_USERNAME;
  password: requires HUB_DB_PASSWORD;
  database: requires HUB_DB_DATABASE;
  table: requires HUB_DB_TABLE;
}
pipe {
  from: GasReserveExtractor;
  to: GasReserveValidator;
}
pipe {
  from: GasReserveValidator;
  to: GasReserveLoader;
}
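The exit-code requirement from this issue could be sketched roughly as follows. This is only an illustration: the `RunResult` type and the functions `runFailingLoader` and `exitCodeFor` are hypothetical names and do not come from the Jayvee code base.

```typescript
// Hypothetical result type; the real interpreter's types may differ.
type RunResult = { ok: true } | { ok: false; message: string };

// Stand-in for a loader step that fails, e.g. because a database query was rejected.
function runFailingLoader(): RunResult {
  return { ok: false, message: 'Could not write to postgres database.' };
}

// Map the outcome of all executed blocks to a process exit code:
// 0 only if every block succeeded, 1 as soon as any block failed.
function exitCodeFor(results: RunResult[]): number {
  for (let i = 0; i < results.length; i++) {
    if (!results[i].ok) {
      return 1;
    }
  }
  return 0;
}
```

In a Node.js CLI, the returned value could then be passed to `process.exit`, so that callers such as the hub can tell a failed run from a successful one.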
Yes, I'd add export const NONE: None = {}; in the file none-io-type.ts
Originally posted by @felix-oq in #126 (comment)
We should further think about the distinction / combination of types for block in-/output in data-types.ts and the primitive types in AbstractDataType.ts. But that probably is beyond the scope of this PR.
Originally posted by @felix-oq in #23 (review)
Have a look at mFund data structures and data structures in popular data science libraries like pandas, beam, etc. to steer #33
Add an implementation for the CSVFileExtractor block to the interpreter. It is supposed to download the CSV file from the given URL and make it available to a following LayoutValidator block.
This issue came up due to #60 where the delimiter attribute was added to the grammar definition of CSVFileExtractor. SQLiteLoader is also affected.
All blocks should allow arbitrary ordering of their attributes. In the grammar, when adding the & operator to indicate that the attributes shall be unordered, the following errors occur:
Error: Parser definition errors detected:
-------------------------------
../..:5 - Ambiguous Alternatives Detected: <1 ,3> in <OR1> inside <BlockType> Rule,
<> may appears as a prefix path in all these alternatives.
See: https://chevrotain.io/docs/guide/resolving_grammar_errors.html#AMBIGUOUS_ALTERNATIVES
For Further details.
-------------------------------
../..:5 - Ambiguous alternatives: <1 ,2> due to common lookahead prefix
in <OR1> inside <BlockType> Rule,
<> may appears as a prefix path in all these alternatives.
See: https://chevrotain.io/docs/guide/resolving_grammar_errors.html#COMMON_PREFIX
For Further details.
-------------------------------
../..:5 - Ambiguous alternatives: <1 ,3> due to common lookahead prefix
in <OR1> inside <BlockType> Rule,
<> may appears as a prefix path in all these alternatives.
See: https://chevrotain.io/docs/guide/resolving_grammar_errors.html#COMMON_PREFIX
For Further details.
-------------------------------
../..:5 - Ambiguous alternatives: <1 ,3> due to common lookahead prefix
in <OR1> inside <BlockType> Rule,
<> may appears as a prefix path in all these alternatives.
See: https://chevrotain.io/docs/guide/resolving_grammar_errors.html#COMMON_PREFIX
For Further details.
-------------------------------
../..:5 - Ambiguous alternatives: <1 ,3> due to common lookahead prefix
in <OR1> inside <BlockType> Rule,
<> may appears as a prefix path in all these alternatives.
See: https://chevrotain.io/docs/guide/resolving_grammar_errors.html#COMMON_PREFIX
For Further details.
-------------------------------
../..:5 - Ambiguous alternatives: <1 ,3> due to common lookahead prefix
in <OR1> inside <BlockType> Rule,
<> may appears as a prefix path in all these alternatives.
See: https://chevrotain.io/docs/guide/resolving_grammar_errors.html#COMMON_PREFIX
For Further details.
-------------------------------
../..:5 - Ambiguous alternatives: <1 ,3> due to common lookahead prefix
in <OR1> inside <BlockType> Rule,
<> may appears as a prefix path in all these alternatives.
See: https://chevrotain.io/docs/guide/resolving_grammar_errors.html#COMMON_PREFIX
For Further details.
-------------------------------
../..:5 - Ambiguous alternatives: <1 ,3> due to common lookahead prefix
in <OR1> inside <BlockType> Rule,
<> may appears as a prefix path in all these alternatives.
See: https://chevrotain.io/docs/guide/resolving_grammar_errors.html#COMMON_PREFIX
For Further details.
- "Radmonitore" (bicycle counters): https://mobilithek.info/offers/-2042164558385113078
  - 11 bicycle monitors count the number of passing bicycles in Rostock in 15-minute intervals
  - 123 MB
  - Caution: the overall time span does not seem to be the same for all monitors (partly 2013-2020, partly only 2022, ...)
  - A small CSV file (1.7 KB) describes the locations in more detail
  - Other, possibly interesting, files are included as well
- "Pünktlichkeit" (punctuality): https://mobilithek.info/offers/-3073323302501876779
  - Punctuality of train lines in percent for regional rail passenger transport, broken down by year and month, in the Schleswig-Holstein area
  - 178 KB
  - from Jan 2010 to Jan 2022
We should develop a framework that also introduces a generic interface for blocks. When interpreting a model, the framework should execute blocks in their correct order (according to the pipes in the model) and pass the output of each block to the input of the succeeding block.
At some point, the logger should know what it is logging for. So either the block type should be set here or in the logX methods in the actual executor parent class. That way we can build nicer log messages with the addition of the block type.
Originally posted by @rhazn in #95 (comment)
Add an implementation for the PostgresLoader block to the interpreter. It is supposed to get the table / sheet from a previous LayoutValidator block and store it in a postgres database. It is fine to hardcode the database connection details for now.
See #112 where the idea came up initially.
Depends on #30
After #58 is done, please release the jayvee version and prepare the hub so it can be deployed with the new gas model support.
The deployment workflow is the following:
package.json
devDependencies of package.json and in the installation command in runtime-simple.Dockerfile
Originally posted by @felix-oq in #57 (comment)
A boolean value type should exist on language level.
Set up a basic Langium project.
Everyone to list and prioritize
The basis for all UACs is the corresponding RFC0002 Mobility extension. The RFC was refined over multiple iterations, which are listed below. Each UAC represents a single requirement, extracted from the final, accepted iteration.
Iterations of RFC0002 Mobility extension:
Iteration | Scope | PR |
---|---|---|
1 | RFC mobility extension using collections: initial concept | #111 |
2 | RFC mobility extension using collections: refinement after feedback-loop | #115 |
3 | RFC mobility extension using file-pickers: change of concept | #116 |
4 | RFC mobility extension using file-pickers: refinement after feedback-loop | #117 |
5 | RFC mobility extension using file-pickers: RFC-status ACCEPTED | #119 |
- File is implemented. Via #125 and #256
- FileSystem is implemented. Via #126 and #256
- None is implemented. Via #126
- Table stores the table's name. Obsolete via #164 and #165
- None, the execution aborts. Via #136
- HttpExtractor is implemented in std-extension. Via #134
- ArchiveInterpreter is implemented in std-extension. Via #135
- FilePicker is implemented in std-extension. Via #136
- CSVFileExtractor is refactored to a CSVInterpreter. Via #168
- CSVFileExtractor is covered by the new HTTPExtractor and ArchiveInterpreter. Via #169
- HTTPExtractor and ArchiveInterpreter. Via #169 and TODO
- SQLiteSink accepts multiple inputs. --> for a PoC, multiple sinks are used rather than multiple inputs for one sink; this works out of the box
- SQLiteSink processes the new table's name. Obsolete via #164
- SQLiteSink does not recreate a DB on each call. --> already implemented: if a database exists, it gets opened, otherwise it is created.
- 0002-mobility.jv via #180
A detailed explanation of all UACs and further context is provided by the RFC0002.
The current implementation does not consider pipes when interpreting models and arranges blocks in a fixed, hardcoded order by default.
Enable a more generic model interpretation by constructing graphs from models. Blocks shall be considered as nodes and pipes as directed edges between nodes. Then the beginning of pipelines is implicitly specified by blocks that have no input (i.e. extractors) and their end by blocks with no output (i.e. loaders).
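One possible way to derive an execution order from such a graph is a topological sort. The following is a minimal sketch using Kahn's algorithm, not the interpreter's actual implementation. The `PipeEdge` interface and the `executionOrder` function are hypothetical names, and the sketch assumes that every pipe endpoint appears in `blockNames`.

```typescript
// Hypothetical shape of a pipe: a directed edge between two block names.
interface PipeEdge {
  from: string;
  to: string;
}

// Returns block names in an order where every block comes after all blocks
// that feed into it (Kahn's algorithm). Throws if the pipes form a cycle.
// Assumption: every pipe endpoint is listed in blockNames.
function executionOrder(blockNames: string[], pipes: PipeEdge[]): string[] {
  const inDegree: { [block: string]: number } = {};
  const successors: { [block: string]: string[] } = {};
  blockNames.forEach((b) => {
    inDegree[b] = 0;
    successors[b] = [];
  });
  pipes.forEach((p) => {
    successors[p.from].push(p.to);
    inDegree[p.to] += 1;
  });

  // Blocks without input (e.g. extractors) start the pipelines.
  const ready = blockNames.filter((b) => inDegree[b] === 0);
  const order: string[] = [];
  while (ready.length > 0) {
    const current = ready.shift() as string;
    order.push(current);
    successors[current].forEach((next) => {
      inDegree[next] -= 1;
      if (inDegree[next] === 0) {
        ready.push(next);
      }
    });
  }

  if (order.length !== blockNames.length) {
    throw new Error('Pipes contain a cycle');
  }
  return order;
}
```

For a model with an extractor, a validator and a loader connected by two pipes, the sketch yields the extractor first and the loader last, regardless of the order in which the blocks are declared.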
jv command to the CLI
- jv -f <path-to-jv-file> -e NAME="VALUE" interprets the jv file, filling runtime parameters called NAME with VALUE (e.g. postgres connection information)
- jv -h shows basic help
- jv <path-to-jv-file> is short for jv -f <path-to-jv-file> without config parameters
Add an implementation for the LayoutValidator block to the interpreter. It is supposed to get a CSV file from a previous CSVFileExtractor block, validate it using the given layout specification and make the resulting table / sheet available to a following PostgresLoader block.
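The validation step described above could look roughly like the following sketch. It is an illustration under stated assumptions, not the actual LayoutValidator: `ColumnType`, `SheetLayout`, `isValidCell` and `validateSheet` are hypothetical names, the sheet is assumed to be available as rows of string cells, and decimals are assumed to use a dot as the separator.

```typescript
type ColumnType = 'text' | 'decimal';

// Hypothetical in-memory form of a layout: the number of header rows
// and one value type per column, in column order.
interface SheetLayout {
  headerRows: number;
  columns: ColumnType[];
}

// Assumption: decimals use a dot as separator, e.g. "99.19".
function isValidCell(cell: string, type: ColumnType): boolean {
  if (type === 'text') {
    return true;
  }
  return /^-?\d+(\.\d+)?$/.test(cell);
}

// Returns human-readable errors; an empty array means the sheet matches the layout.
function validateSheet(rows: string[][], layout: SheetLayout): string[] {
  const errors: string[] = [];
  // Header rows are skipped, all remaining rows are checked cell by cell.
  for (let r = layout.headerRows; r < rows.length; r++) {
    const row = rows[r];
    if (row.length !== layout.columns.length) {
      errors.push('Row ' + (r + 1) + ': expected ' + layout.columns.length + ' cells, got ' + row.length);
      continue;
    }
    for (let c = 0; c < row.length; c++) {
      if (!isValidCell(row[c], layout.columns[c])) {
        errors.push('Row ' + (r + 1) + ', column ' + (c + 1) + ': "' + row[c] + '" is not a ' + layout.columns[c]);
      }
    }
  }
  return errors;
}
```

Collecting all errors instead of stopping at the first one is a design choice of this sketch; it lets a model author see every mismatch in a single run.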
To ease parallel development with student projects the language should allow
Examples for use:
Part of #85
See 0001-cell-ranges.md and #118 for details.
Think of a suitable name for the language (memorable names are preferred over descriptive ones) and replace every occurrence of the current name by the new name. Consider all different cases when replacing (i.e. Open Data Language, OpenDataLanguage, open-data-language, odl).
Could not use the latest version as that results in a dynamic import failure (newer versions only support ESM):
Instead change the require of index.js in /open-data-language/out/cli/executors/csv-file-extractor-executor.js to a dynamic import() which is available in all CommonJS modules.
Probably find another library that does the job and is not outdated.
Originally posted by @georg-schwarz in #15 (comment)
As a {language developer},
I want to {test the language on different examples},
to {expand it based on real data}.
A general comment though: 31 changed files for one "real" change seems excessive to review, I think we should:
Originally posted by @rhazn in #52 (review)
E.g. inspired by rust or npm
Also see emberjs, react and vuejs which are inspired by the RFC process of rust
The Bundesnetzagentur publishes easy to parse CSV files here: https://www.bundesnetzagentur.de/DE/Gasversorgung/aktuelle_gasversorgung/_svg/Indikator3_Speicher/Indikator_Speicher.html?nn=1077982
A roughly working model is at the end of this description as "Gas Reserve Model", the file content as "CSV File Content".
Issues:
- Could not write to postgres database. syntax error at or near "." in these queries: CREATE TABLE IF NOT EXISTS Gas (.;Kritisch;Angespannt;Stabil;Speicherstand IST;gesetzliche Ziele text); We should make sure those work by adding '' around column names in postgres (note that PostgreSQL quotes identifiers with double quotes, not single quotes).
CSV File Content:
.;Kritisch;Angespannt;Stabil;Speicherstand IST;gesetzliche Ziele
01.11.2022;95,0;95,0;100,0;99,19;95,00
02.11.2022;95,0;95,0;100,0;99,30;
03.11.2022;95,0;95,0;100,0;99,26;
04.11.2022;94,9;95,0;100,0;99,26;
05.11.2022;94,9;94,9;100,0;99,38;
06.11.2022;94,8;94,9;100,0;99,54;
07.11.2022;94,6;94,8;100,0;99,55;
08.11.2022;94,4;94,6;100,0;99,56;
09.11.2022;94,2;94,4;100,0;99,62;
10.11.2022;93,9;94,2;100,0;99,72;
11.11.2022;93,7;94,0;100,0;99,75;
12.11.2022;93,4;93,8;100,0;99,89;
13.11.2022;93,1;93,6;100,0;100,00;
14.11.2022;92,7;93,3;100,0;99,95;
15.11.2022;92,4;93,1;100,0;99,94;
16.11.2022;92,0;92,8;100,0;99,98;
17.11.2022;91,6;92,5;100,0;99,90;
18.11.2022;91,2;92,2;100,0;99,80;
19.11.2022;90,7;91,9;100,0;99,68;
20.11.2022;90,3;91,5;100,0;99,55;
21.11.2022;89,7;91,1;100,0;99,38;
22.11.2022;89,2;90,7;100,0;99,26;
23.11.2022;88,6;90,3;100,0;98,95;
24.11.2022;88,0;89,8;100,0;;
25.11.2022;87,4;89,3;100,0;;
26.11.2022;86,8;88,9;100,0;;
27.11.2022;86,2;88,4;100,0;;
28.11.2022;85,5;87,9;100,0;;
29.11.2022;84,9;87,3;100,0;;
30.11.2022;84,2;86,8;100,0;;
01.12.2022;83,5;86,3;100,0;;
02.12.2022;82,8;85,8;100,0;;
03.12.2022;82,1;85,2;100,0;;
04.12.2022;81,4;84,7;100,0;;
05.12.2022;80,7;84,1;100,0;;
06.12.2022;80,0;83,6;100,0;;
07.12.2022;79,3;83,1;100,0;;
08.12.2022;78,6;82,6;100,0;;
09.12.2022;77,9;82,0;100,0;;
10.12.2022;77,2;81,5;100,0;;
11.12.2022;76,5;81,0;100,0;;
12.12.2022;75,7;80,4;100,0;;
13.12.2022;75,0;79,8;100,0;;
14.12.2022;74,2;79,3;100,0;;
15.12.2022;73,5;78,7;100,0;;
16.12.2022;72,8;78,2;100,0;;
17.12.2022;72,1;77,7;100,0;;
18.12.2022;71,4;77,2;100,0;;
19.12.2022;70,7;76,7;100,0;;
20.12.2022;70,0;76,2;100,0;;
21.12.2022;69,3;75,7;100,0;;
22.12.2022;68,6;75,2;100,0;;
23.12.2022;67,9;74,7;100,0;;
24.12.2022;67,2;74,2;100,0;;
25.12.2022;66,4;73,6;100,0;;
26.12.2022;65,7;73,0;100,0;;
27.12.2022;64,9;72,4;100,0;;
28.12.2022;64,1;71,8;100,0;;
29.12.2022;63,3;71,3;100,0;;
30.12.2022;62,5;70,7;100,0;;
31.12.2022;61,8;70,2;100,0;;
01.01.2023;61,1;69,7;100,0;;
02.01.2023;60,5;69,3;100,0;;
03.01.2023;59,8;68,8;100,0;;
04.01.2023;59,2;68,4;100,0;;
05.01.2023;58,6;68,0;100,0;;
06.01.2023;57,9;67,5;100,0;;
07.01.2023;57,3;67,1;100,0;;
08.01.2023;56,6;66,6;100,0;;
09.01.2023;55,9;66,2;100,0;;
10.01.2023;55,3;65,7;100,0;;
11.01.2023;54,6;65,2;100,0;;
12.01.2023;53,9;64,8;100,0;;
13.01.2023;53,3;64,3;100,0;;
14.01.2023;52,6;63,8;100,0;;
15.01.2023;51,9;63,4;100,0;;
16.01.2023;51,2;62,9;100,0;;
17.01.2023;50,5;62,4;100,0;;
18.01.2023;49,8;61,9;100,0;;
19.01.2023;49,1;61,4;100,0;;
20.01.2023;48,4;60,9;100,0;;
21.01.2023;47,6;60,4;100,0;;
22.01.2023;46,9;59,8;100,0;;
23.01.2023;46,1;59,2;100,0;;
24.01.2023;45,3;58,7;100,0;;
25.01.2023;44,6;58,2;100,0;;
26.01.2023;43,9;57,7;100,0;;
27.01.2023;43,2;57,2;100,0;;
28.01.2023;42,5;56,7;100,0;;
29.01.2023;41,9;56,3;100,0;;
30.01.2023;41,3;55,9;100,0;;
31.01.2023;40,6;55,4;100,0;;
01.02.2023;40,0;55,0;100,0;;40,00
02.02.2023;39,4;54,6;100,0;;
03.02.2023;38,8;54,2;100,0;;
04.02.2023;38,1;53,7;100,0;;
05.02.2023;37,4;53,2;100,0;;
06.02.2023;36,7;52,7;100,0;;
07.02.2023;36,0;52,2;100,0;;
08.02.2023;35,2;51,7;100,0;;
09.02.2023;34,5;51,1;100,0;;
10.02.2023;33,7;50,5;100,0;;
11.02.2023;32,9;50,0;100,0;;
12.02.2023;32,1;49,4;100,0;;
13.02.2023;31,4;48,8;100,0;;
14.02.2023;30,7;48,3;100,0;;
15.02.2023;30,0;47,8;100,0;;
16.02.2023;29,3;47,4;100,0;;
17.02.2023;28,7;46,9;100,0;;
18.02.2023;28,1;46,5;100,0;;
19.02.2023;27,6;46,2;100,0;;
20.02.2023;27,0;45,9;100,0;;
21.02.2023;26,5;45,5;100,0;;
22.02.2023;26,0;45,2;100,0;;
23.02.2023;25,4;44,8;100,0;;
24.02.2023;24,9;44,5;100,0;;
25.02.2023;24,3;44,1;100,0;;
26.02.2023;23,8;43,7;100,0;;
27.02.2023;23,2;43,3;100,0;;
28.02.2023;22,6;42,9;100,0;;
01.03.2023;22,0;42,5;100,0;;
02.03.2023;21,4;42,1;100,0;;
03.03.2023;20,8;41,6;100,0;;
04.03.2023;20,2;41,3;100,0;;
05.03.2023;19,7;40,9;100,0;;
06.03.2023;19,2;40,6;100,0;;
07.03.2023;18,7;40,3;100,0;;
08.03.2023;18,3;40,0;100,0;;
09.03.2023;17,8;39,7;100,0;;
10.03.2023;17,5;39,5;100,0;;
11.03.2023;17,1;39,3;100,0;;
12.03.2023;16,7;39,1;100,0;;
13.03.2023;16,4;38,9;100,0;;
14.03.2023;16,1;38,7;100,0;;
15.03.2023;15,7;38,5;100,0;;
16.03.2023;15,4;38,3;100,0;;
17.03.2023;15,0;38,0;100,0;;
18.03.2023;14,6;37,8;100,0;;
19.03.2023;14,2;37,5;100,0;;
20.03.2023;13,8;37,2;100,0;;
21.03.2023;13,4;37,0;100,0;;
22.03.2023;13,0;36,7;100,0;;
23.03.2023;12,7;36,5;100,0;;
24.03.2023;12,4;36,3;100,0;;
25.03.2023;12,1;36,1;100,0;;
26.03.2023;11,9;35,9;100,0;;
27.03.2023;11,7;35,8;100,0;;
28.03.2023;11,5;35,7;100,0;;
29.03.2023;11,4;35,7;100,0;;
30.03.2023;11,3;35,7;100,0;;
31.03.2023;11,3;35,6;100,0;;
01.04.2023;11,2;35,6;100,0;;
Gas Reserve Model:
block GasReserveExtractor oftype CSVFileExtractor {
  url: "https://www.bundesnetzagentur.de/_tools/SVG/js2/_functions/csv_export.html?view=renderCSV&id=1089590";
}
layout GasReserveLayout {
  header row 1: text;
  column A: text;
  column B: decimal;
  column C: decimal;
  column D: decimal;
  column E: decimal;
  column F: decimal;
}
block GasReserveValidator oftype LayoutValidator {
  layout: GasReserveLayout;
}
block GasReserveLoader oftype PostgresLoader {
  host: requires HUB_DB_HOST;
  port: requires HUB_DB_PORT;
  username: requires HUB_DB_USERNAME;
  password: requires HUB_DB_PASSWORD;
  database: requires HUB_DB_DATABASE;
  table: requires HUB_DB_TABLE;
}
pipe {
  from: GasReserveExtractor;
  to: GasReserveValidator;
}
pipe {
  from: GasReserveValidator;
  to: GasReserveLoader;
}
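The quoting of column names mentioned in the issue list of this description could be sketched as follows. The helper names `quoteIdentifier` and `buildCreateTable` are hypothetical and not taken from the loader's code; the sketch relies on the PostgreSQL rule that identifiers are quoted with double quotes and that an embedded double quote is escaped by doubling it.

```typescript
// Quote an identifier for PostgreSQL: wrap it in double quotes and
// double any embedded double quote.
function quoteIdentifier(identifier: string): string {
  return '"' + identifier.replace(/"/g, '""') + '"';
}

// Build a CREATE TABLE statement in which every column has type text.
// Assumption: storing all columns as text is enough for illustration.
function buildCreateTable(table: string, columns: string[]): string {
  const columnDefs = columns.map((c) => quoteIdentifier(c) + ' text');
  return 'CREATE TABLE IF NOT EXISTS ' + quoteIdentifier(table) + ' (' + columnDefs.join(', ') + ');';
}
```

With this, header cells such as "." or "Speicherstand IST" become valid column names. Note that quoted identifiers are case-sensitive in PostgreSQL, so a quoted table name "Gas" is a different table than an unquoted Gas, which is folded to lower case.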
Implement the grammar for our current language draft.
No cli program and advanced checks yet.
language level concept to group pipes
Only suitable for the first 26 columns. Probably also an issue elsewhere. Should be tackled in a separate issue I think.
Originally posted by @georg-schwarz in #40 (comment)
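One way to lift the 26-column limit is to treat column labels as bijective base-26 numbers, as spreadsheets do (A to Z, then AA, AB, and so on). The following is a minimal sketch under that assumption; `columnLabelToIndex` and `indexToColumnLabel` are hypothetical names and not part of the code under review.

```typescript
// Convert a spreadsheet-style column label to a zero-based index:
// A -> 0, Z -> 25, AA -> 26, AB -> 27, ...
function columnLabelToIndex(label: string): number {
  if (!/^[A-Z]+$/.test(label)) {
    throw new Error('Invalid column label: ' + label);
  }
  let value = 0;
  for (let i = 0; i < label.length; i++) {
    // 'A' has char code 65, so each letter contributes a digit in 1..26.
    value = value * 26 + (label.charCodeAt(i) - 64);
  }
  return value - 1;
}

// Inverse conversion: 0 -> A, 25 -> Z, 26 -> AA, ...
function indexToColumnLabel(index: number): string {
  if (index < 0 || Math.floor(index) !== index) {
    throw new Error('Invalid column index: ' + index);
  }
  let n = index + 1;
  let label = '';
  while (n > 0) {
    const rem = (n - 1) % 26;
    label = String.fromCharCode(65 + rem) + label;
    n = (n - 1 - rem) / 26;
  }
  return label;
}
```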
Currently, the concept to describe CSV files is column-based. We assume that this does not live up to the complexity of CSV files in practice. Thus, we need a concept based on cell-ranges (as a more generic way) to describe CSV files.
The current syntax might be syntactic sugar, e.g. column A: text; might be a short form for range (A:A): text.
Tackling this issue might require redesigning the whole sheet concept, as cell ranges might overlap etc.
This is fine and imho we can merge it like this. In a separate MR I'd encourage you to move this to a composite-pattern implementation, with directories and files (see this eloquent example by a well-known German professor: https://www.youtube.com/watch?v=efDxnGi8zU8&t=60s).
Originally posted by @rhazn in #126 (comment)
Actually, the FileSystem already follows a composite approach, but it could be more explicit.
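For illustration, an explicit composite with directories and files might look like the sketch below. It does not reflect the existing FileSystem implementation; `FsEntry`, `FsFile` and `FsDirectory` are hypothetical names chosen for this example.

```typescript
// Common interface of the composite: files and directories are both entries.
interface FsEntry {
  readonly entryName: string;
  // Number of characters stored in this entry, including all children.
  size(): number;
}

// Leaf of the composite: a file with some content.
class FsFile implements FsEntry {
  constructor(public readonly entryName: string, private readonly content: string) {}

  size(): number {
    return this.content.length;
  }
}

// Composite node: a directory that contains further entries.
class FsDirectory implements FsEntry {
  private readonly children: FsEntry[] = [];

  constructor(public readonly entryName: string) {}

  // Returns this directory so that calls can be chained.
  add(child: FsEntry): FsDirectory {
    this.children.push(child);
    return this;
  }

  // Delegates to the children, no matter whether they are files or directories.
  size(): number {
    return this.children.reduce((sum, child) => sum + child.size(), 0);
  }
}
```

The point of the pattern is that callers can treat a single file and a whole directory tree uniformly through the `FsEntry` interface.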