Comments (5)
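@RickyHuo Please evaluate the following three options:
Option 1:
Because typesafe config does not preserve the order of entries when it loads a configuration, the only choice is to change the format in application.conf from a key-value map to a list:
# The changed configuration would look like this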
filter {
{
type = split
...
}
{
type = grok
...
}
}
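If the format is changed this way, conditional configuration such as if else still has to be supported. One way to support it:
# The changed configuration would look like this; the following expresses an if else: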
filter {
{
condition = "status >= 400"
type = split
...
}
{
condition = "status < 400"
type = grok
...
}
}
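To make the per-entry `condition` idea concrete, here is a minimal Python sketch of how an ordered filter list could be evaluated against one event. It is an illustration only, not project code: the helper names `matches` and `select_filters`, the event dictionary, and the restriction to conditions of the form `<field> <operator> <integer>` are all assumptions made for this example.

```python
import operator
import re

# Comparison operators supported by this sketch (an assumption, not a spec).
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}

# Matches conditions of the form "<field> <op> <integer>".
CONDITION_RE = re.compile(r"^\s*(\w+)\s*(>=|<=|==|!=|>|<)\s*(-?\d+)\s*$")


def matches(condition, event):
    """Return True if the event satisfies the condition; no condition always matches."""
    if condition is None:
        return True
    m = CONDITION_RE.match(condition)
    if m is None:
        raise ValueError(f"unsupported condition: {condition!r}")
    field, op, literal = m.groups()
    return OPS[op](int(event[field]), int(literal))


def select_filters(filters, event):
    """Return the types of the filters that apply to the event, in declared order."""
    return [f["type"] for f in filters if matches(f.get("condition"), event)]


# The filter list from the configuration above, kept as an ordered list.
FILTERS = [
    {"condition": "status >= 400", "type": "split"},
    {"condition": "status < 400", "type": "grok"},
]

print(select_filters(FILTERS, {"status": 404}))
print(select_filters(FILTERS, {"status": 200}))
```

With this layout an if else is expressed as two entries whose conditions are complementary, which is why the configuration reads less naturally than a real `if` block.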
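Overall, this configuration style is not intuitive or simple to read, but it is relatively easy to implement.
Option 2: find an alternative configuration-parsing library. So far only the one below has turned up, and whether it meets the requirements is still being confirmed: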
http://www.cfg4j.org/
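Option 3: use antlr4 to define a custom set of parsing rules for the configuration file, so that the configuration looks intuitive and simple and also supports conditionals (if else).
This has a learning cost and is harder to implement, but it meets the requirements best (key-value form, plugin order preserved, conditionals, field references).
References: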
https://ivanyu.me/blog/2014/09/13/creating-a-simple-parser-with-antlr/
http://progur.com/2016/09/how-to-create-language-using-antlr4.html
https://blog.knoldus.com/2016/05/04/creating-a-dsl-domain-specific-language-using-antlr-part-ii-writing-the-grammar-file/
from seatunnel.
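Another option is to modify the typesafe config source code and change the data structure that stores the configuration to a LinkedHashMap, which would preserve the order.
Note: this still does not meet the requirement of supporting conditionals.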
from seatunnel.
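Requirements for the configuration file parser:
- key-value form
- plugin order preserved
- conditionals
- field references
- substitution of environment variables / custom variables in the configuration (including how custom variables are referenced)
- global configuration
- error reporting and error location for the configuration file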
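As an illustration of the variable-substitution requirement, here is a minimal Python sketch. It assumes a `${name}` reference syntax, and the rule that custom variables take precedence over environment variables while unknown names are left untouched is a choice made for this example only; the thread has not decided how custom variables are referenced. The function name `substitute` is hypothetical.

```python
import os
import re

# Assumed reference syntax: ${name}
VAR_RE = re.compile(r"\$\{(\w+)\}")


def substitute(text, variables=None, environ=None):
    """Replace ${name} with a custom variable if defined, else an environment variable.

    Names that are not defined anywhere are left as-is, so that they can
    still be interpreted later (for example as field references).
    """
    variables = variables or {}
    environ = os.environ if environ is None else environ

    def repl(match):
        name = match.group(1)
        if name in variables:
            return str(variables[name])
        if name in environ:
            return environ[name]
        return match.group(0)

    return VAR_RE.sub(repl, text)


line = 'index = "waterdrop-${env}-${time}"'
print(substitute(line, variables={"env": "prod"}, environ={}))
```

Leaving unknown names in place is one possible way to keep variable substitution and field references from colliding; a separate syntax for each would be another.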
from seatunnel.
For the antlr4 grammar file, see:
For a sample configuration file, see:
input {
kafka {
brokers = ["10.11.110.35:9092", "10.11.110.36:9092", "10.11.110.37:9092"]
topic = "accesslog"
}
}
filter {
split {
# default delimiter is whitespace
fields = ["time", "url", "http_status", "response_time", "refer", "body_size"]
}
if ${http_status} >= 500 {
field {
action = "add"
field_name = "internal_error"
value = 1
}
}
# user defined plugin
org.apache.mycompany.filters.pagerank {
}
}
output {
elasticsearch {
hosts = ["10.11.110.45:9000", "10.11.110.46:9000"]
index = "waterdrop-accesslog-${time}"
}
if ${http_status} >= 400 AND ${http_status} < 500 {
kafka {
brokers = ["10.11.110.35:9092", "10.11.110.36:9092", "10.11.110.37:9092"]
topic = "user_error"
}
}
}
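After parsing with antlr4, the following AST is generated:
Walking this AST with a listener/visitor built on the code that antlr4 generates is enough to implement configuration parsing. The parsed configuration is then converted to typesafe config for the individual plugins to use.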
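The traversal step can be sketched in Python as follows. This is not antlr4's generated API: the node classes `Section`, `If`, and `Plugin` and the `PluginCollector` visitor are hand-written stand-ins, introduced here only to show how a depth-first walk keeps plugins in declaration order and carries the enclosing `if` condition along.

```python
from dataclasses import dataclass, field


@dataclass
class Plugin:
    name: str
    options: dict = field(default_factory=dict)


@dataclass
class If:
    condition: str
    body: list


@dataclass
class Section:
    name: str
    body: list


class PluginCollector:
    """Walks the tree depth-first and records plugins in declaration order."""

    def __init__(self):
        # Each entry: (section name, plugin name, condition or None, options)
        self.plugins = []

    def visit(self, node, section=None, condition=None):
        if isinstance(node, Section):
            for child in node.body:
                self.visit(child, node.name, condition)
        elif isinstance(node, If):
            # Nested conditions are combined with AND.
            if condition is None:
                combined = node.condition
            else:
                combined = f"({condition}) AND ({node.condition})"
            for child in node.body:
                self.visit(child, section, combined)
        elif isinstance(node, Plugin):
            self.plugins.append((section, node.name, condition, node.options))
        else:
            raise TypeError(f"unexpected node: {node!r}")


# Hand-built tree mirroring the filter section of the sample configuration.
tree = Section("filter", [
    Plugin("split", {"fields": ["time", "url", "http_status"]}),
    If("${http_status} >= 500", [
        Plugin("field", {"action": "add", "field_name": "internal_error", "value": 1}),
    ]),
    Plugin("org.apache.mycompany.filters.pagerank"),
])

collector = PluginCollector()
collector.visit(tree)
for entry in collector.plugins:
    print(entry[1], entry[2])
```

The resulting ordered list is the kind of intermediate form that could then be converted into one typesafe config object per plugin.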
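Status of each requirement:
- [Supported] key-value form
- [Supported] plugin order preserved
- [Supported] conditionals
- [Supported] field references
- [Supported, not yet implemented] substitution of environment variables / custom variables in the configuration (including how custom variables are referenced)
- [Supported, not yet implemented] global configuration
- [Needs investigation] error reporting and error location for the configuration file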
from seatunnel.
@RickyHuo The antlr4 option.
from seatunnel.