Comments (5)

garyelephant commented on May 18, 2024

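@RickyHuo Please evaluate the following three options:


Option 1:
Because typesafe config does not preserve the order of entries when it loads a configuration, the only choice is to change the format in application.conf from a key-value map to a list: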

# The modified configuration would look like this
filter {
    {
        type = split
        ...
    }
    {
        type = grok
        ...
    }
}

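With this change, the question is how to support conditional configuration such as if/else. One way to support it:

# The modified configuration would look like this; the following expresses an if/else: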
filter {
    {
        condition = "status >= 400"
        type = split
        ...
    }
    {
        condition = "status < 400"
        type = grok
        ...
    }
}

Overall, this style of configuration is not very intuitive or simple to read, but it is relatively easy to implement.
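To make Option 1 concrete, here is a minimal sketch of how a list of filter entries with `condition` strings could be applied in declaration order. It is not waterdrop code: the `FILTERS` list, the `matches` and `select_filters` helpers, and the restriction of conditions to the form "field operator integer" are all assumptions made for illustration.

```python
import operator
import re

# Hypothetical in-memory form of the list-style config shown above.
FILTERS = [
    {"condition": "status >= 400", "type": "split"},
    {"condition": "status < 400", "type": "grok"},
]

_OPS = {
    ">=": operator.ge, "<=": operator.le, "==": operator.eq,
    "!=": operator.ne, ">": operator.gt, "<": operator.lt,
}

# Assumption: a condition is exactly "<field> <op> <integer>".
_CONDITION = re.compile(r"^\s*(\w+)\s*(>=|<=|==|!=|>|<)\s*(-?\d+)\s*$")


def matches(condition, event):
    """Evaluate a 'field op integer' condition against an event dict."""
    m = _CONDITION.match(condition)
    if m is None:
        raise ValueError(f"unsupported condition: {condition!r}")
    field, op, literal = m.groups()
    return _OPS[op](event[field], int(literal))


def select_filters(filters, event):
    """Return the filter types that apply to the event, in list order."""
    return [
        f["type"]
        for f in filters
        if "condition" not in f or matches(f["condition"], event)
    ]
```

Because the entries live in a list, plugin order is preserved for free. The if/else, however, has to be spelled as two complementary conditions, which is the readability cost noted above.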


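Option 2: Look for an alternative configuration parsing library. So far only the following one has been found, and whether it meets the requirements is still being checked: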
http://www.cfg4j.org/


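Option 3: Use antlr4 to define a custom grammar for the configuration file, so that the configuration both looks intuitive and simple and supports conditionals (if/else).

This has a learning cost and is harder to implement, but it meets the requirements best (key-value form, plugin order preserved, conditionals, field references).
References: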
https://ivanyu.me/blog/2014/09/13/creating-a-simple-parser-with-antlr/
http://progur.com/2016/09/how-to-create-language-using-antlr4.html
https://blog.knoldus.com/2016/05/04/creating-a-dsl-domain-specific-language-using-antlr-part-ii-writing-the-grammar-file/

from seatunnel.

garyelephant commented on May 18, 2024

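One more option is to modify the typesafe config source code and change the data structure it uses to store the configuration to a LinkedHashMap, which preserves order.

Note: this still does not satisfy the requirement for conditionals.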

garyelephant commented on May 18, 2024

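Requirements for the configuration file parser:

  • Key-value form

  • Plugin order is preserved

  • Conditionals

  • Field references

  • Substitution of environment variables and user-defined variables in the configuration (including how user-defined variables are referenced)

  • Global configuration

  • Error messages for configuration file errors, with their location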

garyelephant commented on May 18, 2024

The antlr4 grammar files are here:

https://github.com/InterestingLab/waterdrop/blob/garyelephant.fea.configparser/src/main/scala/org/interestinglab/waterdrop/configparser/Config.g4

https://github.com/InterestingLab/waterdrop/blob/garyelephant.fea.configparser/src/main/scala/org/interestinglab/waterdrop/configparser/BoolExpr.g4

An example configuration file:

input {
  kafka {
    brokers = ["10.11.110.35:9092", "10.11.110.36:9092", "10.11.110.37:9092"]
    topic = "accesslog"
  }
}

filter {
  split {
    # default delimiter is whitespace
    fields = ["time", "url", "http_status", "response_time", "refer", "body_size"]
  }

  if ${http_status} >= 500 {
    field {
      action = "add"
      field_name = "internal_error"
      value = 1
    }
  }

  # user defined plugin
  org.apache.mycompany.filters.pagerank {
  }
}

output {

  elasticsearch {
    hosts = ["10.11.110.45:9000", "10.11.110.46:9000"]
    index = "waterdrop-accesslog-${time}"
  }

  if ${http_status} >= 400 AND ${http_status} < 500 {
    kafka {
      brokers = ["10.11.110.35:9092", "10.11.110.36:9092", "10.11.110.37:9092"]
      topic = "user_error"
    }
  }
}

After parsing with antlr4, the following AST is generated:

[Image: AST generated from the example configuration]

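Parsing is then a matter of walking this AST with a listener/visitor built on the code antlr4 generates. The parsed configuration is converted into a typesafe config for the individual plugins to use.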
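The traversal idea can be sketched without antlr4. The example below hand-builds a small tree for the `filter` section of the example and walks it with a visitor that records plugins in source order together with the condition guarding them. The node classes (`Section`, `IfBlock`, `Plugin`) and the `PluginCollector` visitor are hypothetical stand-ins, not the classes antlr4 would generate from Config.g4, and conditions are kept as raw strings rather than parsed.

```python
from dataclasses import dataclass


@dataclass
class Plugin:
    name: str
    options: dict


@dataclass
class IfBlock:
    condition: str  # raw condition text; not parsed in this sketch
    body: list


@dataclass
class Section:
    name: str
    statements: list


class PluginCollector:
    """Walks the tree depth-first and records plugins in source order."""

    def __init__(self):
        # Each entry: (section name, plugin name, condition or None, options)
        self.plugins = []

    def visit(self, node, section=None, condition=None):
        # Dispatch on the node's class name, e.g. visit_Plugin.
        method = getattr(self, "visit_" + type(node).__name__)
        return method(node, section, condition)

    def visit_Section(self, node, section, condition):
        for stmt in node.statements:
            self.visit(stmt, node.name, condition)

    def visit_IfBlock(self, node, section, condition):
        # Assumption: nested if blocks combine their conditions with AND.
        combined = node.condition
        if condition is not None:
            combined = f"({condition}) AND ({node.condition})"
        for stmt in node.body:
            self.visit(stmt, section, combined)

    def visit_Plugin(self, node, section, condition):
        self.plugins.append((section, node.name, condition, node.options))


# Hand-built tree mirroring the filter section of the example config.
tree = Section("filter", [
    Plugin("split", {"fields": ["time", "url", "http_status",
                                "response_time", "refer", "body_size"]}),
    IfBlock("${http_status} >= 500", [
        Plugin("field", {"action": "add",
                         "field_name": "internal_error", "value": 1}),
    ]),
    Plugin("org.apache.mycompany.filters.pagerank", {}),
])

collector = PluginCollector()
collector.visit(tree)
```

After the walk, `collector.plugins` lists `split`, `field`, and the user-defined plugin in the order they appear in the file, with the `field` plugin carrying its guarding condition. Each entry's options dict is what would then be converted into a per-plugin config.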

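Status of each requirement:

  • [Supported] Key-value form

  • [Supported] Plugin order is preserved

  • [Supported] Conditionals

  • [Supported] Field references

  • [Supported, not yet implemented] Substitution of environment variables and user-defined variables in the configuration (including how user-defined variables are referenced)

  • [Supported, not yet implemented] Global configuration

  • [Needs investigation] Error messages for configuration file errors, with their location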
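As an illustration of the field-reference item, here is a minimal sketch of resolving `${name}` references in a string value such as the `index` option in the example. The `substitute` helper and the choice to raise an error on an unknown field are assumptions. The thread does not specify how references are resolved or how unknown names should be handled.

```python
import re

# Matches ${name} where name is letters, digits, or underscores.
_REF = re.compile(r"\$\{(\w+)\}")


def substitute(template, event):
    """Replace each ${field} in template with the event's value for it."""

    def replace(match):
        name = match.group(1)
        if name not in event:
            # Assumption: an unknown reference is an error, not left as-is.
            raise KeyError(f"unknown field reference: {name}")
        return str(event[name])

    return _REF.sub(replace, template)
```

For example, `substitute("waterdrop-accesslog-${time}", {"time": "2017.01.01"})` yields `"waterdrop-accesslog-2017.01.01"`. The same mechanism could be pointed at environment variables or user-defined variables by passing a different mapping.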

garyelephant commented on May 18, 2024

@RickyHuo The antlr4 option.
