korap / koral
Translation of query languages to serialized KoralQuery protocol
License: BSD 2-Clause "Simplified" License
This is moved from the old Trac Ticket #202
jbingel reported:
Currently, the foundry, layer and key constraints in spans are serialised differently than in tokens. For spans, <mate/c=np> is serialised as
{
"@type":"korap:span",
"foundry":"mate",
"layer":"c",
"key":"np"
}
whereas for tokens, [mate/p=N] is wrapped in a term:
{
"@type":"korap:token",
"wrap": {
"@type":"korap:term",
"foundry":"mate",
"layer":"p",
"key":"N"
}
}
There is a reason for this, namely that the constraints in tokens can be combined by ANDs and ORs using a korap:termGroup. That's not possible (yet?) for spans.
A reason against this approach is to minimise the specifications. A korap:span would then have a significantly lower number of attributes, as they're outsourced to korap:term, which can hold the attributes foundry/layer/key already. So the span serialisation would be like this:
{
"@type":"korap:span",
"wrap": {
"@type":"korap:term",
"foundry":"mate",
"layer":"c",
"key":"np"
}
}
Any more pros/cons before we decide whether to put this into effect?
Maybe we should really support the sort of logical connection for span key that we already have for tokens, i.e. allow something like
<cnx/c=vp | cnx/c=advp>
Then we'd definitely need a similar mechanism as in tokens, naturally using term/termGroup in a "wrap" attribute, like so:
{
"@type":"korap:span",
"wrap": {
"@type":"korap:termGroup",
"operation":"operation:or",
"operands":[ {
"@type":"korap:term",
"foundry":"cnx",
"layer":"c",
"key":"vp"
}, {
"@type":"korap:term",
"foundry":"cnx",
"layer":"c",
"key":"advp"
} ]
}
}
Possibly problematic: syntactic ambiguity between attributes (root=true) and layer/key definitions (c=NP).
Akron replied:
But effectively this is identical to <cnx/c=vp>|<cnx/c=advp>, right? So I don't see at least any benefit for Poliqarp+. Regarding spans this might work - although deserialization becomes a bit more complicated (especially with negative matches).
I don't see why requestMap and queryMap (such as in the C2 serialization) should be LinkedHashMaps. Is the order of the attributes in KoralQuery important?
The Cosmas-II query "Schiff+ahrt" is correctly interpreted as a placeholder in a wildcard, as described here. However, the placeholder + is not rewritten to ?, which is a failure.
koral:span may have key, foundry, layer, and value attributes.
In the former specification, these attributes are specified at the same level as @type.
{ "@type": "koral:span", "layer": "c", "foundry": "corenlp", "match": "match:eq", "key": "VP" }
However, according to the Koral doc, they should be included in a wrap and specified as a term.
{ "@type": "koral:span", "wrap": { "@type": "koral:term", "layer": "c", "foundry": "corenlp", "match": "match:eq", "key": "VP" } }
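The move from the flat form to the wrapped form can be sketched as a small map transformation. This is only an illustration and not Koral's actual serializer: the class name SpanWrapSketch, the method wrap and the list of attributes treated as term attributes are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Koral: moves term attributes of a flat
// koral:span into a koral:term stored under "wrap".
class SpanWrapSketch {
    // Assumption: these are the attributes that belong to the term.
    private static final List<String> TERM_KEYS =
            Arrays.asList("foundry", "layer", "key", "value", "match");

    static Map<String, Object> wrap(Map<String, Object> flatSpan) {
        Map<String, Object> term = new LinkedHashMap<>();
        term.put("@type", "koral:term");
        Map<String, Object> span = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : flatSpan.entrySet()) {
            if (TERM_KEYS.contains(e.getKey())) {
                term.put(e.getKey(), e.getValue());   // goes into the term
            } else {
                span.put(e.getKey(), e.getValue());   // stays on the span
            }
        }
        // Only add a wrap if at least one term attribute was present.
        if (term.size() > 1) {
            span.put("wrap", term);
        }
        return span;
    }

    public static void main(String[] args) {
        Map<String, Object> flat = new LinkedHashMap<>();
        flat.put("@type", "koral:span");
        flat.put("layer", "c");
        flat.put("foundry", "corenlp");
        flat.put("match", "match:eq");
        flat.put("key", "VP");
        System.out.println(wrap(flat));
    }
}
```

A consumer that still receives the flat form could apply such a step before deserialization, so that only the wrapped form has to be handled afterwards.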
Regular expressions should be possible for foundry/layer values in Annis queries, e.g. tt/l=/A.*/, which is similar to the Poliqarp query [tt/p="A.*"]. Thus, the serialization should also be the same.
"query": { "@type": "koral:token", "wrap": { "@type": "koral:term", "foundry": "tt", "key": "A.*", "layer": "p", "match": "match:eq", "type": "type:regex" } }
Cosmas II now supports a wide range of operators to find Composites. I don't know if we can support them directly, as this information needs to be part of the index (so probably restricted to Glemm).
Dominance with type operator should be serialized into a relation (dominance) with attribute (the specified type). Currently it is serialized as an AND relation between the dominance and the type. For example, the dominance operator in this query
corenlp/c="VP" & corenlp/c="NP" & #1 >[malt/d="PP"] #2
is serialized into
"operation": "operation:relation", "relation": { "@type": "koral:relation", "wrap": { "@type": "koral:termGroup", "operands": [ { "@type": "koral:term", "foundry": "malt", "key": "PP", "layer": "d", "match": "match:eq" }, { "@type": "koral:term", "layer": "c" } ], "relation": "relation:and" } }
The Annis lemma keyword is not supported yet.
Report types in the Koral specification are not very elaborate. There is no good explanation of how to use them and which types of operations are supported.
Kustvakt currently uses:
operation:injection
operation:deletion
operation:override
operation:insertion
In addition, Kalamar also accepts operation:modification.
We should discuss and define the different rewrite operations here and how they should be integrated in KoralQuery.
The scope was originally introduced for deleted items, but is used for other types and comments now. It should be defined what is meant to be in scope.
This is a reaction to KorAP/Kalamar#51.
Currently we use the term collection to describe the construction query of a virtual corpus - for historical reasons, when we used the term "virtual collection". I think it's better to rename collection to corpus, as a collection is a very general term to describe a set (so it could also name a collection of matches, a collection of spans etc.).
The command line implementation seems to be disabled at the moment.
The command
java -jar target/Koral-0.25.jar "der alte" "Poliqarp"
results in nothing.
Currently Koral transforms sequences with intermediate any tokens into distances, like der [] Mann into distance(+1, der, Mann). This works, and is fine with Krill. Unfortunately Koral does the same for distances with optional anchors, like der [] Mann?. The problem here is that with a distance, the meaning is altered. Without a distance, the query means: "Find a sequence of two tokens, with the first being 'der', followed by 'Mann'". With a distance the query means "Find a span with a single token distance between 'der' and 'Mann', where 'Mann' is optional". I think, in case of optionality on one or both sides, the any token shouldn't be rewritten to a distance.
Some annotations have associated confidence values (e.g. Treetagger). Krill supports attached confidence values for terms, spans and relations encoded as a byte (values between 0 and 255). There should be a mechanism in KoralQuery to constrain matches to minimum confidence values. This constraint should also be expressible in Poliqarp+.
Proposal:
[cnx/p=NN@!>=70%]
This would constrain matches of cnx/p=NN to all terms with a confidence of at least 70%. The @ symbol introduces attributes to terms/spans/relations, normally written as key-value pairs. The ! would mark a special attribute name for confidence. This may only be a shortcut for a more elaborate attribute name.
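Since the confidence is stored as a byte but the proposal expresses it as a percentage, some conversion would be needed. The following is a minimal sketch under the assumption that the byte range 0 to 255 maps linearly onto 0% to 100% and that the threshold is rounded up; neither the mapping nor the names ConfidenceSketch and toByteThreshold come from Krill or Koral.

```java
// Hypothetical helper: converts a percentage threshold such as the 70 in
// [cnx/p=NN@!>=70%] into a byte-encoded minimum confidence.
class ConfidenceSketch {
    static int toByteThreshold(int percent) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be in 0..100");
        }
        // Assumption: linear mapping of 0..100% onto 0..255, rounded up so
        // that a stored value below the requested percentage never passes.
        return (int) Math.ceil(percent * 255 / 100.0);
    }

    public static void main(String[] args) {
        System.out.println(toByteThreshold(70));
    }
}
```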
Currently Koral will always create KoralQuery objects with "collection" : {} in case no collection was set. This is not a valid collection object and will make all queries of that type fail.
Another problem is empty warnings, errors and messages. They are not wrong, but annoying in case no error, warning or message was set.
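One way to avoid both problems would be to drop these entries before serialization when they are empty. The sketch below works on a plain map and is not Koral's actual code; the class name EmptyFieldPruner and the choice of the four top-level keys are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: removes empty "collection", "errors", "warnings" and
// "messages" entries from the top level of a KoralQuery-like map.
class EmptyFieldPruner {
    private static final List<String> OPTIONAL_KEYS =
            Arrays.asList("collection", "errors", "warnings", "messages");

    static Map<String, Object> prune(Map<String, Object> koralQuery) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : koralQuery.entrySet()) {
            Object v = e.getValue();
            boolean empty =
                    (v instanceof Map && ((Map<?, ?>) v).isEmpty())
                    || (v instanceof Collection && ((Collection<?>) v).isEmpty());
            // Keep everything except optional keys that carry no content.
            if (!(OPTIONAL_KEYS.contains(e.getKey()) && empty)) {
                result.put(e.getKey(), v);
            }
        }
        return result;
    }
}
```

The input map is left untouched, so the step can be applied right before the JSON string is written.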
Often Koral blames the user for making a mistake, while the query is just confusing but not wrong. Here we should collect queries that are failing and that fail in a weird way:
Frage/1 in Poliqarp
The MORPH() operator in Cosmas-II supports negation, to exclude certain morphosyntactical annotations from the result set, like MORPH(mate/p=ADV).
Currently this throws a parsing error in Koral.
wegen #IN <s>
is serialized into:
{ "@context": "http://korap.ids-mannheim.de/ns/koral/0.3/context.jsonld", "query": { "classRefCheck": ["classRefCheck:includes"], "operation": "operation:class", "operands": [{ "operation": "operation:position", "frames": [], "operands": [ { "operation": "operation:class", "operands": [{ "wrap": { "@type": "koral:term", "layer": "orth", "match": "match:eq", "key": "wegen" }, "@type": "koral:token" }], "@type": "koral:group", "classOut": 130 }, { "operation": "operation:class", "operands": [{ "wrap": { "@type": "koral:term", "key": "s" }, "@type": "koral:span" }], "@type": "koral:group", "classOut": 129 } ], "@type": "koral:group" }], "@type": "koral:group", "classOut": 131, "classIn": [ 129, 130 ] } }
operation:position with empty frames does not really make sense. This should possibly be reduced to operation:class only, but it does not allow multiple operands.
Besides, wegen #IN(%) <s> should be serialized with "classRefCheck": ["classRefCheck:disjoint"].
While implementing a draft for the exclude option in operation:position, I came to suspect that the Koral description "If true, negate positional relations." is not correct. In fact, in case someone searches for, e.g., containsNot(<dereko/s=s>, [orth=Baum]), I would expect that this is not only a flip of the frames, meaning it should find a <dereko/s=s> that contains something that is not [orth=Baum], but that it should find a <dereko/s=s> that does not contain [orth=Baum].
In case we agree on that interpretation, I would say this should be a different operation, operation:exclusion, also with frames, but with the result being the span of the first operand only.
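The intended interpretation can be illustrated on plain token offsets. This is a rough sketch of the semantics only, not of how Krill would evaluate it: the Span class, the half-open offsets and the method name containsNot are assumptions for this example, and only the "contains" frame is considered.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the proposed exclusion: return every span of the
// first operand that does not contain any match of the second operand.
class ExclusionSketch {
    /** Half-open token span [start, end). */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }

        boolean contains(Span other) {
            return start <= other.start && other.end <= end;
        }
    }

    static List<Span> containsNot(List<Span> first, List<Span> second) {
        List<Span> result = new ArrayList<>();
        for (Span outer : first) {
            boolean hit = false;
            for (Span inner : second) {
                if (outer.contains(inner)) {
                    hit = true;
                    break;
                }
            }
            // The result is the span of the first operand only.
            if (!hit) {
                result.add(outer);
            }
        }
        return result;
    }
}
```

With two sentences at [0, 5) and [5, 9) and a single Baum token at [6, 7), only the first sentence is returned, which is the reading argued for above.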
To make working with defined VCs (as implemented by @margaretha) more flexible, I propose an additional corpus/collection object type koral:docGroupRef, that references a defined VC in the corpus/collection query.
{
"@type" : "koral:docGroupRef",
"ref" : "https://korap.ids-mannheim.de/@ndiewald/MyCorpus"
}
The reference is a unique ID that will be resolved by the KoralQuery consumer (in our case Kustvakt) - by injecting the requested KoralQuery fragment in the corpus definition (or by injecting a stored ID, in case the VC is persistent in the backend as a list of text IDs, see our discussion on dynamic vs. persistent VCs). Therefore defined VCs can also be part of more complex VCs.
In the frontend's VC builder the reference can be shown and created like "referTo @ndiewald/MyCorpus" along with a ... symbol (in addition to x, and and or). When the user clicks on ... the reference is resolved and shown in the frontend.
In the Koral VC language we would need to introduce a syntax like referTo {URL}.
Update: Removed desc and type from the example, and replaced @ref with ref after comments by @margaretha.
Currently #ELEM(base/s=s) is serialized as
"query": {
"@type": "koral:span",
"attr": {
"@type": "koral:term",
"foundry": "base",
"key": "s",
"layer": "s",
"match": "match:eq"
}
}
instead of
"query": {
"@type": "koral:span",
"wrap": {
"@type": "koral:term",
"foundry": "base",
"key": "s",
"layer": "s"
}
}
That's pretty bad, and means all #ELEM() queries fail.
This issue was moved from Trac ticket #204
Akron reported:
Elena pointed me to the use of variables in Poliqarp, which I wasn't aware of. It's not really easy to support them, but we could try, in case we find some time ... ;)
This has no high priority - I think we are quite busy with our current tasks - but it should be documented here.
Variables work pretty much like captures in PCRE, and are used for things like agreement.
Specification of grammatical classes and grammatical categories may contain variables (having the form $n, where n is a single digit), whose values will be set only during execution of the query. For example, the following query for an adjective and a following noun agreeing in case:
[case=nom & pos=adj] [case=nom & pos=subst] | [case=gen & pos=adj] [case=gen & pos=subst] |
[case=dat & pos=adj] [case=dat & pos=subst] | [case=acc & pos=adj] [case=acc & pos=subst] |
[case=inst & pos=adj] [case=inst & pos=subst] | [case=loc & pos=adj] [case=loc & pos=subst] |
[case=voc & pos=adj] [case=voc & pos=subst]
can be simplified to:
[case=$1 & pos=adj] [case=$1 & pos=subst]
Source: (http://nkjp.pl/poliqarp/help/ense3.html#x4-90003.4)
How could we achieve this? Here's a simple idea:
We serialize the case to a korap:term with "type":"type:reference" (as this works pretty much like class references - but not referring to the same span but to the same surface). The term will have a number for key.
In the Lucene index we would need a SpanCapturedTermQuery (or a regex query in a way) that searches for "foundry/layer=case:/.+?/" and captures the term as a variable in a computed payload (like 1#case:acc - however, we have to keep in mind that we need to have a minimal payload length here to not interfere with other payloads). Now - we need "somewhere"(tm) a wrapper around the query, checking that there are no terms set in the payloads that are contradictory, i.e. if we have a reference term "1#case:acc" and another reference "1#case:acc" everything is fine - but the span is no match if we have a reference "1#case:acc" and a reference "1#case:dat".
The wrapper could wrap the query as a whole (cheap solution) or could be a bit more clever - wrapped around the subquery that contains all references, e.g. in "contains(<s>, [case=$1][case=$2])" the SpanTermReferenceCheck could be wrapped around the sequence, so the results could be filtered before the costly SpanWithinQuery would have to deal with it.
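The contradiction check itself is simple once the captured terms are available. The following sketch works on payloads already decoded to strings of the form "1#case:acc", as in the examples above; the class name ReferenceCheckSketch and the string representation are assumptions for illustration and say nothing about how Lucene payloads would actually be encoded.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical check: a candidate match is kept only if every variable
// (the number before '#') is bound to a single captured term.
class ReferenceCheckSketch {
    static boolean isConsistent(List<String> payloads) {
        Map<String, String> bound = new HashMap<>();
        for (String payload : payloads) {
            int sep = payload.indexOf('#');
            if (sep < 0) {
                continue; // not a reference payload, ignore it
            }
            String variable = payload.substring(0, sep);
            String term = payload.substring(sep + 1);
            String previous = bound.putIfAbsent(variable, term);
            if (previous != null && !previous.equals(term)) {
                return false; // same variable, contradictory terms
            }
        }
        return true;
    }
}
```

For the negated case discussed in this ticket ([case=$1][case!=$1]) this check is not sufficient, since it only models positive equality.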
What do you think?
jbingel replied:
Interesting, I wasn't aware of this either. It's not in the official Poliqarp documentation.
This strongly resembles the "equalvalue" (==) operator in Annis. This was introduced in a fairly recent version of the language (3.1.6) and we decided not to support it because of its recency and technical challenge. Now, I wanted to read about it again, but the detailed explanation that was given in the Annis 3.1.6 documentation disappeared in the current 3.2.2 documentation. Oddly, the 3.1.6 documentation has been taken off the net and is not included in the release. Aargh!
Anyway, given Poliqarp variables and AQL "(not)equalvalue" have the same behaviour, it might be worthwhile to support them. Nils' suggestion sounds relatively easy on the serialisation level, so I'm up for it. We'd just make sure to be able to express negative equality, too (e.g. [case=$1][case!=$1]), but that could probably be done in the usual way, i.e. using "match:ne", right?
Akron replied:
Damn - negation! I didn't think about that ... my solution doesn't cover that. I would have to think about it further. But for CoralQuery I guess it's fine to use match:ne in that case.
Please report your Annis findings here in the ticket. I can't remember it correctly.
The error codes (in the 3xx range) should be documented and localizable. There is a status code map (de.ids_mannheim.korap.query.serialize.util), but for every error there is a different error message. Error messages should be identical for each error code and further information (character position etc.) should be added in the KoralQuery error messages in a position > 2.
Currently #ELEM(base/s=s) serializes to
{
"attr": {
"layer": "s",
"match": "match:eq",
"foundry": "base",
"key": "s",
"@type": "koral:term"
},
"@type": "koral:span"
}
I think base/s=s needs to be the span definition.
The #reg() operator was newly introduced to Cosmas2 and is described in the documentation.
At the moment, alignment is only defined by a list of class numbers, which may not be enough. For example
baum ^ {1:test}?
isn't serializable at the moment, as the first class may not be part of the result. This should be fixed by having a more robust serialization mechanism.
This issue is moved from Trac (old number is #214)
Currently group operations are passed as @id to the operation key of a koral:group type. That makes koral:group a huge spec. It would be better, I guess, to have operations defined as parametric objects. Consumers could simply be backwards compatible by checking whether an object is passed instead of a string. Unfortunately there is no way to keep Koral backwards compatible.
This would be a huge improvement for the spec descriptions as well.
Pointing relations are incorrectly interpreted to work in the following manner:
node & node & #2 ->antecedent[malt/d="PP"] #1
The serialization ignores the relation type (antecedent) and defines the annotation from the specified relation label.
A correct typed pointing relation query is for example
pos=/P.*/ & pos=/V.FIN/ & #2 ->dep[func="sbj"] #1
where dep is a relation type and [func="sbj"] is an additional label similar to that in the dominance operator.
To be more detailed, foundry and layer should also be defined in the type.
pos=/P.*/ & pos=/V.FIN/ & #2 ->malt/d="PP"[func="sbj"] #1
Suggestion for the serialization:
"operation": "operation:relation", "relation": { "@type": "koral:relation", "wrap": { "@type": "koral:term", "foundry": "malt", "key": "PP", "layer": "d", "match": "match:eq" }, "attr": { "@type": "koral:term", "key": "func:sbj", "match": "match:eq" } }
Currently, an attr object is not allowed in koral:relation, and it should probably have the same foundry as the relation.
Theoretically layer and foundry can also be specified for attr in KoralQuery. But it is kind of awkward in AQL since it is supposed to be a type-value pair and not a term with annotation.
pos=/P.*/ & pos=/V.FIN/ & #2 ->malt/d="PP"[malt/d=func="sbj"] #1
Currently strings cannot be passed to Koral when they contain special characters.
So [orth=http://spiegel.de] does not work and [orth="http://spiegel.de"] only kind of works - as it translates the string to a regular expression. I would propose [orth='http://spiegel.de'] to be a way to pass verbatim string values. Currently this construct translates to regular expressions as well, but I don't think this is necessary.
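Why the regular-expression reading only "kind of works" can be shown with a small comparison. The sketch uses java.util.regex purely for illustration; it is not the regex engine used by Koral or Krill, and the class and method names are made up for this example.

```java
import java.util.regex.Pattern;

// Illustration only: the same query value read as a regular expression
// versus read verbatim.
class VerbatimSketch {
    // Regex reading: the dot in "spiegel.de" matches any character.
    static boolean matchesAsRegex(String queryValue, String token) {
        return Pattern.matches(queryValue, token);
    }

    // Verbatim reading: Pattern.quote turns the value into a literal.
    static boolean matchesVerbatim(String queryValue, String token) {
        return Pattern.matches(Pattern.quote(queryValue), token);
    }

    public static void main(String[] args) {
        String value = "http://spiegel.de";
        System.out.println(matchesAsRegex(value, "http://spiegelXde"));
        System.out.println(matchesVerbatim(value, "http://spiegelXde"));
    }
}
```

Under the regex reading the value also matches unintended tokens such as http://spiegelXde, which a verbatim string value would avoid.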
Due to the support of value flags like /i and /x, these characters (and their capital counterparts) are not valid layer names in Poliqarp, e.g. [mate/i=Baum] throws a parsing error, while [mate/b=Baum] is fine.
I created a branch escape-regex including a failing test regarding regexes in Poliqarp containing escaped symbols.
The PQ+ query "a\." is interpreted as . which is wrong, I guess.
In Annis, it is possible to specify a regex as the annotation value of a relation, e.g.
node ->malt/d[func=/D.*/] node
Failed tests: testCollectionQueryDuplicateThrowsAssertionException(de.ids_mannheim.korap.query.serialize.CollectionQueryDuplicateTest)
Whenever QuerySerializer.toJson() is called twice in a row, the private merge collection merges the original requestMap collection segment and the result from the collection query processor. The second call however leads to the nodes being already identical. They are then wrongly merged. Failing test is in dupColl branch.
Currently, koral:doc fields are type:string by default. Strings support the operations eq, ne, contains and containsnot. However - the meaning of contains and containsnot differs depending on the underlying field in the database: In text fields, contains and containsnot are now treated as a phrase query with tokens. In strings we thought of treating them as a regular expression with dot-star-circumfix. The problems:
a) The frontend may know if a field is a string or a text field, but Koral doesn't, so in the KoralQuery serialization of the corpus query, this needs to be unspecified.
b) The backend needs to check if a field is stored as a string or as a text prior to formulating the query, which is - at least - inelegant.
c) contains and containsnot work totally differently depending on the field implementation in the backend, which is bad when the user is not aware of it (because there is no difference in the frontend for these field types).
d) contains and containsnot are redundant for strings when regular expressions are supported.
I propose to introduce a text type for koral:doc that supports contains and containsnot. The string type will no longer support contains and containsnot. Unspecified koral:doc fields are treated as type:text by default. The description in the Koral documentation needs to be rephrased to talk about "subsequences" instead of "substrings".
This is a rather complicated design problem so I would like to hear your ideas.
Cosmas II grammars are written in the Antlr 3 language, which is quite different from Antlr 4. The tree outputs would probably also be different.
Other QLs (Poliqarp, Annis, FCSQL) use Antlr 4. Thus, we use both Antlr 3 and 4 libraries, which fortunately do not conflict with each other.
Dominance is serialized as a relation with the layer c and without a key, that is not a valid koral:term object.
Suggestion:
node & node & #2 > #1
node & node & #2 ->dominance #1
could be serialized identically as a relation with key dominance.
"operation": "operation:relation", "relation": { "@type": "koral:relation", "wrap": { "@type": "koral:term", "key": "dominance" } }
Here I would propose a new improved error format, that is more in line with KoralQuery report types.
{
"msg" : [{
"@type" : "koral:msg",
"type" : "message:warning",
"src" : "Krill",
"value" : "No query given",
"code" : 700,
"param": []
}]
}
The benefit of such a scheme would be:
When using regular expressions in corpus queries, some characters are failing to parse. I think, this is similar to the behaviour in Poliqarp, where regex-parsing was fixed by reading the characters verbatim instead of parsing.
I added a failing test in 9de87c2 (branch fix-vc-regex).
Currently Koral uses both operation:insertion and operation:injection to describe the same kind of rewrite. Although it's not yet formally specified in the KoralQuery doc, I would use only one term. In addition to operation:injection, operation:modification should be used for modifications in the KoralQuery.
The current context should be:
http://korap.ids-mannheim.de/ns/koral/0.3/
The argument MAX in C2 groups together matches where multiple hits occur within the same span. For instance, if Y contains X1, X2 and X3, then X #IN(N,MAX) Y returns only 1 match.
Such grouping can be achieved with operation:merge with only one operand. This operation (in Krill) should check whether the span still has the same start and end position and collect the payloads containing classes of the hits (Xs). Or is any of the classRefOp able to do this?
Serialization of wegen #IN(N,MAX) <s>
would be
{ "@context": "http://korap.ids-mannheim.de/ns/koral/0.3/context.jsonld", "query": { "operation": "operation:merge", "operands": [{ "operation": "operation:position", "frames": [ "frames:isWithin", "frames:matches" ], "operands": [ { "operation": "operation:class", "@type": "koral:group", "classOut": 1, "operands": [{ "wrap": { "@type": "koral:term", "layer": "orth", "match": "match:eq", "key": "wegen" }, "@type": "koral:token" }] }, { "wrap": { "@type": "koral:term", "key": "s" }, "@type": "koral:span" } ], "@type": "koral:group" }], "@type": "koral:group" } }
Cosmas II now supports Quantifiers for morph(). As this is a feature similar to Poliqarp+ it's rather trivial to support, I guess (already part of KoralQuery). But it has to be adopted by Koral.
There seems to be a bug in Koral, and it's illustrated with the query that has sometimes been used as an example: "so" & "nicht" & #1 .0,6 #2 -- the minimal distance can't be zero, so the query should read "so" & "nicht" & #1 .1,6 #2, with the interpretation that it has when you currently run it with a "0".
(It might be that the indirect precedence operator .* suggests that "0" would be OK here, but it's apparently meant to associate with the wildcard (rather than the Kleene star). If you say "confusing" and point at the Kleene star that Annis uses in regexes, I'll have to agree...)
The query []{3} should generate a koral:group:
{
"@type": "koral:group",
"operation": "operation:repetition",
"operands": [{ "@type": "koral:token" }],
"boundary": { "@type": "koral:boundary", "min": 3, "max": 3 }
}
Currently, it only generates "query":{"@type":"koral:token"}
See de.ids_mannheim.korap.query.serialize.PoliqarpPlusQueryProcessorTest.testEmptyTokens()
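The expected structure can be built with plain maps, as sketched below. This is not the code of PoliqarpPlusQueryProcessor; the class RepetitionSketch and its method repeat are hypothetical names used only to show the shape of the group that the empty token with a repetition should produce.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical builder for the repetition group expected for []{3}.
class RepetitionSketch {
    static Map<String, Object> repeat(Map<String, Object> operand, int min, int max) {
        Map<String, Object> boundary = new LinkedHashMap<>();
        boundary.put("@type", "koral:boundary");
        boundary.put("min", min);
        boundary.put("max", max);

        Map<String, Object> group = new LinkedHashMap<>();
        group.put("@type", "koral:group");
        group.put("operation", "operation:repetition");
        // The empty token stays the single operand instead of replacing the group.
        group.put("operands", Collections.singletonList(operand));
        group.put("boundary", boundary);
        return group;
    }

    public static void main(String[] args) {
        Map<String, Object> emptyToken = new LinkedHashMap<>();
        emptyToken.put("@type", "koral:token");
        System.out.println(repeat(emptyToken, 3, 3));
    }
}
```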
In collections, sometimes warnings are raised by the assumption that a value is a date. This is sometimes completely confusing (see below) and sometimes wrong, as document identifiers may look like dates.
Failing example test:
@Test
public void testNotDate() throws JsonProcessingException, IOException {
collection = "author=\"firefighter1974\"";
qs.setQuery(query, ql);
qs.setCollection(collection);
res = mapper.readTree(qs.toJSON());
assertEquals("koral:doc", res.at("/collection/@type").asText());
assertEquals("author", res.at("/collection/key").asText());
assertEquals("firefighter1974", res.at("/collection/value").asText());
assertEquals("match:eq", res.at("/collection/match").asText());
assertEquals(res.at("/errors/0/0").asText(), "");
assertEquals(res.at("/warnings/0/0").asText(), "");
}
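One possible way to avoid such false positives is to treat a value as a date only when the whole value has a date shape, rather than when it merely contains digits. The sketch below assumes the shapes YYYY, YYYY-MM and YYYY-MM-DD; both this assumption and the class name DateGuessSketch are made up for the example and are not taken from Koral's collection processor.

```java
import java.util.regex.Pattern;

// Hypothetical check: only a value that is a date in its entirety is
// treated as a date, so "firefighter1974" is left alone.
class DateGuessSketch {
    // Assumption: accepted shapes are YYYY, YYYY-MM and YYYY-MM-DD.
    private static final Pattern DATE =
            Pattern.compile("\\d{4}(-\\d{2}(-\\d{2})?)?");

    static boolean looksLikeDate(String value) {
        // matches() requires the complete value to fit the pattern.
        return DATE.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeDate("firefighter1974"));
        System.out.println(looksLikeDate("2014-05-12"));
    }
}
```

A check like this would still classify identifiers that consist only of a date-like string as dates, so the second problem mentioned above (document identifiers that look like dates) would need additional context, such as the field name.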
The MIN and MAX attributes of C2-QL allow for grouping of matches that occur in the same context. After discussions and clarifications by Franck, we thought about different ways to serialize that. We agreed that we will need classes to point to the context relevant for the merge. So we may want to stick to operation:merge as a group operation wrapping the query and a classIn for the relevant context.
@margaretha mentioned the problem that sometimes the context may change (for distance operations with any order, if I understand that correctly). How can we deal with that?
When parsing a query, foundry and layer information is found and serialized to KoralQuery. To avoid reparsing to get this information, a separate array containing all foundries/layers requested in the query should be part of KoralQuery.
This information is useful for at least two features:
This information may be listed in the meta section of KoralQuery.
Failed
so /+w1 nicht
http://korap.ids-mannheim.de/kalamar?q=so+%2F%2Bw1+nicht&collection-name=&collection=&ql=cosmas2&cutoff=1
"so" /+w1 nicht
http://korap.ids-mannheim.de/kalamar?q=%22so%22+%2F%2Bw1+nicht&collection-name=&collection=&ql=cosmas2&cutoff=1
Works:
"so" /+w1 "nicht"
http://korap.ids-mannheim.de/instance/wiki?q=%22so%22+%2F%2Bw5+%22nicht%22&collection-name=&collection=&ql=cosmas2&cutoff=1
so /+w1 "nicht"
http://korap.ids-mannheim.de/kalamar?q=so+%2F%2Bw1+%22nicht%22&collection-name=&collection=&ql=cosmas2&cutoff=1
I don't think "match:containsnot" fits into the wording scheme of KoralQuery, that's why I propose "match:without" instead.
There seems to be a massive whitespace character handling problem in the collection builder, so whitespaces are removed even in collection constraint values. This leads to massive problems with, e.g. Author names or similar complex data types.
The failing test is in https://github.com/KorAP/Koral/tree/specialcharacterfix branch.