This repository holds the glsl and glsl-quasiquote projects. Feel free to visit each project for more information.
GLSL parser for Rust
Removed FIXME
The removed comment said:
// FIXME: the three fields are wrong. It’s not possible to have the last two if the second one
// is not Some(_) – see page 197 of the GLSLangSpec.4.50.pdf document.
Currently, the reported errors are weak and useless. We need a way to locate them and have more information.
Hint: have a look at the verbose-errors compilation feature. It seems to be the fastest way to achieve what we need.
The following shader:
layout(set = 0, binding = 0) buffer Foo {
char a;
} foo;
Yields this:
Done(
  [],
  [
    Declaration(
      Block(
        TypeQualifier {
          qualifiers: [
            Layout(
              LayoutQualifier {
                ids: [
                  Identifier(
                    "set",
                    Some(
                      Comma(
                        IntConst(0),
                        Assignment(
                          Variable("binding"),
                          Equal,
                          IntConst(0)
                        )
                      )
                    )
                  )
                ]
              }
            ),
            Storage(Buffer)
          ]
        },
        "Foo",
        [
          StructFieldSpecifier {
            ty: TypeName("char"),
            identifiers: ["a"]
          }
        ],
        Some(("foo", None))
      )
    )
  ]
)
I'm not sure what the actual output is supposed to be, but it seems weird that binding is inside the block of set = 0. There might be a bug here.
Currently, the implementation is very naive. As I was writing tests to test the whole thing, I came across a nasty issue: it just doesn’t work. It might be very inefficient, also. I need a state machine.
The idea of that state machine is to have “cells” that represents the current context. That context states what was parsed and what is to be expected then. For instance:
foo[3]
There are several ways to parse that. The idea is that the parser should go from left to right. Hence, in the context of expecting an expression, we need to check the possible forms in a fixed order.
The current implementation will try to match all the forms of expressions, while we could start off by trying to match a prefix operator. If it fails, we try an infix form. If that fails, we abort. If it succeeds, we try the suffix operator. Etc.
According to @Geal, it’d be a good idea to use an FSM over small nom parsers, and I think it’s kinda the same idea – so good point there.
I currently disabled them from being used in the grammar because they cause the function-identifier and expression rules to become left-recursive.
void main(void){
return;
}
parses to
Ok(
  TranslationUnit(
    NonEmpty([
      FunctionDefinition(
        FunctionDefinition {
          prototype: FunctionPrototype {
            ty: FullySpecifiedType {
              qualifier: None,
              ty: TypeSpecifier {
                ty: Void,
                array_specifier: None,
              },
            },
            name: Identifier("main"),
            parameters: [
              Unnamed(
                None,
                TypeSpecifier {
                  ty: Void,
                  array_specifier: None,
                },
              ),
            ],
          },
          statement: CompoundStatement {
            statement_list: [
              Simple(
                Declaration(
                  InitDeclaratorList(
                    InitDeclaratorList {
                      head: SingleDeclaration {
                        ty: FullySpecifiedType {
                          qualifier: None,
                          ty: TypeSpecifier {
                            ty: TypeName(TypeName("return")),
                            array_specifier: None,
                          },
                        },
                        name: None,
                        array_specifier: None,
                        initializer: None,
                      },
                      tail: [],
                    },
                  ),
                ),
              ),
            ],
          },
        },
      ),
    ]),
  ),
)
Instead of a Declaration, the return should parse as a Return.
I was just interested to know why you are not using the nom::IResult type directly and prefer a custom version of it, named ParseResult?
Putting #version 450 as the first line of the shader makes the parser return Error(Many1).
The current implementation uses primary_expr for the left part of the . (dot) operator. This is wrong. It should be postfix_expr.
Currently, Declaration::Block is a long variant with several arguments. This is boring; we need a struct for that.
This is needed to do semantic analysis and translation to SPIR-V.
Do you have any thoughts on how you'd like to represent it?
glslang builds the symbol table during parsing and refers to it in the resulting parse tree. That has the advantage of not needing to represent both an unresolved syntax tree and a syntax tree with resolved symbols.
e.g.
float f()
{
return 1e-6;
}
gives:
float f()
^
expected ';', found f
1: at line 3, in Alt:
float f()
^
2: at line 0, in Many1:
^
Now that I have at least one GLSL writer, I can check a very interesting property:
parse(show(ast)) == ast
This property is very interesting, because if we can generate random ASTs, we can have automatic testing for free. However, I don’t have random generation of AST yet. Some work must be done there.
It’s rare that I do that, but I think I will incorporate this change in a minor patch, for a single reason: the sole change from GLSL450 is that the compiler now accepts extra semicolons at global scope.
Accepting GLSL450 sources with that change is, to me, not a problem, and if it becomes one for anyone, I will add a feature flag to protect against it. But I really doubt it will ever be needed. From what I understand, the change is there to allow starting a shader with ; (which sounds completely weird).
Also, the changelog from Khronos shows that it was reported from a Private Bug. I have no idea what that means, but whatever.
I do this so that we can get going on with rust-gamedev/wg#23.
That type would ensure the grammar is respected.
I want to join existing partial GLSL code with code that's built programmatically.
let existing_code = CompoundStatement::parse(
"
r = t;
f = h;
",
)
.unwrap();
I'm then adding it to an existing CompoundStatement that has had everything else built programmatically.
compound
.statement_list
.extend(&existing_code.statement_list);
let external_declaration = ExternalDeclaration::new_fn(
TypeSpecifierNonArray::Void,
"main",
Vec::new(),
compound.statement_list,
);
let translation_unit = TranslationUnit::from_iter(vec![external_declaration]).unwrap();
The issue is that CompoundStatement::parse() fails unless the string is surrounded with {}.
Maybe CompoundStatement is the wrong type, in which case could you direct me to the correct type for parsing existing code that could be within a function (e.g. no declarations)?
I tested Statement and Expr, but those seem to be for one-liners, and I'd have to separate each line into its own string before parsing?
The following code will trigger a parse error on the current release / HEAD:
vec3[3] verts = vec3[]( /* whatever */ );
Here, the vec3[](…) is a function call, and the function identifier must be vec3[]. Currently, it’ll fail to parse.
This is not easy to fix because the function call parser is already defined in the postfix expression parser as an alternative. We have to be smart to fix that.
The AST does not currently contain line numbers. It would be handy to have for reporting semantic analysis problems.
This will help with both developing and error reporting.
CI could git clone the piglit project, and have a little parser that runs over glslparser and shader_test files (other than compile-failure ones) and runs them through our parser.
The following does not give an error when parsing when it should:
int fetch_transform(int id)
{
return id;
}
bool ray_plane()
{
if 1 {
}
Instead it gives back a syntax tree containing only the first function.
I'm guessing this is just confusion on my part:
When trying to parse a shader with a #define declaration, the parser exits with an error.
extern crate glsl;
const SOURCE: &str = r#"
#define X 1
void main() {
}
"#;
use glsl::parser::Parse;
fn main() {
let res = glsl::syntax::TranslationUnit::parse_str(SOURCE);
println!("{:?}", res);
}
Results in
Err(ParseError { kind: Many1, info: "" })
Documentation on Preprocessor says #define is supported by substitution. The implementation seems to disagree. Am I missing a step, or am I misinterpreting the docs?
It’s currently pub for convenience when generating the documentation, as I’m experimenting around with pest.
The current implementation uses identifiers to represent those type names, while the spec uses TYPE_NAME without even defining what the heck it is.
Some benchmarks are needed to ensure what the problem is, but I’m pretty sure (given the current nom-3
implementation) that we have a lot of failures and retries.
The current ExternalDeclaration::new_fn
doesn’t do that check yet.
In order to fix this, #47 must be considered first.
I ask this question knowing that it's beyond the scope of this crate. It just seems like the best to ask in case someone else is wondering the same thing.
Is there a project to use this crate to generate type-safe bindings for communication with OpenGL shaders? If not, have you thought about if that is feasible/what that might look like?
I ask this because of your work on luminance-derive and on this crate. Thanks in advance!
The current layout produces poor documentation: the module parser.rs holds both the external interface and the nom rules, which is pretty cumbersome to read.
A solution would be to separate the external interface from the nom rules. The question is: what is the external interface? Would the following be enough?
pub fn parse(source: &[u8]) -> ParseResult<TranslationUnit>
We want to support the following syntax:
#define FOO(x, y) (x + y)
quasiquote's tokenize_block produces a block containing fields: glsl::syntax::NonEmpty(vec![fields]), where the base glsl crate expects just fields: vec![fields], resulting in an error:
| |_________expected struct `std::vec::Vec`, found struct `glsl::syntax::NonEmpty`
| in this macro invocation
|
= note: expected type `std::vec::Vec<glsl::syntax::StructFieldSpecifier>`
found type `glsl::syntax::NonEmpty<glsl::syntax::StructFieldSpecifier>`
Happy to send a patch to correct this, but I'm not clear which is the desired structure - should we be adding a NonEmpty node to quasiquote, or removing it from glsl?
Would be nice to support the following too:
#define FOO( x, y ) ( x + y )
(Mind the spaces inside the parentheses!)
I implemented the first, naive (yet fully working) GLSL writer in less than 12 hours. I think it’s worth it to write a SPIR-V writer as well, and it shouldn’t take too much time.
It’s not my own priority right now (because I don’t use Vulkan nor GL4.6 yet), but if someone provides me with a fully working patch, I’ll accept it for sure.
Any input with #if or #ifdef fails to parse with an error of ErrorKind Custom(0).
e.g.:
use glsl::parser::{Parse, ParseResult};
use glsl::syntax::TranslationUnit;
fn main() {
let fs = "#define USE_GLOBAL_COLOR 1
uniform vec4 color;
out vec4 out_color;
void main() {
#if USE_GLOBAL_COLOR
out_color = color;
#else
out_color = vec4(1., 0., 0., 1.);
#endif
}";
let parsed = match TranslationUnit::parse_str(fs) {
    ParseResult::Ok(parsed) => parsed,
    ParseResult::Incomplete(_needed) => panic!("More data needed to parse shader"),
    ParseResult::Err(err) => panic!("Error parsing shader: {}", err),
};
}
panics with error Custom(0). Removing the ifs makes the parser work correctly.
Expr <- AssExpr | Expr , AssExpr
AssExpr <- CondExpr | UnaExpr AssOp AssExpr
CondExpr <- LOrExpr | LOrExpr ? Expr : AssExpr
LOrExpr <- LXorExpr | LOrExpr \|\| LXorExpr
LXorExpr <- LAndExpr | LXorExpr ^^ LAndExpr
LAndExpr <- IOrExpr | LAndExpr && IOrExpr
IOrExpr <- EOrExpr | IOrExpr \| EOrExpr
EOrExpr <- AndExpr | EOrExpr ^ AndExpr
AndExpr <- EqExpr | AndExpr & EqExpr
EqExpr <- RelExpr | EqExpr == RelExpr | EqExpr != RelExpr
RelExpr <- ShiftExpr | RelExpr < ShiftExpr | RelExpr > ShiftExpr | RelExpr ≤ ShiftExpr | RelExpr ≥ ShiftExpr
ShiftExpr <- AddExpr | ShiftExpr << AddExpr | ShiftExpr >> AddExpr
AddExpr <- MultExpr | AddExpr + MultExpr | AddExpr - MultExpr
MultExpr <- UnaExpr | MultExpr * UnaExpr | MultExpr / UnaExpr | MultExpr % UnaExpr
UnaExpr <- PostExpr | UnaOp UnaExpr
PostExpr <- PrimExpr | PostExpr [ IntExpr ] | FunCall | PostExpr . FieldSel | PostExpr ++ | PostExpr --
PrimExpr <- IDENTIFIER | INTCONST | UINTCONST | FLOATCONST | BOOLCONST | DOUBLECONST | ( Expr )
// and FunCall has a FunIdentifier, which has a PostExpr in it…
I’m about to release glsl-3.0, but there are still some things that make me uncomfortable. The current situation with the preprocessor is a bit uncertain, as we are typing it (e.g. here, here). In my opinion, we should only use String here, as at the stage of preprocessing, GLSL types don’t really exist yet.
The main issue I have with that is linked to the actual usefulness of such a representation. Given the parsed AST, I wonder how easy it is to actually preprocess the AST:
fn preprocess(ast: TranslationUnit) -> Result<TranslationUnit, PreprocessorError>;
Maybe we can add that function to see how things actually go. In the meantime, it’s likely that I change those Expr to String.
We need fuzzer support. Plus, each time the fuzzer finds a bad case, we need to include it as a dedicated file in tests/fuzz/ and include_bytes! it to enhance the unit tests.
in vec4 col;
void main() {
gl_Position = col - col - col;
}
parses as:
Assignment(
  Variable(Identifier("gl_Position")),
  Equal,
  Binary(
    Sub,
    Variable(Identifier("col")),
    Binary(
      Sub,
      Variable(Identifier("col")),
      Variable(Identifier("col"))
    )
  )
),
which transpiles to:
in vec4 col;
void main() {
gl_Position = (col)-((col)-(col));
}
which has the wrong associativity.
The character \ should be supported to continue a line onto the next one (line splicing).
The following shader:
buffer Foo {
char tiles[];
} main_tiles;
void main() {
}
Gives me this (the leading byte array is the remaining, unparsed input; decoded, it reads "Foo {\n    char tiles[];\n} main_tiles;\n\nvoid main() {\n}\n"):
Done(
  [
    70, 111, 111, 32, 123, 10, 32, 32, 32, 32, 99, 104, 97, 114, 32,
    116, 105, 108, 101, 115, 91, 93, 59, 10, 125, 32, 109, 97, 105,
    110, 95, 116, 105, 108, 101, 115, 59, 10, 10, 118, 111, 105, 100,
    32, 109, 97, 105, 110, 40, 41, 32, 123, 10, 125, 10
  ],
  [
    Declaration(
      Global(
        TypeQualifier {
          qualifiers: [Storage(Buffer)]
        },
        []
      )
    )
  ]
)
In other words, the parsing stops when encountering the [] and doesn't process what comes afterwards.
This doesn’t parse:
void main() {
float a = 1. * .5;
}
In nom, a parser can fail in either of two ways:
- In a recoverable way. This is what the alt parser combinator relies on, for instance (it tests one parser; if it fails, it tests the next one in the tuple). glsl uses such parsers a lot.
- In an unrecoverable way, i.e. the error is fatal in the glsl domain. For instance, a sequence of parsers requires all parsers to succeed; any parser failing puts the whole parser in an unrecoverable state. Unrecoverable parsers are neat because they can short-circuit all remaining branches.
As a motivation for this issue, syntax::Expr parsers have been written by following the GLSL450 specification strictly, without optimizing the nom code with recoverable / unrecoverable parsers in mind. The resulting code implies a lot of parsers being recovered over and over, and over, and… over. Especially, this situation occurs in glsl:
fn some_parser(i: &str) -> ParserResult<_> {
// here we have an alternative, which means we can recover subparsers
alt((
foo,
bar
))(i)
}
fn foo(i: &str) -> ParserResult<_> {
terminated(zoo, quux)(i)
}
fn bar(i: &str) -> ParserResult<_> {
terminated(zoo, other_parser)(i)
}
As you can see, we are going to try foo, and if it fails, we’ll try bar. Both foo and bar will fail if we cannot parse with zoo. The problem is that zoo’s parent recovers it, so every time bar succeeds, it means two things:
- foo has failed, which means it failed to parse quux while it succeeded to parse zoo.
- bar will parse with zoo again.
In other words, bar will run zoo twice, every time. The goal of this issue is to optimize those by turning them into a form such as:
fn some_parser(i: &str) -> ParserResult<_> {
// we know all parsers in the alternative will require zoo, so parse it first
terminated(
zoo,
alt((
foo,
bar
))
)(i)
}
fn foo(i: &str) -> ParserResult<_> {
quux(i)
}
fn bar(i: &str) -> ParserResult<_> {
other_parser(i)
}