z7zmey / php-parser
PHP parser written in Go
Home Page: https://php-parser.com
License: MIT License
$ echo -e '<? \004' > test.php && php-parser test.php
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x5473f8]
goroutine 6 [running]:
github.com/z7zmey/php-parser/errors.NewError(...)
/home/imuli/src/github.com/z7zmey/php-parser/errors/error.go:20
github.com/z7zmey/php-parser/php7.(*Parser).Error(0xc42006c180, 0xc420014220, 0x1d)
/home/imuli/src/github.com/z7zmey/php-parser/php7/parser.go:52 +0x68
github.com/z7zmey/php-parser/php7.(*yyParserImpl).Parse(0xc420091500, 0x6d3760, 0xc42006c180, 0x0)
yaccpar:253 +0x4da06
github.com/z7zmey/php-parser/php7.yyParse(0x6d3760, 0xc42006c180, 0xc4200a8000)
yaccpar:153 +0x58
github.com/z7zmey/php-parser/php7.(*Parser).Parse(0xc42006c180, 0xc42000c028)
/home/imuli/src/github.com/z7zmey/php-parser/php7/parser.go:69 +0x11a
main.parserWorker(0xc42001a180, 0xc42001a1e0)
/home/imuli/src/github.com/z7zmey/php-parser/main.go:80 +0x37
created by main.main
/home/imuli/src/github.com/z7zmey/php-parser/main.go:32 +0x136
This happens with characters 1-8, 11, 12 and 14-31. PHP itself gives a warning about an unexpected character, but continues parsing the file.
The parser is not as performant as it could be (PHP 7+ also creates an AST in order to execute a file):
$ php -n -r '$start = microtime(true); require("some_big_file.php"); echo "time: " . (microtime(true) - $start) . " sec\n";'
time: 0.0185 sec
$ go run test.go some_big_file.php
Errors count: 0
Parser time: 167.636796ms
I made a few (very hacky) patches that significantly reduce the allocation count in some critical parts, which reduced parsing time by a factor of 1.5:
# after patches
$ go run test.go some_big_file.php
Errors count: 0
Parser time: 109.522377ms
The profiler still shows plenty more allocations; my patches are just a proof of concept.
speedup.patch.txt
The class position field covers only the word "class" when parsing with PHP 5.
<?php
class Foo {
private $bar;
}
==> classTest.php
| [*stmt.StmtList]
| "Position": Pos{Line: 2-2 Pos: 7-11}; # should be 2-4, 7-34
| "Stmts":
| [*stmt.Class]
| "Position": Pos{Line: 2-2 Pos: 7-11}; # should be 2-4, 7-34
| "NamespacedName": Foo;
| "PhpDocComment": ;
| "ClassName":
| [*node.Identifier]
| "Position": Pos{Line: 2-2 Pos: 13-15};
| "Value": Foo;
| "Stmts":
| [*stmt.PropertyList]
| "Position": Pos{Line: 3-3 Pos: 20-32};
| "Modifiers":
| [*node.Identifier]
| "Position": Pos{Line: 3-3 Pos: 20-26};
| "Value": private;
| "Properties":
| [*stmt.Property]
| "Position": Pos{Line: 3-3 Pos: 28-31};
| "PhpDocComment": ;
| "Variable":
| [*expr.Variable]
| "Position": Pos{Line: 3-3 Pos: 28-31};
| "VarName":
| [*node.Identifier]
| "Position": Pos{Line: 3-3 Pos: 28-31};
| "Value": bar;
Allow disabling saving node Meta and Position to speed up parsing process
When dealing with truncated or otherwise incomplete php files, sometimes the file ends inside a block.
function incomplete() {
return something();
For some block types (class, switch) it comes out fine: just a syntax error and the entry is dropped. For the majority, though (do, foreach, function, if, namespace, while), the parser returns nil and php-parser crashes.
I haven't yet seen an unclosed parenthetical break the parser, but that seems like a possibility too.
Can you change the logo for this project? It looks like the gopher ate the elephant. I would recommend changing it to a gopher playing with an elephant.
I'm having issues with the namespace resolver. It contains unresolved names like void, true and null. Shouldn't these be removed when they are not resolved?
<?php
declare(strict_types=1);
namespace App\Domain\Handler\Cart;
use SimpleBus\Message\Recorder\RecordsMessages;
use App\Domain\Command\ChangeCurrencyCommand;
use App\Domain\Repository\CartRepository;
use App\Domain\Event\CurrencyChangedEvent as CurrencyChangedEventWithAlias;
class ChangeCurrencyHandler
{
/**
* @var CartRepository
*/
private $cartRepository;
/**
* @var RecordsMessages
*/
private $eventRecorder;
public function __construct(
CartRepository $cartRepository,
RecordsMessages $eventRecorder
) {
$this->cartRepository = $cartRepository;
$this->eventRecorder = $eventRecorder;
}
public function __invoke(ChangeCurrencyCommand $command) : void
{
if (true === $command->getBool()) {
// Do something
}
if (null !== $command->getNull()) {
// Do something
}
$this->eventRecorder->record(new CurrencyChangedEventWithAlias());
}
}
package main
import (
"fmt"
"github.com/z7zmey/php-parser/php7"
"github.com/z7zmey/php-parser/visitor"
"os"
"reflect"
)
func main() {
for _, file := range os.Args[1:] {
fmt.Printf("Checking %s\n", file)
checkFile(file)
}
}
func checkFile(file string) {
src, err := os.Open(file)
if err != nil {
panic(err)
}
parser := php7.NewParser(src, file)
parser.Parse()
for _, e := range parser.GetErrors() {
fmt.Println(e)
}
nsResolver := visitor.NewNamespaceResolver()
parser.GetRootNode().Walk(nsResolver)
for n, fqcn := range nsResolver.ResolvedNames {
fmt.Printf("Found %s: %s\n", reflect.TypeOf(n), fqcn)
}
}
Checking ./test.php
Found *name.Name: SimpleBus\Message\Recorder\RecordsMessages
Found *name.Name: App\Domain\Command\ChangeCurrencyCommand
Found *name.Name: void
Found *name.Name: true
Found *name.Name: null
Found *name.Name: App\Domain\Event\CurrencyChangedEvent
Found *stmt.Class: App\Domain\Handler\Cart\ChangeCurrencyHandler
Found *name.Name: App\Domain\Repository\CartRepository
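One way a resolver could avoid reporting names such as void, true and null is to check a list of reserved names before qualifying anything. The sketch below is a hypothetical, stdlib-only illustration and does not use the library's visitor API; the reserved map and the resolve helper are assumptions made for this example, and the list is not meant to be exhaustive.

```go
package main

import (
	"fmt"
	"strings"
)

// reserved is an assumed, non-exhaustive set of names that should not be
// qualified with the current namespace.
var reserved = map[string]bool{
	"int": true, "float": true, "bool": true, "string": true,
	"void": true, "iterable": true, "array": true, "callable": true,
	"true": true, "false": true, "null": true,
	"self": true, "static": true, "parent": true,
}

// resolve qualifies an unqualified name with the namespace, unless the name
// is reserved (compared case-insensitively) or there is no namespace.
func resolve(namespace, name string) string {
	if namespace == "" || reserved[strings.ToLower(name)] {
		return name
	}
	return namespace + `\` + name
}

func main() {
	for _, n := range []string{"void", "true", "null", "ChangeCurrencyHandler"} {
		fmt.Println(resolve(`App\Domain\Handler\Cart`, n))
	}
}
```

With a check like this, a caller could also decide to leave reserved names out of the resolved-names map entirely instead of storing them unqualified.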
I'm finding some files with things like this:
/*// comment
commented_out_source()
*/
The scanner checks the previous rune for '*' and then the current one for '/', but it starts with the current rune positioned immediately after the /*, so the '*' of the opener counts as the previous rune and the comment closes immediately.
I suspect that adding a c = l.Next() before the loop in scanner/scanner.l:297 would fix this, but I'm not sure this is the best solution, and I'm also not certain how to go about generating scanner.go from that; go generate doesn't seem to work.
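The idea behind that fix can be shown outside the scanner. This is a minimal sketch, not the code in scanner/scanner.l: blockCommentEnd is a hypothetical helper that starts searching for the terminator only after the two bytes of the opener, so the opener's '*' can never pair with a following '/'.

```go
package main

import "fmt"

// blockCommentEnd returns the index just past the closing "*/" of the block
// comment starting at src[start:]. It returns -1 if src[start:] does not
// begin with "/*" or if the comment is unterminated.
func blockCommentEnd(src string, start int) int {
	if start < 0 || start+2 > len(src) || src[start:start+2] != "/*" {
		return -1
	}
	// Skip the opener first, so "/*/" is not mistaken for a complete comment.
	for i := start + 2; i+1 < len(src); i++ {
		if src[i] == '*' && src[i+1] == '/' {
			return i + 2
		}
	}
	return -1
}

func main() {
	src := "/*// comment\ncommented_out_source()\n*/ $a = 1;"
	end := blockCommentEnd(src, 0)
	fmt.Printf("%q\n", src[:end])
}
```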
The parser sometimes gives a lot of strange errors, see below. When I parse the files using a single goroutine, it works just fine.
The function that does the parsing does not rely on any global state:
func parse(filename string) {
fp, err := os.Open(filename)
if err != nil {
log.Fatalf("Could not open file %s: %s", filename, err.Error())
}
defer fp.Close()
var b bytes.Buffer
conv := transform.NewReader(fp, charmap.Windows1251.NewDecoder())
parser := php7.NewParser(io.TeeReader(conv, &b), filename)
parser.Parse()
for _, e := range parser.GetErrors() {
fmt.Printf("ERROR: parsing %s: %s", filename, e)
}
rootNode := parser.GetRootNode()
if rootNode == nil {
log.Printf("Could not parse %s at all due to errors", filename)
return
}
rootNode.Walk(&rootWalker{
w: os.Stdout,
filename: filename,
comments: parser.GetComments(),
positions: parser.GetPositions(),
lines: bytes.Split(b.Bytes(), []byte("\n")),
})
}
Errors example:
syntax error: unexpected T_ENCAPSED_AND_WHITESPACE at line 409
syntax error: unexpected '}' at line 480
syntax error: unexpected T_STRING, expecting T_VARIABLE or T_ENCAPSED_AND_WHITESPACE or T_DOLLAR_OPEN_CURLY_BRACES or T_CURLY_OPEN at line 605
...
Again, the php7 parser produces sane output. This one looks like it's stemming from expr.Variable.
Sorry not to be submitting patches with these; I've never touched yacc before and am finding the parser a bit hard to follow. If I find more of these, should I continue opening new bugs or just reopen this one with more info?
<?php
$here->where();
==> method_call.php
| [*node.Root]
| "Position": Pos{Line: 2-2 Pos: 19-21}; # ought to be 7-21
| "Stmts":
| [*stmt.Expression]
| "Position": Pos{Line: 2-2 Pos: 19-21}; # ought to be 7-21
| "Expr":
| [*expr.MethodCall]
| "Position": Pos{Line: 2-2 Pos: 19-20}; # ought to be 7-20
| "Variable":
| [*expr.Variable]
| "Position": Pos{Line: 2-2 Pos: 7-20}; # ought to be 7-11
| "VarName":
| [*node.Identifier]
| "Position": Pos{Line: 2-2 Pos: 7-11};
| "Value": here;
| "Method":
| [*node.Identifier]
| "Position": Pos{Line: 2-2 Pos: 14-18};
| "Value": where;
| "ArgumentList":
| [*node.ArgumentList]
| "Position": Pos{Line: 2-2 Pos: 19-20};
This is php code taken from php-src:
INPUT:
interface Serializable
{
function serialize();
function unserialize($serialized);
}
class ArrayObject implements IteratorAggregate, ArrayAccess, Countable
{
const STD_PROP_LIST = 0x00000001;
const ARRAY_AS_PROPS = 0x00000002;
function __construct($array, $flags = 0, $iterator_class = "ArrayIterator") {/**/}
function uasort(mixed cmp_function) {/**/}
/** Sort the entries by key using user defined function.
*/
function uksort(mixed cmp_function) {/**/}
}?>
OUTPUT:
syntax error: unexpected T_STRING, expecting T_VARIABLE at line 20
syntax error: unexpected T_STRING, expecting T_VARIABLE at line 24
File Out:
<?php
interface Serializable
{
function serialize();
function unserialize($serialized);
}
{
}
{
};?>
While backslash-newline doesn't have any special meaning inside a string in PHP,
it is syntactically valid. Currently, parsing something like
<?php
echo "/ --- \
| foo |
\ --- /" . "\n";
echo '/ --- \
| bar |
\ --- /' . "\n";
yields syntax errors on the second string
==> multi_line_strings.php
syntax error: unexpected $unk at line 5
syntax error: unexpected T_DEC, expecting T_STRING at line 7
| [*node.Root]
| "Position": Pos{Line: 2-7 Pos: 7-83};
| "Stmts":
| [*stmt.Echo]
| "Position": Pos{Line: 2-4 Pos: 7-44};
| "Exprs":
| [*binary.Concat]
| "Position": Pos{Line: 2-4 Pos: 12-43};
| "Left":
| [*scalar.String]
| "Position": Pos{Line: 2-4 Pos: 12-36};
| "Value": "/ --- \
| foo |
\ --- /";
| "Right":
| [*scalar.String]
| "Position": Pos{Line: 4-4 Pos: 40-43};
| "Value": "\n";
| [*stmt.Expression]
| "Position": Pos{Line: 7-7 Pos: 79-83};
| "Expr":
| [*scalar.String]
| "Position": Pos{Line: 7-7 Pos: 79-82};
| "Value": "\n";
rather than two valid strings
==> /home/imuli/src/github.com/imuli/semantic-php/snippets/multi_line_strings.php
| [*node.Root]
| "Position": Pos{Line: 2-7 Pos: 7-83};
| "Stmts":
| [*stmt.Echo]
| "Position": Pos{Line: 2-4 Pos: 7-44};
| "Exprs":
| [*binary.Concat]
| "Position": Pos{Line: 2-4 Pos: 12-43};
| "Left":
| [*scalar.String]
| "Position": Pos{Line: 2-4 Pos: 12-36};
| "Value": "/ --- \
| foo |
\ --- /";
| "Right":
| [*scalar.String]
| "Position": Pos{Line: 4-4 Pos: 40-43};
| "Value": "\n";
| [*stmt.Echo]
| "Position": Pos{Line: 5-7 Pos: 46-83};
| "Exprs":
| [*binary.Concat]
| "Position": Pos{Line: 5-7 Pos: 51-82};
| "Left":
| [*scalar.String]
| "Position": Pos{Line: 5-7 Pos: 51-75};
| "Value": '/ --- \
| bar |
\ --- /';
| "Right":
| [*scalar.String]
| "Position": Pos{Line: 7-7 Pos: 79-82};
| "Value": "\n";
Anything after __halt_compiler(); is not parsed (or compiled) by PHP, and attempting to parse beyond it only invites syntax errors from trying to parse non-PHP.
Thus parsing something like
<?php
__halt_compiler();
"nothing to see here";
shouldn't produce
==> halt_compiler.php
| [*node.Root]
| "Position": Pos{Line: 2-3 Pos: 7-47};
| "Stmts":
| [*stmt.HaltCompiler]
| "Position": Pos{Line: 2-2 Pos: 7-24};
| [*stmt.Expression]
| "Position": Pos{Line: 3-3 Pos: 26-47};
| "Expr":
| [*scalar.String]
| "Position": Pos{Line: 3-3 Pos: 26-46};
| "Value": "nothing to see here";
but rather, something more like
==> halt_compiler.php
| [*node.Root]
| "Position": Pos{Line: 2-3 Pos: 7-24};
| "Stmts":
| [*stmt.HaltCompiler]
| "Position": Pos{Line: 2-2 Pos: 7-24};
or maybe including the stuff afterward either in a simple wrapper or within the HaltCompiler statement?
The Position struct already saves node positions and lines, but some editors require line and column info.
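Column information can be derived from a byte position once the start of each line is known. The sketch below assumes 1-based byte positions, as the dumps on this page appear to use, and counts columns in bytes; lineStarts and lineCol are hypothetical helpers, not part of the library.

```go
package main

import (
	"fmt"
	"sort"
)

// lineStarts returns the 1-based byte position at which each line begins.
func lineStarts(src []byte) []int {
	starts := []int{1}
	for i, b := range src {
		if b == '\n' {
			starts = append(starts, i+2) // the byte after '\n', 1-based
		}
	}
	return starts
}

// lineCol converts a 1-based byte position into a 1-based line and column.
// It returns 0, 0 for positions before the start of the file.
func lineCol(starts []int, pos int) (line, col int) {
	if pos < 1 {
		return 0, 0
	}
	// Index of the first line starting after pos; pos is on the line before it.
	i := sort.Search(len(starts), func(i int) bool { return starts[i] > pos })
	return i, pos - starts[i-1] + 1
}

func main() {
	src := []byte("<?php\nclass Foo {\nprivate $bar;\n}\n")
	starts := lineStarts(src)
	line, col := lineCol(starts, 13) // position of "Foo" in the class dump
	fmt.Println(line, col)
}
```

Editors that count columns in characters rather than bytes would need an extra step that decodes the bytes between the line start and the position.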
I noticed that node fields are named both Expr and Expression. Is there any difference between them?
token.Token objects are used only by the parser and, I think, can be reused with sync.Pool.
When parsing
hi bye
I get
==> plane_15.php
syntax error: unexpected $unk at line 1
| [*node.Root]
| "Position": Pos{Line: 1-1 Pos: 1-12};
| "Stmts":
| [*stmt.InlineHtml]
| "Position": Pos{Line: 1-1 Pos: 1-3};
| "Value": hi ;
| [*stmt.InlineHtml]
| "Position": Pos{Line: 1-1 Pos: 9-12};
| "Value": bye
;
rather than
==> plane_15.php
| [*node.Root]
| "Position": Pos{Line: 1-1 Pos: 1-12};
| "Stmts":
| [*stmt.InlineHtml]
| "Position": Pos{Line: 1-1 Pos: 1-12};
| "Value": hi bye
;
The character in there is U+F0004, in Supplementary Private Use Area-A, commonly used with custom fonts for rendering character-like things in text on the web.
I'll submit a pull request with the fix, which simply separates EOF from other uncategorized characters in the classifier.
Consider the code below. Note how the parser doesn't realize that int, bool and true are global constants; instead we get "NamespacedName": Test\bool; etc., which is obviously wrong.
<?php
declare(strict_types=1);
namespace Test;
class Test
{
public static function isValid(int $typeid): bool
{
return true;
}
}
[*node.Root]
"Stmts":
[*stmt.Declare]
"Consts":
[*stmt.Constant]
"PhpDocComment": ;
"ConstantName":
[*node.Identifier]
"Value": strict_types;
"Expr":
[*scalar.Lnumber]
"Value": 1;
"Stmt":
[*stmt.Nop]
[*stmt.Namespace]
"NamespaceName":
[*name.Name]
"Parts":
[*name.NamePart]
"Value": Test;
[*stmt.Class]
"NamespacedName": Test\Test;
"PhpDocComment": ;
"ClassName":
[*node.Identifier]
"Value": Test;
"Stmts":
[*stmt.ClassMethod]
"ReturnsRef": false;
"PhpDocComment": ;
"MethodName":
[*node.Identifier]
"Value": isValid;
"Modifiers":
[*node.Identifier]
"Value": public;
[*node.Identifier]
"Value": static;
"Params":
[*node.Parameter]
"ByRef": false;
"Variadic": false;
"VariableType":
[*name.Name]
"NamespacedName": Test\int;
"Parts":
[*name.NamePart]
"Value": int;
"Variable":
[*expr.Variable]
"VarName":
[*node.Identifier]
"Value": typeid;
"ReturnType":
[*name.Name]
"NamespacedName": Test\bool;
"Parts":
[*name.NamePart]
"Value": bool;
"Stmt":
[*stmt.StmtList]
"Stmts":
[*stmt.Return]
"Expr":
[*expr.ConstFetch]
"Constant":
[*name.Name]
"NamespacedName": Test\true;
"Parts":
[*name.NamePart]
"Value": true;
Yacc config variables yyDebug and yyErrorVerbose must be set once.
It would be very useful to be able to understand which types can be present in node.Node properties.
For example, I initially thought that (*expr.Variable).VarName can only be *node.Identifier, but much later I saw that this is not always the case, as there exist "variable variables".
Type comments can be created automatically using actual type information when analyzing some big codebase. I may volunteer for that if you do not have plans to implement it yourself.
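Until such type comments exist, callers have to handle each possible type themselves. The sketch below uses simplified, hypothetical types (Node, Identifier, Variable) rather than the library's own, and shows a type switch that copes with both a plain name and a nested variable.

```go
package main

import "fmt"

// Node, Identifier and Variable are simplified stand-ins for illustration;
// they are not the library's node.Node, node.Identifier or expr.Variable.
type Node interface{}

type Identifier struct{ Value string }

// Variable.VarName holds *Identifier for $a and *Variable for $$a.
type Variable struct{ VarName Node }

// describe renders a variable and reports anything it does not recognise.
func describe(v *Variable) string {
	switch n := v.VarName.(type) {
	case *Identifier:
		return "$" + n.Value
	case *Variable:
		return "$" + describe(n)
	default:
		return fmt.Sprintf("$<unhandled %T>", n)
	}
}

func main() {
	plain := &Variable{VarName: &Identifier{Value: "a"}}
	nested := &Variable{VarName: plain}
	fmt.Println(describe(plain), describe(nested))
}
```

The default branch is the important part: without documented field types, an explicit fallback at least makes unexpected node types visible instead of silently ignoring them.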
I discovered today that the parser fails with an "index out of range" runtime error when it encounters the following PHP code:
<?php
$things = ["foo", "bar"];
list(, $bar) = $things;
Surprisingly, this is valid PHP code (running it results in bar).
php-parser gets really unhappy about the lack of a first argument though:
panic: runtime error: index out of range
I almost feel bad reporting this because it's such bad PHP code, but it's nonetheless valid and should probably at least not crash the parser.
Non-ASCII symbols are not parsed correctly in comments (file encoding is UTF-8).
<?php
$a = 1; // тестовый коммент
$b = 2;
[*stmt.StmtList]
"Stmts":
[*stmt.Expression]
"Expr":
[*assign.Assign]
"Variable":
[*expr.Variable]
"VarName":
[*node.Identifier]
"Value": a;
"Expression":
[*scalar.Lnumber]
"Value": 1;
[*stmt.Expression]
"Comments":
"// \u0080\u0080\u0080\u0080\u0080\u0080\u0080\u0080 \u0080\u0080\u0080\u0080\u0080\u0080\u0080\n"
"Expr":
[*assign.Assign]
"Comments":
"// \u0080\u0080\u0080\u0080\u0080\u0080\u0080\u0080 \u0080\u0080\u0080\u0080\u0080\u0080\u0080\n"
"Variable":
[*expr.Variable]
"Comments":
"// \u0080\u0080\u0080\u0080\u0080\u0080\u0080\u0080 \u0080\u0080\u0080\u0080\u0080\u0080\u0080\n"
"VarName":
[*node.Identifier]
"Comments":
"// \u0080\u0080\u0080\u0080\u0080\u0080\u0080\u0080 \u0080\u0080\u0080\u0080\u0080\u0080\u0080\n"
"Value": b;
"Expression":
[*scalar.Lnumber]
"Value": 2;
Printer must be able to skip non-modified nodes. This will avoid unnecessary changes.
It looks like the problem is with scanner/lexer.go:465.
file := token.NewFileSet().AddFile(fName, -1, 1<<31-1)
Replacing 1<<31-1 with 1<<31-3 fixes the problem (the base of a FileSet starts at 1, and adds size+1 to account for EOF).
The "proper" fix would seem to be
fInfo, err := os.Stat(fName)
if err != nil {
	panic(err)
}
file := token.NewFileSet().AddFile(fName, -1, int(fInfo.Size()))
Or passing the file size in from outside. Thoughts?
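The base arithmetic described above can be checked with go/token alone. This is a small sketch of the "pass the size in from outside" variant; addFile is a hypothetical helper and the file name is a placeholder.

```go
package main

import (
	"fmt"
	"go/token"
)

// addFile registers a file of the given size in a fresh FileSet.
// A negative base tells AddFile to use the set's current base.
func addFile(name string, size int) (*token.FileSet, *token.File) {
	fset := token.NewFileSet()
	return fset, fset.AddFile(name, -1, size)
}

func main() {
	src := []byte("<?php\necho 1;\n")
	fset, file := addFile("example.php", len(src))
	// The file starts at the set's initial base; the set's next base is
	// the file's base plus its size plus one for EOF.
	fmt.Println(file.Base(), file.Size(), fset.Base())
}
```

Sizing the file from the real input keeps the next base small, so the overflow check in AddFile is only reached for inputs that really are close to 2 GB on platforms with a 32-bit int.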
Error below:
panic: token.Pos offset overflow (> 2G of source code in file set)
goroutine 6 [running]:
go/token.(*FileSet).AddFile(0x1a56a150, 0x1a5143c0, 0x40, 0x1, 0x7fffffff, 0x0)
/nix/store/0b91dwiap82wpar5b225bs8wig8c7xva-go-1.9.2/share/go/src/go/token/position.go:380 +0x291
github.com/z7zmey/php-parser/scanner.NewLexer(0x8d56b10, 0x1a50c178, 0x1a5143c0, 0x40, 0x1a5143c0)
/home/imuli/src/github.com/z7zmey/php-parser/scanner/lexer.go:465 +0x5a
github.com/z7zmey/php-parser/php7.NewParser(0x8d56b10, 0x1a50c178, 0x1a5143c0, 0x40, 0x0)
/home/imuli/src/github.com/z7zmey/php-parser/php7/parser.go:32 +0x3d
main.parserWorker(0x1a514200, 0x1a514240)
/home/imuli/src/github.com/z7zmey/php-parser/main.go:71 +0xf3
created by main.main
/home/imuli/src/github.com/z7zmey/php-parser/main.go:30 +0xd9
[*binary.Mul]
"Left":
[*binary.Plus]
"Left":
[*expr.Variable]
"VarName":
[*node.Identifier]
"Value": a;
"Right":
[*expr.Variable]
"VarName":
[*node.Identifier]
"Value": b;
"Right":
[*expr.Variable]
"VarName":
[*node.Identifier]
"Value": c;
Currently, the above AST is printed incorrectly: $a + $b * $c. It must group the Plus expression and print ($a + $b) * $c.
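A common way to get this grouping is to compare operator precedence while printing. The sketch below uses a tiny hypothetical AST (Var, Binary) and a two-entry precedence table; it is not the library's printer, only an illustration of the technique.

```go
package main

import "fmt"

// Node, Var and Binary form a minimal, hypothetical expression tree.
type Node interface{}

type Var struct{ Name string }

type Binary struct {
	Op          string
	Left, Right Node
}

// precedence covers only the operators in this sketch; higher binds tighter.
var precedence = map[string]int{"+": 1, "*": 2}

// printExpr renders n and wraps it in parentheses when its operator binds
// looser than the context given by parentPrec.
func printExpr(n Node, parentPrec int) string {
	switch v := n.(type) {
	case Var:
		return "$" + v.Name
	case Binary:
		p := precedence[v.Op]
		// Both operators are left-associative, so the right operand is
		// printed in a slightly tighter context than the left one.
		s := printExpr(v.Left, p) + " " + v.Op + " " + printExpr(v.Right, p+1)
		if p < parentPrec {
			return "(" + s + ")"
		}
		return s
	}
	return ""
}

func main() {
	ast := Binary{Op: "*", Left: Binary{Op: "+", Left: Var{"a"}, Right: Var{"b"}}, Right: Var{"c"}}
	fmt.Println(printExpr(ast, 0))
}
```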
PHPDoc is sometimes attributed to the wrong node. Example:
<?php
/** Some phpdoc */
$a = 1;
function wrong_phpdoc() {}
It yields the following structure:
[*node.Root]
"Stmts":
[*stmt.Expression]
"Expr":
[*assign.Assign]
"Variable":
[*expr.Variable]
"VarName":
[*node.Identifier]
"Value": a;
"Expression":
[*scalar.Lnumber]
"Value": 1;
[*stmt.Function]
"NamespacedName": wrong_phpdoc;
"ReturnsRef": false;
"PhpDocComment": /** Some phpdoc */;
"FunctionName":
[*node.Identifier]
"Value": wrong_phpdoc;
"Stmts":
As you can see, PHPDoc here is attributed to a function while the correct position would be the assignment.
What I expected
When using the package simply labelled printer, I assumed it would just print the file with all the formatting it had previously; however, it seems this is just meant to be a pretty printer.
What's the plan for retaining formatting, if any? My use case is that I'd like to make something that'll resolve all my PHP namespaces in Sublime Text when you hit save.
For saving the data back out, I'm sure I could do a sort of hack where I only modify the lines affected, but I'm hoping retention of formatting is do-able and not too difficult so I can avoid that effort
At the very least, can the Printer struct perhaps just be renamed to PrettyPrinter?
Issue Description EDITED
abstract class AbstractClass
{
// Force Extending class to define this method
abstract protected function getValue();
abstract protected function prefixValue($prefix);
// Common method
public function printOut()
{
print $this->getValue() . "\n";
}
}
class ConcreteClass1 extends AbstractClass
{
protected function getValue()
{
return "ConcreteClass1";
}
public function prefixValue($prefix)
{
return "{$prefix}ConcreteClass1";
}
}
$class1 = new ConcreteClass1;
$class1->printOut();
echo $class1->prefixValue('FOO_') . "\n";
// console output
{
abstract protected function getValue();
abstract protected function prefixValue($prefix);
public function printOut()
{
print($this->getValue() . "\n");
}
}
class ConcreteClass1 extends AbstractClass
{
protected function getValue()
{
return "ConcreteClass1";
}
public function prefixValue($prefix)
{
return "$prefixConcreteClass1";
}
}
$class1 = new ConcreteClass1;
$class1->printOut();
echo $class1->prefixValue('FOO_') . "\n";
Expected output:
ConcreteClass1
FOO_ConcreteClass1
Actual:
ConcreteClass1
<?php
<<<CAT
TEST
CAT;
| [*stmt.StmtList]
| "Position": Pos{Line: 3-5 Pos: 11-23};
| "Stmts":
| [*stmt.Expression]
| "Position": Pos{Line: 3-5 Pos: 11-23};
| "Expr":
| [*scalar.Heredoc]
| "Position": Pos{Line: 3-5 Pos: 11-22};
| "Label": CAT;
| "Parts":
| [*scalar.EncapsedStringPart]
| "Position": Pos{Line: 4-4 Pos: 15-19};
| "Value": TEST
;
PHP usually throws a fatal error when an abstract method contains a body, even if it's not used, but php-parser does not.
Example:
<?php
namespace Foo;
abstract class Bar extends Baz
{
private $int = 5;
protected $val = 'value';
public $bol = false;
abstract function name(): string
{
}
public function greet(): void
{
echo "Hello World";
}
}
$main = function (int $argc,string ...$args): void {
$class = new class extends Bar {
public function name(): string {
return 'azjezz';
}
};
$class->greet();
};
$main($_SERVER['argc'], ...$_SERVER['args']);
Expected :
fatal error: Abstract function Foo\Bar::name() cannot contain body in %s on line %d
PHP Behavior :
https://3v4l.org/llHmE
I wanted to call the php7 parser concurrently, but as the root node, comments and positions are defined as php7 module variables, it does not work. To solve this, I moved those structures to the lexer struct https://github.com/z7zmey/php-parser/blob/master/scanner/lexer.go#L438 and updated the yacc parser file, and it worked.
Did I miss something somewhere to have it working with goroutines?
If not, do you have a different idea to handle such a case, or would you consider a pull request?
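The general shape of that change, keeping results on a per-parse value instead of in package-level variables, can be sketched without the library. Everything below is hypothetical: the parser struct stands in for the lexer, and "parsing" is reduced to collecting line comments so that the example stays self-contained.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// parser is a hypothetical stand-in that owns its own results, so two
// instances never share state.
type parser struct {
	src      string
	comments []string // per-instance, not a package-level variable
}

// parse collects line comments as a placeholder for real parsing work.
func (p *parser) parse() {
	for _, line := range strings.Split(p.src, "\n") {
		line = strings.TrimSpace(line)
		if strings.HasPrefix(line, "//") {
			p.comments = append(p.comments, line)
		}
	}
}

// parseAll parses every source in its own goroutine with its own parser.
func parseAll(sources []string) [][]string {
	out := make([][]string, len(sources))
	var wg sync.WaitGroup
	for i, src := range sources {
		wg.Add(1)
		go func(i int, src string) {
			defer wg.Done()
			p := &parser{src: src}
			p.parse()
			out[i] = p.comments // each goroutine writes only its own slot
		}(i, src)
	}
	wg.Wait()
	return out
}

func main() {
	out := parseAll([]string{"// a\n$x = 1;", "$y = 2;\n// b\n// c"})
	fmt.Println(len(out[0]), len(out[1]))
}
```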
What actually works
rootNode, comments, positions := php7.Parse(bytes.NewBufferString(`<? echo "Hello world";`), "example.php")
//How do we get a list of errors easily?
//How do we get the position/column?
//for _, e := range parser.GetErrors() {
// fmt.Println(e)
//}
visitor := visitor.Dumper{
Writer: os.Stdout,
Indent: "",
Comments: comments,
Positions: positions,
}
rootNode.Walk(visitor)
The non-working example given
src := bytes.NewBufferString(`<? echo "Hello world";`)
parser := php7.NewParser(src, "example.php")
parser.Parse()
for _, e := range parser.GetErrors() {
fmt.Println(e)
}
visitor := visitor.Dumper{
Writer: os.Stdout,
Indent: "",
Comments: parser.GetComments(),
Positions: parser.GetPositions(),
}
rootNode := parser.GetRootNode()
rootNode.Walk(visitor)
This is a really interesting project, thanks for working on it. I've run into what looks like an erroneous parse failure with regard to string interpolation code.
<?php
$filename = "something.txt";
@header("Content-Disposition: attachment; filename=\"$filename\"");
This fails due to parse errors of various descriptions, depending on what the surrounding code looks like. When using the php-parser binary, this dumps out
$ php-parser /tmp/brokenparse.php
==> /private/tmp/brokenparse.php
syntax error: unexpected $end, expecting ')'
| [*stmt.StmtList]
| "Stmts":
PHP itself doesn't complain about this code and parses it just fine. Running php -l /tmp/brokenparse.php results in no errors. I believe this is related to a flaw in the grammar, because if $filename is followed by a space things work fine:
<?php
$filename = "something.txt";
@header("Content-Disposition: attachment; filename=\"$filename \"");
➜ analyze php-parser /tmp/brokenparse.php
==> /private/tmp/brokenparse.php
| [*stmt.StmtList]
| "Position": Pos{Line: 3-4 Pos: 8-104};
| "Stmts":
| [*stmt.Expression]
... 52 lines elided ...