php / php-langspec: PHP Language Specification
Home Page: http://www.php.net
License: Other
https://wiki.php.net/rfc/nullable_types
It seems that nullable types are missing from the spec. Is this being worked on? If not, I'll add it.
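For readers unfamiliar with the RFC linked above, here is a minimal sketch of what a nullable type declaration looks like in PHP 7.1 and later. The function name findLabel is hypothetical and only serves as an illustration.

```php
<?php
// Hypothetical function: both the parameter and the return type are nullable.
function findLabel(?int $id): ?string
{
    if ($id === null) {
        return null; // allowed because the return type is ?string
    }
    return "item-$id";
}

$missing = findLabel(null);
$found = findLabel(3);
var_dump($missing, $found);
```

The leading ? means that null is accepted in addition to the named type.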
In:
Despite the use of the term refcount, conforming implementations are not required to use a reference counting-based implementation for automatic memory management.
Is this statement correct? If I understand correctly, many PHP projects depend on the deterministic firing of the __destruct() function to clean up SQL transactions or connections.
HHVM elaborates on this:
Eliminating destructors. Deterministic object destruction is the reason why nonscalar PHP values require precise reference counting. This requirement has long been, and continues to be, a sizable performance bottleneck in our optimized JIT-compiled code. Using garbage collection instead could unlock measurable performance improvements, and the behavior of destructors could be closely imitated by a combination of try/finally and other new language constructs.
(from http://hhvm.com/blog/2017/09/18/the-future-of-hhvm.html )
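To make the concern concrete, the following sketch shows the ordering that code relying on deterministic destruction expects. The Probe class and useProbe function are hypothetical names for this illustration. The ordering shown is what the reference-counting php.net implementation produces; it is exactly the behavior the quoted sentence says a conforming implementation need not provide.

```php
<?php
// Hypothetical helper class: records when its destructor runs.
final class Probe
{
    /** @var string[] */
    public static $log = [];
    private $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    public function __destruct()
    {
        self::$log[] = "destruct {$this->name}";
    }
}

function useProbe(): void
{
    $p = new Probe('a');
    Probe::$log[] = 'inside function';
    // $p holds the only reference; it is released when the function returns.
}

useProbe();
Probe::$log[] = 'after function';

print_r(Probe::$log);
```

With reference counting, the destruct entry lands between the other two entries. A tracing garbage collector could legitimately run the destructor later.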
echo is currently listed as an intrinsic expression (https://github.com/php/php-langspec/blob/master/spec/10-expressions.md#echo), but it is a statement.
For example here:
https://github.com/php/php-langspec/blob/master/spec/19-grammar.md#grammar-name-nondigit
"one of the characters U+0080–U+00ff"
This statement is ambiguous if the encoding is not defined. What I believe would be correct is saying that PHP uses ASCII, but bytes 0x80 to 0xFF might be allowed, as defined in the grammar, and then the grammar should read "one of the bytes 0x80–0xFF"
Hello,
The dereferencable-expression rule is defined as:
dereferencable-expression:
variable
( expression )
array-creation-expression
string-literal
The string-literal rule is defined as:
string-literal:
single-quoted-string-literal
double-quoted-string-literal
heredoc-string-literal
nowdoc-string-literal
So all the following expressions are valid:
C::FOO
$c::FOO
($c = 'C')::FOO
'C'::FOO
"C"::FOO
but the following should be valid too:
<<<HN
C
HN;
::FOO
but it's not, of course.
I suggest two proposals. The first one is to update the dereferencable-expression directly:
dereferencable-expression:
variable
( expression )
array-creation-expression
- string-literal
+ single-quoted-string-literal
+ double-quoted-string-literal
The second one is to add a quoted-string-literal rule:
+ quoted-string-literal:
+ single-quoted-string-literal
+ double-quoted-string-literal
+
string-literal:
- single-quoted-string-literal
- double-quoted-string-literal
+ quoted-string-literal
heredoc-string-literal
nowdoc-string-literal
dereferencable-expression:
variable
( expression )
array-creation-expression
- string-literal
+ quoted-string-literal
This last proposal is my favorite because the same bug may be present elsewhere.
Thoughts?
Hi,
I am following the specification to implement a PHP lexer.
I noticed that there is a mix of the wording "not case-sensitive" and "case-insensitive".
Would it make sense to just use "not case-sensitive" so that it is more clear?
In my opinion "case-sensitive" and "case-insensitive" are easier to accidentally mix up.
Examples:
php-langspec/spec/09-lexical-structure.md
Line 417 in ca697b4
php-langspec/spec/06-constants.md
Line 37 in ca697b4
print should be between "and" and "yield"; currently it's spec'd as taking an arbitrary expression. https://github.com/php/php-langspec/blob/master/spec/10-expressions.md#grammar-print-intrinsic
Hello there,
I'm not sure whether or not this is the right place to ask*, but I've figured out this:
class A {
private $value = 'foo';
public function and($value) {
return $this->value . ',' . $value;
}
}
$a = new A();
print($a->and('bar'));
PHP7+ result:
foo,bar
PHP previous versions (tested on 5.6):
PHP Parse error: syntax error, unexpected 'and' (T_LOGICAL_AND), expecting identifier (T_STRING)
The ability to use control structures like and/or operators as object methods can be very helpful in creating intuitive fluent interfaces, but this new behaviour in PHP7 is not documented anywhere.
So I was wondering if this is something normal and desired for PHP7+, (then we can safely rely on this to publish PHP projects / libraries), or is this something completely unwanted that may be removed in a further version?
Thank you much,
Ben
* If I'm in the wrong place, please tell me where can I post that topic!
In the grammar, it appears that the print-expression production rule is not used. Does it belong in place of yield-expression in the logical-AND-expression-2 rule?
If I read the grammar correctly, instanceof is defined as follows:
instanceof-expression:
unary-expression
instanceof-subject instanceof instanceof-type-designator
instanceof-subject:
expression
instanceof-type-designator:
qualified-name
expression
But expression is defined as:
expression:
yield-expression
include-expression
include-once-expression
require-expression
require-once-expression
So it does not make sense. Maybe instead of expression we should use primary-expression (though not quite) or assignment (or something narrower):
instanceof-subject:
- expression
+ assignment
instanceof-type-designator:
qualified-name
- expression
+ assignment
Thoughts?
The clone-expression is defined as:
clone-expression:
primary-expression
clone primary-expression
I don't understand how to read this. A clone-expression can be primary-expression or 'clone' primary-expression?
Is this a mistake? Should it be the following instead (note the recursive use of clone-expression)?
clone-expression:
primary-expression
clone clone-expression
Differences in format letters (specifically "v") make the DateTime format constants unusable for createFromFormat.
Example:
DATE_RFC3339_EXTENDED has the value "Y-m-d\TH:i:s.vP".
This predefined format will work for date(), but not for DateTime::createFromFormat, as milliseconds require the format symbol "v".
createFromFormat should support "v" as milliseconds symbol to support the DateTime format constants.
Closing the issue, because it is already addressed here https://bugs.php.net/bug.php?id=76009
It is impossible to claim conformance to an ever-evolving language specification without referring to a certain version/edition or whatever it will be called. There are open pull requests and issues which cannot be sensibly addressed due to this issue, unless we're claiming this spec is still a draft, in which case it is not really helpful for alternative implementations at all.
In the grammar, the string-variable production rule is not used.
Looking at https://github.com/php/php-src/blob/0720313bd452adf451173574e97fd761f90623a2/Zend/zend_language_parser.y#L1223, I think it should be a summand of dq-char.
Test body
So far, the coalesce-expression is defined as:
coalesce-expression:
logical-inc-OR-expression-1 ?? expression
So if I read the grammar correctly, it means we cannot write 1 ?? 2 ?? 3.
Maybe we should use assignment-expression instead of expression. Is that correct?
coalesce-expression:
- logical-inc-OR-expression-1 ?? expression
+ logical-inc-OR-expression-1 ?? assignment-expression
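As a quick check of the behavior the grammar needs to allow, this small sketch shows that chained ?? is accepted by PHP 7 and later and groups to the right. The variable names are made up for the illustration.

```php
<?php
$a = null;
$b = null;

// Parsed as $a ?? ($b ?? 'fallback'): the first non-null operand wins.
$result = $a ?? $b ?? 'fallback';

// The literal chain from the issue text is accepted as well.
$first = 1 ?? 2 ?? 3;

var_dump($result, $first);
```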
I may be missing something, because formal language definition is somewhat new to me, but right now it seems to me that a statement can never generate a simple expression like $v = 1.
A statement can generate an expression statement, which can generate an expression.
expression-statement: expression<sub>opt</sub> ;
But I can't find any way to generate a primary-expression or an assignment-expression out of an expression. When searching the repository for the terms, I can only find their definitions, not any usages of them: repository search for 'primary-expression'
What am I missing here?
Hi, I think it is perhaps better to document the list-expression in this way, which automatically ensures there is at least one valid element:
list-intrinsic:
list ( list-expression-list )
list-expression-list:
commas? list-element commas?
commas? list-element , list-expression-list?
commas:
,
, commas
list-element:
expression => list-or-variable
list-or-variable
Then we can remove the following constraints documentation:
At least one of the elements of the list-expression-list must be non-empty.
I think this makes the document a bit neater, and parser authors can write the list parser just by reading the grammar.
I can submit a PR if you think this is a good change.
See, for instance, list_007.phpt:
php-langspec/tests/expressions/list/list_007.phpt
Lines 1 to 15 in 7d35063
the right-hand operand must be an expression that designates an array or object implementing the ArrayAccess interface
Furthermore, the test uses var_dump, which is only mentioned, but not specified in the langspec (in particular, its output format may differ).
If the tests should be useful for language implementations other than the php.net implementation, they should be written with portability in mind. Otherwise we could as well get rid of the test suite altogether (and merge possibly missing tests to the php.net test suite).
Added in 7.1: https://wiki.php.net/rfc/short_list_syntax
Actual:
foreach-value:
&<sub>opt</sub> expression
list-intrinsic
Expected:
foreach-value:
&<sub>opt</sub> expression
list-intrinsic
[ list-expression-list ]
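For context, here is a short sketch of the PHP 7.1 syntax that the proposed alternative would cover. The variable names are hypothetical.

```php
<?php
$pairs = [[1, 'one'], [2, 'two']];
$seen = [];

// Short list syntax used directly as the foreach value.
foreach ($pairs as [$number, $word]) {
    $seen[] = "$number=$word";
}

// The keyed form from the same RFC works here too.
$rows = [['id' => 7, 'name' => 'x']];
$ids = [];
foreach ($rows as ['id' => $id, 'name' => $name]) {
    $ids[] = $id;
}

print_r($seen);
print_r($ids);
```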
In https://github.com/php/php-langspec/blob/master/spec/11-statements.md#labeled-statements: A labeled statement is not required to be followed by a statement (unlike in some other languages). It is considered a statement in its own right. For example, this is valid:
function foo() {
goto end;
end:
}
Also the grammar contains a typo: ":" to ";".
The below code is syntactically valid, and needed to indicate which implementation of foo to inherit
<?php
class C{
use T1,T2,T3{T1::foo insteadof T2,T3;}
}
Currently, the specification says that it accepts a single name. https://github.com/php/php-langspec/blob/master/spec/16-traits.md#trait-uses
trait-select-insteadof-clause: name insteadof name
Should that be name 'insteadof' trait-name-list? Also, I'm not familiar with this project or why it's name instead of qualified-name
Noticed when working on microsoft/tolerant-php-parser#190
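A runnable version of the snippet above may help when checking the grammar change. The trait bodies here are assumptions added for illustration; only the use clause comes from the report.

```php
<?php
// Three traits that all define foo(), so the conflict must be resolved.
trait T1 { public function foo() { return 'T1'; } }
trait T2 { public function foo() { return 'T2'; } }
trait T3 { public function foo() { return 'T3'; } }

class C
{
    // One insteadof clause excludes several traits at once.
    use T1, T2, T3 { T1::foo insteadof T2, T3; }
}

$picked = (new C())->foo();
echo $picked, "\n";
```

Without the list after insteadof, the class would need one clause per excluded trait.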
In php 7.3.0, isset is legal for the following code:
isset($a,);
However, the isset variable list is defined by
isset-intrinsic:
isset ( variable-list )
variable-list:
variable
variable-list , variable
in 10-expression.md.
If we plan to make the spec consistent with current PHP, I can submit a PR to change it as
isset-intrinsic:
isset ( variable-list ','? )
variable-list:
variable
variable-list , variable
From microsoft/tolerant-php-parser#19: instanceof should have higher precedence than !.
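The difference is observable at run time. In this sketch (the class name Foo is hypothetical), the unparenthesized form agrees with the grouping in which instanceof binds tighter than !, and disagrees with the other grouping.

```php
<?php
class Foo {}

$other = new stdClass(); // deliberately not a Foo

$asWritten = !$other instanceof Foo;        // what PHP actually evaluates
$tighterInstanceof = !($other instanceof Foo);
$tighterNot = (!$other) instanceof Foo;     // a boolean is never a Foo

var_dump($asWritten, $tighterInstanceof, $tighterNot);
```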
New constants have been added since 7.2.0:
Actual: https://github.com/php/php-langspec/blob/master/spec/06-constants.md
(Somewhat) Expected: http://php.net/manual/en/reserved.constants.php
The following changes may matter for the language specification in PHP 8.0 - feel free to split out separate tickets
https://wiki.php.net/rfc/named_params
https://wiki.php.net/rfc/namespaced_names_as_token
https://wiki.php.net/rfc/nullsafe_operator
https://wiki.php.net/rfc/match_expression_v2
https://wiki.php.net/rfc/non-capturing_catches
Trailing commas in parameters/closure use
https://wiki.php.net/rfc/union_types_v2
https://wiki.php.net/rfc/static_return_type
https://wiki.php.net/rfc/mixed_type_v2
Additionally, the php 8.0 attributes syntax is still being voted on
Split off from #208:
Currently array-element-initializer specifies that the key and value are both ordinary expressions. As specified, this is ambiguous with yield-expression.
PHP resolves this as follows:
[yield "foo" => "bar"]
// is
[(yield "foo" => "bar")]
// rather than
[(yield "foo") => "bar"]
However, I'm not sure how this can be specified in grammar form.
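The resolution described above can be observed through the generator API. This is a minimal sketch with a hypothetical generator function named pairs. If the expression were parsed as [(yield "foo") => "bar"], the key would be 0 and the yielded value "foo" instead.

```php
<?php
function pairs(): Generator
{
    // Parsed as [(yield "foo" => "bar")]: yields key "foo", value "bar".
    $received = [yield "foo" => "bar"];
    return $received;
}

$gen = pairs();
$key = $gen->key();
$value = $gen->current();

// The value sent in becomes the single element of the array.
$gen->send('sent');
$returned = $gen->getReturn();

var_dump($key, $value, $returned);
```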
Hi, it seems the use function statement is not tested in tests/namespaces.
In PHP 7.3.0, the following code is legal:
$a = [,];
And the following code is legal in parser but illegal during runtime:
$a = [,,];
However, the current array-initializer in 10-expressions.md and 19-grammar.md does not include , as an element.
If we want to describe this behavior in the langspec, then we need to change the definition of array-initializer to
array-initializer:
array-initializer-list ','?
','
I can submit a PR if you think it is good to support this current behavior.
In note 5. about array comparisons the following is stated:
If the next key in the left-hand operand does not exist in the right-hand operand, the arrays cannot be compared and FALSE is returned.
This is actually not true in case of the spaceship operator on both PHP and HHVM.
<?php
$lhs = [0 => 0];
$rhs = [1 => 1];
var_dump(
$lhs < $rhs,
$lhs <= $rhs,
$lhs <=> $rhs,
$lhs >= $rhs,
$lhs > $rhs
);
/*
bool(false)
bool(false)
int(1)
bool(false)
bool(false)
*/
Actual:
const-elements:
const-element
const-elements const-element
Expected:
const-elements:
const-element
const-elements , const-element
https://wiki.php.net/rfc/throw_expression
Observed: Statements include throw
Expected: Expressions include throw
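As an illustration of why the grammar category matters, the sketch below uses throw in operand position, which requires PHP 8.0 or later according to the RFC linked above. The function name requireName is hypothetical.

```php
<?php
function requireName(?string $name): string
{
    // throw appears as the right-hand operand of ??, so it must be an expression.
    return $name ?? throw new InvalidArgumentException('name is required');
}

$ok = requireName('Ada');

$message = null;
try {
    requireName(null);
} catch (InvalidArgumentException $e) {
    $message = $e->getMessage();
}

var_dump($ok, $message);
```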
$x instanceof $y instanceof $z is a nonsense expression, but is successfully parsed in PHP 5.6-7.3 (see https://3v4l.org/QWYZJ) (unlike 2 == 3 != 4).
Either of the following would make sense:
Mention that this is non-associative in php-langspec (to reflect php.net) (It seems like that's probably a consequence of the current CFG, so maybe nothing needs to be done)
Additionally, try to make that a syntax error in php 8.x or 7.4, to reflect the documentation on php.net: https://secure.php.net/manual/en/language.operators.precedence.php
Update the specification to allow it, and update php.net (it seems like it's currently parsed as ($x instanceof $y) instanceof $z according to ast\parse_code()) - I'm opposed to that, and it seems like the spec in this repo forbids that
I'm not quite sure what repo to file this in, or the process to request the change to php implementation's syntax, or if this has been mentioned elsewhere.
The specification currently states that use declarations bring a name into a scope, but that's not exactly true, as imports are not available prior to the use declaration.
namespace A {
class C {}
}
namespace B {
// Same scope, 'A\C' should be available right?!
$c = new C();
use A\C;
}
Without this restriction, an engine could theoretically make an initial pass to import names from all use declarations, and therefore prevent a name resolution error from occurring.
18-namespaces.md
A namespace-use-declaration imports — that is, makes available — one or more names into a scope, optionally giving them each an alias.
Support for multiple catch types was added to PHP 7.1 as per https://wiki.php.net/rfc/multiple-catch.
php/php-src#1796 is the PR that merged this feature into php master.
This makes the following code valid in PHP 7.1:
try {
// Some code...
} catch (ExceptionType1 | ExceptionType2 $e) {
// Code to handle the exception
}
I'm new to this project (and parsers in general) but from what I can tell the grammar for the try statement should be updated to allow for multiple, pipe-separated qualified-name items.
I think the following grammar is correct:
try-statement:
'try' compound-statement catch-clauses
'try' compound-statement finally-clause
'try' compound-statement catch-clauses finally-clause
catch-clauses:
catch-clause
catch-clauses catch-clause
catch-clause:
'catch' '(' catch-name-list variable-name ')' compound-statement
catch-name-list:
qualified-name
catch-name-list '|' qualified-name
finally-clause:
'finally' compound-statement
Can some code be refactored here? There are some tests where the ternary and null coalescing operators would fit 😄
I would like to propose adding cURL classes to PHP 8. Currently we do not have any classes apart from the CURLFile class. I think that a native OOP implementation of curl would be a very good idea.
I have written PoC of what this class could look like. https://github.com/kamil-tekiela/curl
Native implementation would benefit from:
The specification mentions that if the overriding method is not compatible with the overridden method, a non-fatal error will be issued. But for PHP 8 and later the error is fatal.
Also, the rules for overriding properties are not mentioned.
Thanks for the great work.