Comments (28)
Indeed, it would be really okay if @default_eltype
was just public API so that I could just use it as a fallback with confidence when eltype
returns Any
which can be also mentioned in the docstring:
infer_eltype(itr) = (T = eltype(itr)) === Any ? @default_eltype(itr) : T
from julia.
We might want to special case Union{}
because sometimes type inference can infer that the itr is empty, in which case it tells us Union{}
which is true and precise and not usually what folks are looking for
julia> Base.@default_eltype Iterators.Filter([1,2,3]) do x
sum([1]) == 0
end
Int64
julia> Base.@default_eltype Iterators.Filter([1,2,3]) do x
sum(1) == 0
end
Union{}
from julia.
xref #54207
from julia.
Another possibility is to create a new function as above and export it or even something maybe a bit stronger as
infer_eltype(itr) = ifelse((T2 = Base.@default_eltype(itr)) <: (T1 = eltype(itr)), T2, T1)
with all the needed disclaimers in the docstring
from julia.
If we are going the route where we export a new function, which to me seems preferable, I (maybe mis)understood what @LilithHafner has suggested to be implemented as for example
function infer_eltype(itr)
T1, T2 = eltype(itr), Base.@default_eltype(itr)
ifelse(T2 !== Union{} && T2 <: T1, T2, T1)
end
from julia.
when it is possible to prove that an iterator is empty, a really good type inference algorithm will produce Union{}, which is the narrowest type.<...> We happen to be running into this problem now as type inference is good enough to make these emptiness proofs.
This is great – inference is improving!
In the past some folks (including Base) have at times relied on compiler limitations by expecting type inference not to return Union{} here.
And this seems to be the actual issue.
It's just really weird to make the inferred type artificially wider, as if the inference was less precise.
from julia.
Which raises the question: when would someone use this
infer_eltype
function?
I would say at least whoever is using Base.@default_eltype
currently, actually probably more because it is not even exported now
from julia.
Seems like around 20 packages some of which very popular are using it: https://juliahub.com/ui/Search?q=@default_eltype&type=code
from julia.
IMO a pull request adding infer_eltype
(with appropriate docs) should be accepted.
from julia.
It was a recent performance improvement for collect, which is designed not to return random Union types normally (only having trouble with the empty case). The exact PR for it should be findable by git blame if interested in the full picture
from julia.
As far as I remember, Base.@default_eltype
relies on type inference. Type inference works on a best-effort basis, so it's good for optimizations in some cases, but mustn't be relied on for correctness. So the change you're proposing wouldn't be correct.
Apart from that, the run time cost of type inference wouldn't be acceptable for eltype
.
from julia.
Consider that though collect(::Base.Generator)
relies on Base.@default_eltype
to infer the type for the array. So it would be logical in my opinion to use it for eltype
.
Also I see very good performance using it on some micro-benchmarks:
julia> using BenchmarkTools
julia> iter = (x for x in 1:10)
Base.Generator{UnitRange{Int64}, typeof(identity)}(identity, 1:10)
julia> @benchmark eltype($iter)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 1.182 ns … 6.061 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 1.192 ns ┊ GC (median): 0.00%
Time (mean ± σ): 1.197 ns ± 0.056 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▂ █▆ ▄▃ ▁ ▃▂ ▁ ▁
█▁▁▁▁▁▁▁▁██▁▁▁▁▁▁▁██▁▁▁▁▁▁▁▁▄▁▁▁▁▁▁▁▁██▁▁▁▁▁▁▁▁██▁▁▁▁▁▁▁█ █
1.18 ns Histogram: log(frequency) by time 1.24 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> @benchmark Base.@default_eltype($iter)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 1.182 ns … 6.843 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 1.192 ns ┊ GC (median): 0.00%
Time (mean ± σ): 1.202 ns ± 0.085 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█
▃▁▁▁▁▁▁▁▁█▆▁▁▁▁▁▁▁▂▂▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▃▂▁▁▁▁▁▁▁▄▃▁▁▁▁▁▁▁▁▃ ▂
1.18 ns Histogram: frequency by time 1.24 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> iter = (x < 5 ? x : 2.0*x for x in 1:10)
Base.Generator{UnitRange{Int64}, var"#1#2"}(var"#1#2"(), 1:10)
julia> @benchmark eltype($iter)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 1.182 ns … 22.753 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 1.192 ns ┊ GC (median): 0.00%
Time (mean ± σ): 1.203 ns ± 0.218 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█
▃▁▁▁▁▁▁▁▁█▆▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▄▃▁▁▁▁▁▁▁▁▃ ▂
1.18 ns Histogram: frequency by time 1.24 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> @benchmark Base.@default_eltype($iter)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 1.182 ns … 6.622 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 1.193 ns ┊ GC (median): 0.00%
Time (mean ± σ): 1.209 ns ± 0.073 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█
▂▁▁▁▁▁▁▁▁█▆▁▁▁▁▁▁▁▃▂▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▃▂▁▁▁▁▁▁▁▇▄▁▁▁▁▁▁▁▁▅ ▂
1.18 ns Histogram: frequency by time 1.24 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
from julia.
A thing I don't understand is why in Base.@default_eltype
(https://github.com/JuliaLang/julia/blob/master/base/array.jl#L744) it is used promote_typejoin_union(T)
, I would prefer to infer a Union
of concrete types than an abstract type e.g.
julia> s = (x < 5 ? x : 2.0*x for x in 1:10)
Base.Generator{UnitRange{Int64}, var"#5#6"}(var"#5#6"(), 1:10)
julia> T = Core.Compiler.return_type(Base._iterator_upper_bound, Tuple{typeof(s)})
Union{Float64, Int64}
julia> Base.promote_typejoin_union(T)
Real
Why can't we use Core.Compiler.return_type(Base._iterator_upper_bound, Tuple{typeof(s)})
at least when the Union
is not too big? Notice that this affects also collect
:
julia> collect(s)
10-element Vector{Real}:
...
from julia.
collect does not return a Union, as it gets the types from the real values and not from inference
from julia.
right, now I see, it widens the type. Nonetheless I guess with an infer_eltype
function which returns also Union
s one can probably take advantage of this with collect(infer_eltype(itr), itr)
from julia.
@vtjnash this is a bit derailing the conversation, but I'd like to ask you: is it a bad idea to return a Union
(of not too many types)? e.g. I was able to return a Vector{Union{Int, Float64}}
in the above case by changing https://github.com/JuliaLang/julia/blob/master/base/array.jl#L822 in new = similar(dest, Union{T, typeof(el)})
.
from julia.
Yes. That will never be the type of the object when it actually contains data, so it slows down the case of when there is data to benefit the case where there is not data
from julia.
You typically want the static and dynamic types to nearly coincide, for optimal results
from julia.
We might want to special case Union{} because sometimes type inference can infer that the itr is empty, in which case it tells us Union{} which is true and precise and not usually what folks are looking for
It would be really unfortunate, especially if this avoidance of Union{}
is motivated solely by current compiler limitations (and not language design concerns).
Do you suggest turning Union{}
into Any
, or something else? There's just no other type that would retain the nice invariant that the (inferred) eltype of concatenation of A + B is a supertype of A and B eltypes.
from julia.
It would be really unfortunate, especially if this avoidance of Union{} is motivated solely by current compiler limitations (and not language design concerns).
Inference is tasked with finding the narrowest type it can come up with that is a supertype of all types of instances that may actually occur. Coming up with a supertype of all types that could actually occur is a correctness thing: it has to get that right every time or we have compiler bugs. Picking a narrow type is on a best-effort basis. Returns(Any)
is a perfectly valid type inference algorithm for Julia, just not a particularly performant one. Consequently a, when it is possible to prove that an iterator is empty, a really good type inference algorithm will produce Union{}
, which is the narrowest type. Union{}
is not a supertype of any concrete type, but nevertheless it is (vacuously) a supertype of all types of instances that may actually occur in the (empty) iterator. This is all at the language level. We happen to be running into this problem now as type inference is good enough to make these emptiness proofs. In the past some folks (including Base) have at times relied on compiler limitations by expecting type inference not to return Union{} here.
Do you suggest turning Union{} into Any, or something else? There's just no other type that would retain the nice invariant that the (inferred) eltype of concatenation of A + B is a supertype of A and B eltypes.
We cannot guarantee that in any case so long as we rely on inference because inference may decide to provide an arbitrarily wide eltype for A and/or B. Even turning Union{}
into Any
would not help. For example if A
infers to Union{}
and B
infers to Int
, it is quite plausible for their concatenation to infer to Int
, which is not a supertype of Any
.
from julia.
if we want to return the narrowest type possible would then be fine to use (a macro version of) Core.Compiler.return_type
instead of Base.@default_eltype(itr)
so that to infer Union
types in some cases instead of abstract ones?
from julia.
And this seems to be the actual issue.
It's just really weird to make the inferred type artificially wider, as if the inference was less precise.
Which raises the question: when would someone use this infer_eltype
function?
from julia.
Yeah, for example
function _DisjointSet(xs, ::Base.EltypeUnknown)
T = Base.@default_eltype(xs)
(isconcretetype(T) || T === Union{}) || return Base.grow_to!(DisjointSet{T}(), xs)
return DisjointSet{T}(xs)
end
to
function _DisjointSet(xs, ::Base.EltypeUnknown)
T = infer_eltype(xs)
isconcretetype(T) || return Base.grow_to!(DisjointSet{T}(), xs
return DisjointSet{T}(xs)
end
has one semantic change: when inference can infer that the iterator is empty you get at DisjointSet{eltype(iter)}()
instead of a DisjointSet{Union{}}()
.
(source)
from julia.
I could try to make a PR but I'm still unsure about a thing, consider:
julia> struct A end
julia> struct B end
julia> itr = (x < 5 ? A() : B() for x in 1:10)
Base.Generator{UnitRange{Int64}, var"#1#2"}(var"#1#2"(), 1:10)
julia> Base.@default_eltype(itr)
Any
julia> Core.Compiler.return_type(Base._iterator_upper_bound, Tuple{typeof(itr)})
Union{A, B}
I would really prefer the second one, so I think a possible infer_eltype
function should not use Base.@default_eltype
but a narrower version of it (without promote_typejoin_union
at the end), is that okay?
from julia.
ok, maybe we could actually have two versions, with a tight/narrow = false
argument so that the user can choose
from julia.
Sure, IIUC that promote_typejoin_union
dates back to when the compiler was bad at handling Unions performantly, so it would be reasonable to remove. No need for an option, the user can always call promote_typejoin_union
on the result. It's also possible to continue this discussion once you open a PR (most PRs have a bit of discussion and revision before merging).
from julia.
The PR that changed @default_eltype
was #42046.
collect
has been returning abstract types instead of unions since at least 1.0.
from julia.
Yes, looks like the main explanation of that PR is in #30485 though
from julia.
Related Issues (20)
- LLVM Error out of memory on CI on32-bit GNU linux `compileall`(?)
- Keyword arguments emitted by macros don't get gensym'd
- Interest for an `Iterators.nth(x, n)` API? HOT 4
- `iterate` returns tuple for non-existent state HOT 6
- Should n-arg fastmath addition default to using to 2-arg fastmath addition for all types HOT 1
- suboptimal return type inference for `Base.setindex`
- Stdlib pkgimages seems to be compiled with generic target HOT 2
- Broadcast with missing destroys sparsity
- eltype of Symmetric or Hermitian matrix of matrices is AbstractMatrix HOT 9
- Keyword arguments can be placed anywhere in a function call HOT 1
- Julia's target parsing/detecting deserves a refactor HOT 1
- `help` in normal mode REPL acts strange HOT 1
- `checked_rem` and `checked_mod` don't follow the same promotion rules as regular rem and mod
- `@doc` (`MD` object) doesn't print with horizontal bars between methods, which can mislead the reader.
- support syntax `macro OtherPackage.theirmacro(args...) ... end`
- Feature request: Non allocating sort for tuple HOT 1
- suboptimal return type inference for `reverse(::Tuple)` on v1.11 and master
- Multi index `Base.setindex` HOT 1
- Remove `Dates.value(::Date)`
- `escape_string` doesn't appear to be able to keep double-quotes. HOT 3
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from julia.