Comments (3)
I fixed it without needing to create temporaries by adding the following methods. In VectorizationBase:
@inline stridedpointer(x::AbstractRange) = x
@inline load(r::AbstractRange, i::Tuple{<:Integer}) = @inbounds r[i[1] + 1]
and in SIMDPirates:
@inline vload(r::AbstractRange{T}, i::Tuple{_MM{W}}) where {W,T} = vmuladd(svrange(Val{W}(), T), step(r), @inbounds r[i[1].i + 1])
@inline vload(r::UnitRange{T}, i::Tuple{_MM{W}}) where {W,T} = vadd(svrange(Val{W}(), T), @inbounds r[i[1].i + 1])
@inline vload(r::AbstractRange{T}, i::Tuple{_MM{W}}, ::Unsigned) where {W,T} = vmuladd(svrange(Val{W}(), T), step(r), @inbounds r[i[1].i + 1])
@inline vload(r::UnitRange{T}, i::Tuple{_MM{W}}, ::Unsigned) where {W,T} = vadd(svrange(Val{W}(), T), @inbounds r[i[1].i + 1])
I haven't committed these changes yet, but will within the next few hours.
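To illustrate what the vload methods above compute, here is a minimal plain-Julia sketch. It assumes that svrange(Val{W}(), T) yields the lane offsets 0 through W-1, so a W-wide load from a range is the range element at the starting offset plus those offsets times the step. The name lanes_from_range is hypothetical and is not part of VectorizationBase or SIMDPirates; a tuple stands in for the SIMD vector.

```julia
# Hypothetical helper, not a library function: returns the W values that a
# vectorized load starting at zero-based offset `i` would read from range `r`.
function lanes_from_range(r::AbstractRange, i::Integer, ::Val{W}) where {W}
    base = r[i + 1]                            # zero-based offset, as in the `+ 1` above
    s = step(r)                                # 1 for a UnitRange
    ntuple(k -> base + (k - 1) * s, Val(W))    # base + (0, 1, ..., W-1) * step
end

lanes_from_range(1:10, 0, Val(4))      # (1, 2, 3, 4)
lanes_from_range(2:3:30, 2, Val(4))    # (8, 11, 14, 17)
```

No array is materialized here: the values are computed from the first element and the step, which is why no temporary is needed.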
This approach is much slower than not using @avx, since it prevents LLVM's O(1) optimization of the loop.
It would be neat if we could avoid blocking that optimization.
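For context, the O(1) optimization mentioned here amounts to replacing a loop over a range with a closed-form expression. The sketch below shows the idea for an integer accumulator over a UnitRange; both function names are hypothetical, and whether the compiler actually performs this rewrite for a given loop depends on the element types and the optimizations applied.

```julia
# Straightforward O(n) loop over a range.
function loop_sum(r::UnitRange{Int})
    s = 0
    for x in r
        s += x
    end
    return s
end

# Closed form of the same sum (arithmetic series), O(1) in the range length.
closed_form_sum(r::UnitRange{Int}) =
    isempty(r) ? 0 : (first(r) + last(r)) * length(r) ÷ 2

loop_sum(1:3) == closed_form_sum(1:3)   # true, both give 6
```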
from loopvectorization.jl.
I have a use case in mind that wouldn't benefit from the O(1) optimization, so I don't mind.
I'm closing this because I've now tagged a release where this works.
julia> f([1,2,3])
6.0
julia> f(1:3)
6.0
Let me know if you have any more problems.