Comments (18)
To be fair, my points have no geojson properties, but I still think this is odd and maybe bad.
from turfpy.
Thanks, @zsiegel92, for reporting this. We will surely look into it. Meanwhile, if you want to raise a PR, feel free to do so. :)
@zsiegel92 - Apart from finding points in a single Polygon, we currently also support multiple Polygons and MultiPolygons, but yes, we will need to improve its performance.
@sackh Thanks for this response! I'm glad you're looking into it. When I encounter a MultiPolygon called feature, I just iterate through the coordinate lists in feature.geometry.coordinates, create a Polygon for each list of coordinates, and then use the method above. Aside from the object-copying I mentioned, it's basically an O(1) transformation, so the difference should remain, though I haven't done a timed test. I can't imagine turfpy does anything much different... I can't imagine there's any theoretical MultiPolygon.contains algorithm that is so much more clever that it is worth a 4x slowdown for a Polygon...
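The splitting step described above might look roughly like the following. This is only a sketch under stated assumptions: it works on plain GeoJSON geometry dicts, the function names (`split_multipolygon`, `point_in_polygon`, and so on) are hypothetical, and the containment check is a simple even-odd ray-casting test swapped in for the Shapely-based check so that the example has no dependencies. It is not turfpy's or Shapely's implementation.

```python
from typing import Dict, List, Sequence

Ring = Sequence[Sequence[float]]  # a closed ring of [x, y] positions


def split_multipolygon(geometry: Dict) -> List[Dict]:
    """Return one GeoJSON Polygon geometry per member of a MultiPolygon."""
    if geometry["type"] == "Polygon":
        return [geometry]
    if geometry["type"] != "MultiPolygon":
        raise ValueError(f"unsupported geometry type: {geometry['type']}")
    # Each entry of a MultiPolygon's coordinates is the ring list of one Polygon.
    return [{"type": "Polygon", "coordinates": rings} for rings in geometry["coordinates"]]


def point_in_ring(x: float, y: float, ring: Ring) -> bool:
    """Even-odd ray casting (a stand-in for the Shapely check, not its algorithm)."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i][0], ring[i][1]
        x2, y2 = ring[(i + 1) % n][0], ring[(i + 1) % n][1]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def point_in_polygon(x: float, y: float, polygon: Dict) -> bool:
    """True if the point is inside the outer ring and outside every hole."""
    outer, *holes = polygon["coordinates"]
    return point_in_ring(x, y, outer) and not any(point_in_ring(x, y, h) for h in holes)


def point_in_multipolygon(x: float, y: float, geometry: Dict) -> bool:
    """Check each member Polygon in turn, as in the workaround described above."""
    return any(point_in_polygon(x, y, p) for p in split_multipolygon(geometry))
```

Points that lie exactly on a boundary are not handled consistently by this simple test, so a real comparison would need to settle on a boundary convention first.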
@zsiegel92 can you share the geojson for the points and polygon you tested with? I have made some progress, so I can use that geojson data to test it.
@zsiegel92 I have merged the changes and tested with some huge data, and it looks good. I made a new release, v0.0.5; you might wanna give it a try 😉
Hi @omanges I look forward to trying out the new implementation! I already have a parallel set of functions that uses this method rather than my workaround, so I should be able to test it out soon.
This will simplify my codebase, as I will not have any reason to import shapely if this method is comparably performant!
@omanges I just ran a small and a large trial of my current use case and got the following output:
Using turfpy.measurement.points_within_polygon:
595 points:
Ran 15 trials with average time of 2.7013930002848308 seconds!
14772 points:
Ran 5 trials with average time of 57.62896718978882 seconds!
Using Shapely workaround:
595 points:
Ran 15 trials with average time of 1.3027364571889242 seconds!
14772 points:
Ran 5 trials with average time of 31.952202796936035 seconds!
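For anyone wanting to reproduce numbers like these, a timing harness could be sketched as below. The helper name `average_runtime` and the workload passed to it are assumptions made for illustration; the script that produced the output above was not posted in this thread.

```python
import time
from typing import Callable


def average_runtime(func: Callable[[], object], n_trials: int) -> float:
    """Call func n_trials times and return the mean wall-clock time in seconds."""
    if n_trials < 1:
        raise ValueError("n_trials must be at least 1")
    total = 0.0
    for _ in range(n_trials):
        start = time.perf_counter()
        func()
        total += time.perf_counter() - start
    return total / n_trials


if __name__ == "__main__":
    # Placeholder workload; swap in the call being measured.
    avg = average_runtime(lambda: sum(range(100_000)), n_trials=15)
    print(f"Ran 15 trials with average time of {avg} seconds!")
```

Passing a zero-argument callable keeps the harness independent of whichever point-in-polygon function is under test.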
So: it looks like your update improved the performance of this method by a lot! It's still a bit slower than using the shapely method, for some reason.
I noticed when I updated turfpy, pip mentioned downloading shapely as a dependency... was this always a dependency, or did your update utilize Shapely?
@zsiegel92 Actually, we added shapely a few versions earlier, but it is used for some other functions. I have one question: did you increase the chunk size parameter? I bet it will give you faster results than shapely :) Give it a try.
@zsiegel92 By default the chunk_size is 1, but can you try setting it to 9?
Following is the new signature of the function.
def points_within_polygon(
    points: Union[Feature, FeatureCollection],
    polygons: Union[Feature, FeatureCollection],
    chunk_size: int = 1,
) -> FeatureCollection:
Definitely an improvement as I increase chunk size! I increased it between 1 and 9, and results improved. What is the maximum chunk size at which you expect to see an improvement?
param={'shapelyStyle': True, 'chunk_size': 1}
Iteration 1/5 Iteration 2/5 Iteration 3/5 Iteration 4/5 Iteration 5/5 Average for {'shapelyStyle': True, 'chunk_size': 1}: 7.476626777648926
param={'shapelyStyle': False, 'chunk_size': 1}
Iteration 1/5 Iteration 2/5 Iteration 3/5 Iteration 4/5 Iteration 5/5 Average for {'shapelyStyle': False, 'chunk_size': 1}: 23.191674423217773
param={'shapelyStyle': False, 'chunk_size': 2}
Iteration 1/5 Iteration 2/5 Iteration 3/5 Iteration 4/5 Iteration 5/5 Average for {'shapelyStyle': False, 'chunk_size': 2}: 19.63445348739624
param={'shapelyStyle': False, 'chunk_size': 3}
Iteration 1/5 Iteration 2/5 Iteration 3/5 Iteration 4/5 Iteration 5/5 Average for {'shapelyStyle': False, 'chunk_size': 3}: 15.31570062637329
param={'shapelyStyle': False, 'chunk_size': 4}
Iteration 1/5 Iteration 2/5 Iteration 3/5 Iteration 4/5 Iteration 5/5 Average for {'shapelyStyle': False, 'chunk_size': 4}: 14.534931993484497
param={'shapelyStyle': False, 'chunk_size': 5}
Iteration 1/5 Iteration 2/5 Iteration 3/5 Iteration 4/5 Iteration 5/5 Average for {'shapelyStyle': False, 'chunk_size': 5}: 13.989737749099731
param={'shapelyStyle': False, 'chunk_size': 6}
Iteration 1/5 Iteration 2/5 Iteration 3/5 Iteration 4/5 Iteration 5/5 Average for {'shapelyStyle': False, 'chunk_size': 6}: 13.749914312362671
param={'shapelyStyle': False, 'chunk_size': 7}
Iteration 1/5 Iteration 2/5 Iteration 3/5 Iteration 4/5 Iteration 5/5 Average for {'shapelyStyle': False, 'chunk_size': 7}: 13.314592599868774
param={'shapelyStyle': False, 'chunk_size': 8}
Iteration 1/5 Iteration 2/5 Iteration 3/5 Iteration 4/5 Iteration 5/5 Average for {'shapelyStyle': False, 'chunk_size': 8}: 13.24393539428711
param={'shapelyStyle': False, 'chunk_size': 9}
Iteration 1/5 Iteration 2/5 Iteration 3/5 Iteration 4/5 Iteration 5/5 Average for {'shapelyStyle': False, 'chunk_size': 9}: 13.031423473358155
Sorry, that's tough to understand - the first test was using the Shapely variant I wrote, with average time 7.48s (chunk_size is written as 1 but is not meaningful for this trial). The subsequent trials used the Turfpy function; with chunk_size 1, the time was 23.19s, and it decreased all the way down to 13.03s with chunk_size 9.
@omanges I ran a few trials overnight and noticed roughly constant performance for chunk_sizes between 30 and 150. It leveled out at taking around 1.5x the time that Shapely took. Without knowing what this means, it's hard to know what the ideal value should be.
@zsiegel92 the chunk_size parameter means the same as the chunksize argument of multiprocessing.Pool.map; please refer to this document: https://docs.python.org/release/2.6.6/library/multiprocessing.html#multiprocessing.pool.multiprocessing.Pool.map
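To make the meaning of that argument concrete, here is a small sketch of passing chunksize to a pool's map. It assumes a thread-backed pool (multiprocessing.pool.ThreadPool, which exposes the same map signature as Pool) so that it runs without pickling concerns; `square` and `map_with_chunksize` are hypothetical names, and this is not how turfpy wires the parameter internally.

```python
from multiprocessing.pool import ThreadPool
from typing import List, Sequence


def square(n: int) -> int:
    """Trivial stand-in for a per-item task."""
    return n * n


def map_with_chunksize(values: Sequence[int], chunksize: int, workers: int = 4) -> List[int]:
    """Map square over values, handing items to workers in batches of about chunksize."""
    with ThreadPool(processes=workers) as pool:
        # chunksize only changes how the iterable is batched into tasks;
        # the results come back in input order either way.
        return pool.map(square, values, chunksize=chunksize)
```

Larger chunks mean fewer, bigger tasks and less per-task overhead, which is consistent with the timings flattening out once the chunk size is large enough.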
@omanges Thanks! Now I've tested for chunk_sizes of 100, 500, 1000, 10000, and 15000, all with similar performance (12-16s per trial, with <7s for the Shapely version).
@zsiegel92 I think further improvement can be made by improving turfpy.measurement.boolean_point_in_polygon, after which it should be faster than shapely as well.
And really, thank you for doing such a great investigation! :)