Comments (22)
For the crazy idea bin:
What if Tiles were encoded as ND4J matrices, and Map Algebra ops were rewritten using its native operations, which can be executed on CPUs or GPUs via the BLAS/LAPACK backend of your choice?
See also:
from geotrellis.
@fosskers this is your jam now ;)
from geotrellis.
Yeeeaaaaahhhhhhhhhhh 😎 (music from CSI)
from geotrellis.
Related and additional discussion here: #1789
from geotrellis.
Awwww snap, the 🐉 rises!
from geotrellis.
The first thing to do would be to use JMH to benchmark our IntArrayTile subtype against a Tile[Int] and see how they compare. Doing so on Scala 2.12 might even be best. We don't yet publish 2.12 artifacts because of Spark compat, but raster on its own should be able to be publishLocal'd with 2.12.
The Tile[A] should be marked @specialized for the usual number types and use whatever help spire can offer.
Assuming Tile[A] would gain us something (the dismantling of the Tile hierarchy, no more map/mapDouble, etc), @lossyrob @echeipesh is there any percentage of slowdown that would be acceptable as the cost of that abstraction? In a perfect world all this @specialized and newtype-ing will avoid boxing altogether, but who knows what'll actually happen. 5%? Unlikely to be that good. 50%? Way too slow, not worth it. 10%? Nice, still probably not likely. 25%? Users might be sad. 15%? Just right?
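As a rough illustration of the Tile[A] idea above, a minimal sketch might look like the following. The name GenericTile and all of its members are hypothetical and not part of GeoTrellis; the sketch assumes only the standard library and requests @specialized variants for Int and Double. Whether this avoids boxing on every path is exactly what a JMH comparison against IntArrayTile would have to confirm.

```scala
import scala.reflect.ClassTag

// Hypothetical sketch of a generic tile; this is NOT the GeoTrellis API.
// @specialized asks the compiler to generate Int and Double variants.
final class GenericTile[@specialized(Int, Double) A: ClassTag](
    val cols: Int,
    val rows: Int,
    val cells: Array[A]
) {
  require(cells.length == cols * rows, "cells must hold cols * rows values")

  // Row-major lookup of a single cell.
  def get(col: Int, row: Int): A = cells(row * cols + col)

  // A single map stands in for the map/mapDouble pair. Function1 is
  // specialized for Int => Int and Double => Double in the standard library.
  def map(f: A => A): GenericTile[A] = {
    val out = new Array[A](cells.length)
    var i = 0
    while (i < cells.length) {
      out(i) = f(cells(i))
      i += 1
    }
    new GenericTile[A](cols, rows, out)
  }
}

object GenericTileExample {
  // The same map method serves both cell types.
  val ints = new GenericTile[Int](2, 2, Array(1, 2, 3, 4)).map(_ + 1)
  val doubles = new GenericTile[Double](2, 1, Array(0.5, 1.5)).map(_ * 2.0)
}
```

The while loop over a flat array is used here on the assumption that it is the shape most likely to stay unboxed after specialization; a for-comprehension or a collection method may behave differently.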
from geotrellis.
I'd also be interested in understanding how much of a performance benefit arose from the macro-generated aspects of GeoTrellis, and whether this is or isn't a requirement for optimal performance.
from geotrellis.
The thing the macros get around is the fact that FunctionN where N > 2 is not specialized. The macro-generated methods prevent boxing while still allowing an API that takes lambdas over things like map and mapDouble.
There was a lot of benchmarking to make sure this was the case.
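To make the FunctionN point concrete, here is a small sketch. It is not the macro technique GeoTrellis uses; it swaps in a hand-written single-method trait to show the same effect. IntCellFn and CellMapSketch are hypothetical names, and passing a lambda where IntCellFn is expected assumes Scala 2.12 or later (SAM conversion).

```scala
// Hypothetical single-method trait with a primitive-only signature.
trait IntCellFn {
  def apply(col: Int, row: Int, z: Int): Int
}

object CellMapSketch {
  // Function3 is not specialized, so calls through `f` go via the erased
  // apply(Object, Object, Object) and can box all three Ints and the result.
  def mapBoxed(cells: Array[Int], cols: Int)(f: (Int, Int, Int) => Int): Array[Int] = {
    val out = new Array[Int](cells.length)
    var i = 0
    while (i < cells.length) {
      out(i) = f(i % cols, i / cols, cells(i))
      i += 1
    }
    out
  }

  // Same loop, but the callback's apply takes and returns raw Ints.
  def mapPrimitive(cells: Array[Int], cols: Int)(f: IntCellFn): Array[Int] = {
    val out = new Array[Int](cells.length)
    var i = 0
    while (i < cells.length) {
      out(i) = f(i % cols, i / cols, cells(i))
      i += 1
    }
    out
  }

  val cells = Array(10, 20, 30, 40)
  val boxed = mapBoxed(cells, 2)((col, row, z) => z + col + row)
  val primitive = mapPrimitive(cells, 2)((col, row, z) => z + col + row)
}
```

Both calls produce the same values and the call sites look identical; only the signature the lambda is compiled against differs, which is the kind of difference a microbenchmark would need to measure.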
from geotrellis.
@fosskers because we're already hamstrung by being on the slow JVM, and we are a performance-oriented library, we've done a lot of work from the beginning to eke out the most performance possible. So to me, a 15% slowdown is unacceptable. Even 5% would hurt. I spent hours upon hours microbenchmarking and tweaking focal operations when I refactored them to make sure there wasn't any slowdown at all from a previous version. Tile performance remains to me a very core concern of GeoTrellis.
The thing here is, if you end up boxing, you're not going to see some small percentage difference; you are going to see a ton of slowdown. So I think it's an all-or-nothing thing: if you figure out how to do it and completely avoid boxing, I don't see why you would have to slow things down at all.
from geotrellis.
Tile performance remains to me a very core concern of GeoTrellis.
Gotcha, thanks for being open about those priorities.
from geotrellis.
I can't seem to find the benchmarks in question. The geotrellis-benchmark project looks mostly empty now since many of the benchmarks stopped compiling.
from geotrellis.
Which benchmarks? The macro ones, I'm not sure they survived forward. Happy to have people double-check and write some with more longevity if someone is up for it. But if the macros don't speed things up in a .map { (col, row, z) => ??? } case, I will eat all the hats :)
The focal benchmarking also didn't survive. But I found it here: https://github.com/locationtech/geotrellis/blob/_old/v0.8.0/benchmark/src/main/scala/geotrellis/benchmark/FocalBenchmarks.scala
While perusing the benchmarks I found this: https://github.com/geotrellis/geotrellis-benchmark/blob/master/geotrellis-0.10/src/test/scala/geotrellis/raster/GenericRaster.scala which is a good indicator of the slowdown boxing causes.
from geotrellis.
Awesome, we should revive those into the bench subpackage here.
Oh cool, and GRaster is probably a good thing to test @specialized against.
from geotrellis.
For the reference bin:
https://github.com/alexknvl/newtypes
from geotrellis.
Feel free to create an issue and assign it to me if you want specific benchmarks ported (just not all of them at once).
from geotrellis.
@metasim cool link, hope we'll see Spark on 2.12 at the end of this year.
from geotrellis.
What's the difference between the last two lines?
from geotrellis.
@metasim the ND4J idea is a good one. Some benchmarks around that would be really interesting. I'm often asked if GeoTrellis can take advantage of GPUs; using LAPACK has always been in the back of my mind for that usage. If it turns out that ND4J beats our ArrayTiles hands down, and we can make a deployment story that isn't too painful, I'd say we should start thinking about putting a migration on the roadmap.
from geotrellis.
Alternative to https://github.com/alexknvl/newtypes with better ergonomics (IMHO):
https://github.com/fthomas/refined
from geotrellis.
Cool, that's not one I've tried yet. I'll give it a spin next Monday.
from geotrellis.
From refined:
Using refined's macros for compile-time refinement has zero runtime overhead for reference types and only causes boxing for value types.
from geotrellis.
^ This PR achieves "Tile[T] with caveats", as explained in that PR.
from geotrellis.