ironholds / geohash
Geohash generation, decoding and manipulation in R
License: Other
Below is a command sequence:
gh_encode(17.37213,5.37213,6)
[1] "s55ff4"
gh_decode("s55ff4")
lat lng lat_error lng_error
1 17.37213 17.37213 0.002746582 0.005493164
I tried a few more geohashes, but the decoded lat and lng values were always identical. The same problem shows up in the examples quoted throughout the package description.
Even in the examples provided in the package, gh_decode() appears to give the wrong result for longitude:
gh_encode(lat = 42.60498046875, lng = -5.60302734375, precision = 5)
#[1] "ezs42"
gh_decode("ezs42")
# lat lng lat_error lng_error
#42.60498 42.60498 0.02197266 0.02197266
Here, lng should read -5.60303 unless I've made a fundamental misunderstanding of the package.
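For cross-checking what gh_decode() should return, here is a minimal base-R reference decoder. It is a sketch of the standard geohash bit-interleaving scheme (longitude bit first, then alternating), not the package's C++ implementation; decode_geohash is a hypothetical name.

```r
# Hypothetical reference decoder in base R (not part of the geohash package).
decode_geohash <- function(hash) {
  alphabet <- strsplit("0123456789bcdefghjkmnpqrstuvwxyz", "")[[1]]
  chars <- strsplit(tolower(hash), "")[[1]]
  values <- match(chars, alphabet) - 1L   # 5-bit value of each character
  if (anyNA(values)) stop("invalid geohash character")
  lat <- c(-90, 90)
  lng <- c(-180, 180)
  is_lng <- TRUE                          # bits alternate, starting with longitude
  for (v in values) {
    for (bit in c(16L, 8L, 4L, 2L, 1L)) {
      upper <- bitwAnd(v, bit) != 0L      # 1 = keep upper half, 0 = keep lower half
      if (is_lng) {
        mid <- mean(lng)
        if (upper) lng[1] <- mid else lng[2] <- mid
      } else {
        mid <- mean(lat)
        if (upper) lat[1] <- mid else lat[2] <- mid
      }
      is_lng <- !is_lng
    }
  }
  c(lat = mean(lat), lng = mean(lng))     # centre of the final cell
}

res <- decode_geohash("ezs42")
```

For "ezs42" this yields lat 42.60498 and lng -5.60303, matching the inputs passed to gh_encode() above, which is consistent with the lng column of gh_decode() repeating lat.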
I get the following error when trying to install the latest github version of the package:
> devtools::install_github("ironholds/geohash")
Using GitHub PAT from envvar GITHUB_PAT
Downloading GitHub repo ironholds/geohash@master
from URL https://api.github.com/repos/ironholds/geohash/zipball/master
Installing geohash
'/Library/Frameworks/R.framework/Resources/bin/R' --no-site-file --no-environ --no-save \
--no-restore --quiet CMD INSTALL \
'/private/var/folders/wl/klf5ry4n2vg4lfmtqk5tv5sr0000gn/T/RtmpeIBg1F/devtools822d14b299f7/Ironholds-geohash-8bc5c14' \
--library='/Library/Frameworks/R.framework/Versions/3.4/Resources/library' \
--install-tests
* installing *source* package ‘geohash’ ...
** libs
/usr/local/clang4/bin/clang++ -I/Library/Frameworks/R.framework/Resources/include -DNDEBUG -I"/Library/Frameworks/R.framework/Versions/3.4/Resources/library/Rcpp/include" -I/usr/local/opt/gettext/include -I/usr/local/opt/llvm/include -fPIC -Wall -g -O2 -c RcppExports.cpp -o RcppExports.o
In file included from RcppExports.cpp:4:
In file included from /Library/Frameworks/R.framework/Versions/3.4/Resources/library/Rcpp/include/Rcpp.h:27:
In file included from /Library/Frameworks/R.framework/Versions/3.4/Resources/library/Rcpp/include/RcppCommon.h:38:
In file included from /Library/Frameworks/R.framework/Versions/3.4/Resources/library/Rcpp/include/Rcpp/r/headers.h:48:
In file included from /Library/Frameworks/R.framework/Versions/3.4/Resources/library/Rcpp/include/Rcpp/platform/compiler.h:100:
In file included from /usr/local/clang4/bin/../include/c++/v1/cmath:305:
/usr/local/clang4/bin/../include/c++/v1/math.h:301:15: fatal error: 'math.h' file not found
#include_next <math.h>
^~~~~~~~
1 error generated.
make: *** [RcppExports.o] Error 1
ERROR: compilation failed for package ‘geohash’
* removing ‘/Library/Frameworks/R.framework/Versions/3.4/Resources/library/geohash’
* restoring previous ‘/Library/Frameworks/R.framework/Versions/3.4/Resources/library/geohash’
Installation failed: Command failed (1)
Possibly related: http://www.mjdenny.com/Rcpp_Intro.html mentions that "the math.h and cmath headers should not be included in C++ code that will be part of an R package. My understanding of this is that these are very commonly used libraries, and that including one in one of your C++ files may alter the functionality of other libraries"
In the example below, I attempt to get a 20-character geohash. The warning states that a default value of 6 is used; however, as the second call shows, no default is actually set in the R function. Asking for a negative hash length also returns a value.
geohash::gh_encode(1, 1, 20)
#> Warning in geohash::gh_encode(1, 1, 20): Precision must be between 1 and
#> 10. Default of 6 used.
#> [1] "s00twy"
geohash::gh_encode(1, 1)
#> Error in geohash::gh_encode(1, 1): argument "precision" is missing, with no default
geohash::gh_encode(1, 1, -1)
#> Warning in geohash::gh_encode(1, 1, -1): Precision must be between 1 and
#> 10. Default of 6 used.
#> [1] "s00twy"
If a default value for the length is going to be applied in some cases, it should be a default in the function signature, particularly since neither the bounds of 1 and 10 nor the default are mentioned in the gh_encode() docs. Also, asking for negative values should probably throw an error, since the user asked for something nonsensical.
I was going to submit a PR, but then realized the function is also vectorized over precision, so I wasn't sure what you thought about handling a mix of values. Here is my proposal:
- precision >= 10: return a hash of length 10 and issue a warning
- precision < 1: return NA_character_
Thoughts?
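The proposal above could be sketched as a small pre-processing step on the precision vector. This is only an illustration: clamp_precision is a hypothetical helper, not a function in the package, and it warns only for values strictly above 10 (an assumption, since warning at exactly 10 would be odd).

```r
# Hypothetical helper: normalise a (possibly vectorised) precision argument.
clamp_precision <- function(precision, max_precision = 10L) {
  precision <- as.integer(precision)
  too_long <- !is.na(precision) & precision > max_precision
  if (any(too_long)) {
    warning("precision values above ", max_precision,
            " were reduced to ", max_precision)
    precision[too_long] <- max_precision
  }
  # Nonsensical requests become NA, so the matching hash can be NA_character_.
  precision[!is.na(precision) & precision < 1L] <- NA_integer_
  precision
}

clamped <- suppressWarnings(clamp_precision(c(20, 6, -1)))
```

Here clamped is c(10L, 6L, NA), so a mixed vector is handled element by element instead of silently falling back to 6.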
Neighbors that should be on the other side of the 180/-180 divide are not handled properly. In this example, the east neighbor is the same as the initial geohash in the first row, and similarly the west neighbor in the second row (there are other duplicates in the return from the gh_neighbours call as well).
hashes <- c("xzrbx","8p208")
geohash::gh_decode(hashes) #either side of intl date line
#> lat lng lat_error lng_error
#> 1 40.89111 179.978 0.02197266 0.02197266
#> 2 40.89111 -179.978 0.02197266 0.02197266
geohash::gh_neighbours(hashes)
#> north northeast east southeast south southwest west northwest
#> 1 xzrbz xzrbz xzrbx xzrbr xzrbr xzrbq xzrbw xzrby
#> 2 8p20b 8p20c 8p209 8p203 8p202 8p202 8p208 8p20b
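One way to handle the divide is to wrap the shifted longitude back into [-180, 180) before re-encoding. The sketch below shows only that wrapping step; wrap_lng is a hypothetical helper, and whether the package's C++ code shifts coordinates this way is an assumption.

```r
# Hypothetical helper: wrap any longitude into the interval [-180, 180).
wrap_lng <- function(lng) ((lng + 180) %% 360) - 180

# One cell width (2 * lng_error) east of the first hash's centre, using the
# decoded values shown above.
east_of_first <- wrap_lng(179.978 + 2 * 0.02197266)
```

The wrapped value lands at roughly -179.978, the centre of the second hash, which is what the east neighbor of "xzrbx" would be expected to decode to.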
gh_decode("wx4gfbe")
lat lng lat_error lng_error
1 40.03761 116.4928 0.0006866455 0.0006866455
lng shows only 4 decimal places. Can we get a more precise result?
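This is likely a display effect rather than lost precision: R prints numbers with 7 significant digits by default (see getOption("digits")), while the stored double keeps more. A small sketch with a made-up longitude value (not the actual decoded value for "wx4gfbe"):

```r
lng <- 116.4928436                        # illustrative value only
shown_default <- format(lng, digits = 7)  # what default printing shows
shown_more <- format(lng, digits = 10)    # same number, more digits displayed
```

With a real result, print(gh_decode("wx4gfbe"), digits = 10) or options(digits = 10) should reveal the extra digits, assuming the returned columns are ordinary doubles.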
I've got starting point and stopping point geohashes that I aim to convert to lat/lon pairs for numeric analysis.
Wouldn't it be easy to (a) assign the input geohashes as row names (would by necessity be an option since duplicate row names are not allowed) or (b) create a geohash column by default which encapsulates the order in which inputs were processed?
Example:
library(data.table)
library(geohash)
DT = data.table(
  start = c("w21zux", "w21z6u", "w21ztq", "w21zkq", "w21z8s"),
  stop = c("w21ztq", "w21zkq", "w21z8s", "w21z7q", "w21xxs")
)
# all available geohashes to decode
all_geos = DT[ , unique(c(start, stop))]
geo_xy = gh_decode(all_geos)
# prepare for data.table join syntax
setDT(geo_xy)
geo_xy[ , geo := all_geos] # <- this step seems overkill
# first join
DT[geo_xy, c('start_lat', 'start_lon') := .(i.lat, i.lng),
   on = c(start = 'geo')]
# second join
DT[geo_xy, c('stop_lat', 'stop_lon') := .(i.lat, i.lng),
   on = c(stop = 'geo')]
DT[]
# start stop start_lat start_lon stop_lat stop_lon
# 1: w21zux w21ztq 1.403503 103.9142 1.354065 103.9471
# 2: w21z6u w21zkq 1.299133 103.8373 1.310120 103.9032
# 3: w21ztq w21z8s 1.354065 103.9471 1.343079 103.7384
# 4: w21zkq w21z7q 1.310120 103.9032 1.310120 103.8593
# 5: w21z8s w21xxs 1.343079 103.7384 1.343079 103.6945
In this case, it's just an inconvenience, but there are more dynamic use cases where it may be hard for the user to understand the order of the output, and there the convenience of this feature would be all the greater.
IIUC, it would be as simple as changing the return statement to:
return DataFrame::create(_["gh"] = hashes,
                         _["lat"] = lats,
                         _["lng"] = lngs,
                         _["lat_error"] = lat_error,
                         _["lng_error"] = lng_error);
And adding tests; happy to file a PR if you agree it's useful and approve of the API.
It would be nice to have gh_neighbors in addition to gh_neighbours, etc. in the NAMESPACE.
I'll file a PR to this end unless there's opposition.
use lwgeom::geohash
devtools::install_github("ironholds/geohash") -- don't. The package is not maintained.
Just leaving it here for everyone facing the same issue.
Fix the GCC7 bug
Given a geohash, it would be very useful to have a function that returns its bounding box (in lat, long).
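Until such a function exists, a bounding box can be derived from the columns gh_decode() already returns, assuming lat_error and lng_error are the half-height and half-width of the cell and that lng is decoded correctly. gh_bbox below is a hypothetical helper, and the input data frame is built by hand to mimic the "ezs42" example.

```r
# Hypothetical helper: bounding box from a gh_decode()-style data frame.
gh_bbox <- function(decoded) {
  data.frame(
    lat_min = decoded$lat - decoded$lat_error,
    lat_max = decoded$lat + decoded$lat_error,
    lng_min = decoded$lng - decoded$lng_error,
    lng_max = decoded$lng + decoded$lng_error
  )
}

# Hand-built stand-in for the decoded "ezs42" cell (centre plus half-sizes).
decoded <- data.frame(lat = 42.60498046875, lng = -5.60302734375,
                      lat_error = 0.02197265625, lng_error = 0.02197265625)
bbox <- gh_bbox(decoded)
```

For this cell the box spans latitudes 42.5830078125 to 42.626953125 and longitudes -5.625 to -5.5810546875.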
It might be nice to allow gh_neighbours to return not just the neighbors, but also the neighbors' neighbors (up to distinctness, i.e. for n_neighbors = k there should be (2k+1)^2 - 1 geohashes returned), e.g.
# would return the same information as does currently gh_neighbours('dqcjqc')
gh_neighbours('dqcjqc', n_neighbors = 1)
# would return 24 geohashes
gh_neighbours('dqcjqc', n_neighbors = 2)
The 24 geohashes returned would be:
1. NW-NW: dqcjq7
2. NW-N: dqcjqe
3. N-N: dqcjqg
4. N-NE: dqcjr5
5. NE-NE: dqcjr7
6. NE-E: dqcjr6
7. E-E: dqcjr3
8. SE-E: dqcjr2
9. SE-SE: dqcjpr
10. SE-S: dqcjpp
11. S-S: dqcjnz
12. SW-S: dqcjnx
13. SW-SW: dqcjnr
14. SW-W: dqcjq2
15. W-W: dqcjq3
16. NW-W: dqcjq6
17. NW: dqcjqd
18. N: dqcjqf
19. NE: dqcjr4
20. E: dqcjr1
21. SE: dqcjr0
22. S: dqcjqb
23. SW: dqcjq8
24. W: dqcjq9
A few potential issues: NW-W is the same as W-NW.
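One way to sidestep the naming ambiguity is to identify each neighbor by its grid offset (dx, dy) from the centre cell instead of by a chain of compass directions. The sketch below only enumerates the offsets; neighbour_offsets is a hypothetical helper, and turning an offset into a geohash would still require shifting the decoded centre and re-encoding.

```r
# Hypothetical helper: all cell offsets within k steps of the centre cell.
neighbour_offsets <- function(k) {
  steps <- seq(-k, k)
  grid <- expand.grid(dx = steps, dy = steps)
  grid[!(grid$dx == 0 & grid$dy == 0), ]   # drop the centre cell itself
}

offsets_1 <- neighbour_offsets(1)
offsets_2 <- neighbour_offsets(2)
```

Each offset appears exactly once, so "NW-W" and "W-NW" collapse into the single offset (-2, 1), and the row counts are 8 for k = 1 and 24 for k = 2.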
pygeohash has a function to calculate the distance between two geohashes:
geohash_approximate_distance(geohash_1, geohash_2, check_validity=False)
Returns the approximate great-circle distance between two geohashes in meters.
:param geohash_1:
:param geohash_2:
:return:
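A comparable result can be approximated in R by decoding both hashes and applying the haversine formula to their centres. This is a sketch of that general technique, not pygeohash's own method; haversine_m is a hypothetical helper and the Earth radius of 6371000 m is an assumed mean value.

```r
# Hypothetical helper: great-circle distance in metres between two points.
haversine_m <- function(lat1, lng1, lat2, lng2, radius_m = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlng <- (lng2 - lng1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlng / 2)^2
  2 * radius_m * asin(pmin(1, sqrt(a)))   # pmin guards against rounding above 1
}

d_same <- haversine_m(1.403503, 103.9142, 1.403503, 103.9142)
d_quarter <- haversine_m(0, 0, 0, 90)     # a quarter of the equator
```

The function is vectorized, so it can be applied directly to the lat and lng columns of two gh_decode() results.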
L is not a registered component of a geohash, so the following should be NA (and/or issue a warning/error):
gh_decode('NULL')
# lat lng lat_error lng_error
# 1 -66.70898 129.5508 0.08789062 0.1757812
My C++ is weak but I guess base32_codes_index_of returns something incorrect when the character is not in base32_codes.
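As a stopgap on the R side, inputs can be checked against the geohash alphabet (digits plus lowercase letters excluding a, i, l and o) before decoding. is_valid_geohash is a hypothetical helper, and treating uppercase input as invalid is an assumption about the desired behaviour.

```r
# Hypothetical helper: TRUE only for non-empty strings made of geohash characters.
is_valid_geohash <- function(x) {
  grepl("^[0123456789bcdefghjkmnpqrstuvwxyz]+$", x)
}

valid <- is_valid_geohash(c("ezs42", "NULL", "wx4gfbe"))
```

Rows where the check fails could then be set to NA, with a warning, instead of being passed to gh_decode().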
The package appears to have been removed from CRAN:
https://cran.r-project.org/web/packages/geohash/index.html
Package ‘geohash’ was removed from the CRAN repository.
Formerly available versions can be obtained from the archive.
Archived on 2018-10-27 as memory-access errors were not corrected despite multiple reminders over several months.
@Ironholds any more details from their e-mails?
It would be very useful to have a function that returns the geohash centroid.