deng0515001 / lnglat2geo
An offline package for converting longitude/latitude coordinates to province, city, district/county, and township. It uses a spatial-query algorithm, is fast (50,000 lookups/s on a single thread), and is 100% accurate at the province/city/district level.
System.out.println(GeoTrans.determineAdmin(80.540732, 40.79937, CoordinateSystem.BD09(), false));
Could we build s2CellIds at level 11 and create the corresponding S2Cells? At that level one cell covers roughly 90 km², so China would need only about 6,000 effective S2Cells; we could then precompute which provinces/cities/counties each cell intersects.
Use a map from s2CellId to the List of intersecting regions.
At query time, convert the longitude/latitude to an s2CellId and then do the precise check.
After that, a lookup is just: lng/lat → s2CellId → iterate the candidate List. I expect at most a few S2Polygons actually need comparing this way. With my 30-odd province-level S2Polygons, one contains() call averages around 0.003 ms.
Is there a good way to build the s2CellId index, though?
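The cell-index idea above can be sketched without the S2 library by using a plain fixed-degree grid: at build time, record each region as a candidate in every cell its bounds overlap; at query time, test only the few candidates in the point's cell. Everything here (the Region class, the bounding-box test standing in for S2Polygon.contains(), the 1° cell size standing in for S2 level 11) is illustrative, not the project's actual implementation.

```java
import java.util.*;

// Sketch of the cell-index idea with a plain 1-degree grid standing in for
// S2 level-11 cells. Region and its bounding-box test are illustrative only.
public class CellIndexSketch {
    static final double CELL = 1.0; // degrees per cell

    // Encode (lng, lat) into a single grid-cell id.
    static long cellOf(double lng, double lat) {
        long x = (long) Math.floor((lng + 180.0) / CELL);
        long y = (long) Math.floor((lat + 90.0) / CELL);
        return (x << 32) | y;
    }

    static class Region {
        final String name;
        final double minLng, minLat, maxLng, maxLat;
        Region(String name, double minLng, double minLat, double maxLng, double maxLat) {
            this.name = name; this.minLng = minLng; this.minLat = minLat;
            this.maxLng = maxLng; this.maxLat = maxLat;
        }
        // Stand-in for a real polygon containment test (S2Polygon.contains()).
        boolean contains(double lng, double lat) {
            return lng >= minLng && lng <= maxLng && lat >= minLat && lat <= maxLat;
        }
    }

    // Build phase: record each region as a candidate in every cell it overlaps.
    static Map<Long, List<Region>> buildIndex(List<Region> regions) {
        Map<Long, List<Region>> index = new HashMap<>();
        for (Region r : regions)
            for (double lng = Math.floor(r.minLng); lng <= r.maxLng; lng += CELL)
                for (double lat = Math.floor(r.minLat); lat <= r.maxLat; lat += CELL)
                    index.computeIfAbsent(cellOf(lng, lat), k -> new ArrayList<>()).add(r);
        return index;
    }

    // Query phase: only the point's cell's few candidates need a contains() test.
    static String lookup(Map<Long, List<Region>> index, double lng, double lat) {
        for (Region r : index.getOrDefault(cellOf(lng, lat), Collections.emptyList()))
            if (r.contains(lng, lat)) return r.name;
        return null;
    }

    public static void main(String[] args) {
        List<Region> regions = Arrays.asList(
                new Region("box-around-Beijing", 115.4, 39.4, 117.5, 41.1),
                new Region("box-around-Shanghai", 120.8, 30.7, 122.1, 31.9));
        Map<Long, List<Region>> index = buildIndex(regions);
        System.out.println(lookup(index, 116.3, 39.9)); // prints box-around-Beijing
        System.out.println(lookup(index, 121.5, 31.2)); // prints box-around-Shanghai
    }
}
```

The same shape works with S2: replace cellOf with S2CellId at level 11 and Region with an S2Polygon per administrative area; the point of the index is that contains() runs against only a handful of candidates instead of every region.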
Is there data for regions outside China? Can overseas coordinates be resolved to province/city/district?
To speed things up I wrote a multi-threaded version, but the source is in Scala, which I can't read well, so I'm not sure whether it is thread-safe. Asking here.
init time:46567 (initialization)
trans time:467 (lookup time)
Admin(**,北京市,北京城区,海淀区,清河街道,street,CN,110000,110100,110108,11010812,Location(116.298262,39.95993))
I'm not sure how the 50k/s figure on the front page was measured; does calling it from Java add overhead?
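One way to reproduce a number like "50k calls/s" (and to see whether the Java call path adds overhead) is a warmed-up timing loop. fakeDetermineAdmin below is only a stand-in workload; the real GeoTrans.determineAdmin call needs the library and its data files on the classpath.

```java
// Minimal single-threaded throughput harness. Swap fakeDetermineAdmin for the
// real GeoTrans.determineAdmin call when the library is available.
public class Bench {
    static long sink = 0; // consume results so the JIT cannot drop the loop

    static long fakeDetermineAdmin(double lng, double lat) {
        // stand-in workload, not the project's API
        return Double.doubleToLongBits(lng * 31 + lat);
    }

    static double callsPerSecond(int n) {
        // warm-up so the JIT compiles the hot loop before timing starts
        for (int i = 0; i < n / 10; i++) sink += fakeDetermineAdmin(116.3, 39.9);
        long t0 = System.nanoTime();
        for (int i = 0; i < n; i++) sink += fakeDetermineAdmin(116.3 + i * 1e-6, 39.9);
        long t1 = System.nanoTime();
        return n / ((t1 - t0) / 1e9);
    }

    public static void main(String[] args) {
        System.out.printf("%.0f calls/s%n", callsPerSecond(1_000_000));
    }
}
```

Measuring after warm-up matters: timing the very first calls would fold one-time JIT and initialization cost into the per-call figure, which may explain discrepancies between the README number and a naive measurement.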
Hello, on Windows 10 in IDEA I created test.scala and ran:
val admin=determineAdmin(115.11632,40.604412,CoordinateSystem.WGS84,true)
println(admin.toString)
println(admin.city)
println(admin.cityCode)
println(admin.province)
println(admin.provinceCode)
and got the exception below:
Exception in thread "main" java.io.InvalidClassException: scala.collection.immutable.List$SerializationProxy; local class incompatible: stream classdesc serialVersionUID = 1, local class serialVersionUID = -7905219378619747021
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2001)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1848)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2158)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1665)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2403)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2327)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2185)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1665)
at java.io.ObjectInputStream.readArray(ObjectInputStream.java:2091)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1653)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:501)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:459)
at com.dengxq.lnglat2Geo.utils.ObjectSerializer$.deserialize(ObjectSerializer.scala:21)
at com.dengxq.lnglat2Geo.build.AdminDataProvider$AdminLoader$.loadAdminData(AdminDataProvider.scala:149)
at com.dengxq.lnglat2Geo.GeoTransImpl$.adminData$lzycompute(GeoTransImpl.scala:18)
at com.dengxq.lnglat2Geo.GeoTransImpl$.adminData(GeoTransImpl.scala:18)
at com.dengxq.lnglat2Geo.GeoTransImpl$.determineAdmin(GeoTransImpl.scala:54)
at com.dengxq.lnglat2Geo.GeoTrans$.determineAdmin(GeoTrans.scala:34)
at myspark.core.test$.main(test.scala:55)
at myspark.core.test.main(test.scala)
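The serialVersionUID mismatch above points to Scala binary compatibility rather than the calling code: the bundled data file was written with Java serialization, and the serialization proxy of scala.collection.immutable.List differs between Scala major versions, so reading the file back under a different scala-library fails with InvalidClassException. A likely fix is to compile and run against the same Scala major version the project itself builds with; the exact version below is an assumption to verify against the project's own build.sbt.

```scala
// build.sbt: pin scala-library to the version the serialized data was written with.
// "2.11.12" is an assumption; use whatever scalaVersion the upstream build.sbt declares.
scalaVersion := "2.11.12"
```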
Hi, thanks to the author for this implementation. To make it easier for pure-Java users to adopt and customize, I have ported this project to pure Java code.
On top of that, I made the following optimizations:
After switching to a new machine, I read rows from a DataFrame with map and apply determineAdmin to each row's longitude/latitude fields to get the province and city. When I take fewer than 8 rows with limit, results come back quickly; but with limit(8), limit(10), or larger, the job hangs somewhere and the log prints "Exception when trying to compute pagesize, as a result reporting of ProcessTree metrics is stopped".
Locally, sparkConf is set to .setMaster("local[*]").
I haven't yet tried submitting to a server to see whether the same problem occurs there.
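The ProcessTree-metrics warning itself is generally harmless on Windows (Spark cannot read /proc there), so it is probably a symptom rather than the cause. One thing worth ruling out is each task re-paying the very expensive GeoTrans initialization (~46 s in the timing above). A common mitigation is to build the heavy index once per JVM and reuse it for every row, which is what mapPartitions encourages in Spark. Below is a standalone Java sketch of that pattern with a stand-in index class, not Spark code and not the project's API.

```java
import java.util.*;
import java.util.stream.*;

// "Initialize once per JVM, reuse for every row" pattern, shown with a plain
// stream so it runs standalone. ExpensiveGeoIndex stands in for the heavy
// GeoTrans initialization; in Spark the loop body would live in mapPartitions.
public class PartitionInit {
    static class ExpensiveGeoIndex {
        static int buildCount = 0;
        ExpensiveGeoIndex() { buildCount++; /* imagine ~46 s of data loading here */ }
        String lookup(double lng, double lat) { return "region@" + (int) lng; }
    }

    // Lazily built once per JVM (i.e. once per Spark executor), not once per row.
    private static volatile ExpensiveGeoIndex index;
    static ExpensiveGeoIndex index() {
        if (index == null) {
            synchronized (PartitionInit.class) {
                if (index == null) index = new ExpensiveGeoIndex();
            }
        }
        return index;
    }

    public static void main(String[] args) {
        // In Spark this would be the body of df.mapPartitions(...)
        List<String> out = DoubleStream.iterate(100, x -> x + 1).limit(5)
                .mapToObj(lng -> index().lookup(lng, 39.9))
                .collect(Collectors.toList());
        System.out.println(out);
        System.out.println("index built " + ExpensiveGeoIndex.buildCount + " time(s)");
    }
}
```

The double-checked locking with a volatile field makes the lazy build thread-safe, which also matters in local[*] mode where many task threads share one JVM.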
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Got cleaning task CleanAccum(0)
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Cleaning accumulator 0
2021-09-29 11:18:11,423 INFO ContextCleaner:57 Spark Context Cleaner 10046 - Cleaned accumulator 0
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Got cleaning task CleanAccum(35)
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Cleaning accumulator 35
2021-09-29 11:18:11,423 INFO ContextCleaner:57 Spark Context Cleaner 10046 - Cleaned accumulator 35
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Got cleaning task CleanAccum(32)
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Cleaning accumulator 32
2021-09-29 11:18:11,423 INFO ContextCleaner:57 Spark Context Cleaner 10046 - Cleaned accumulator 32
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Got cleaning task CleanAccum(33)
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Cleaning accumulator 33
2021-09-29 11:18:11,423 INFO ContextCleaner:57 Spark Context Cleaner 10046 - Cleaned accumulator 33
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Got cleaning task CleanAccum(3)
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Cleaning accumulator 3
2021-09-29 11:18:11,423 INFO ContextCleaner:57 Spark Context Cleaner 10046 - Cleaned accumulator 3
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Got cleaning task CleanBroadcast(0)
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Cleaning broadcast 0
2021-09-29 11:18:11,423 DEBUG TorrentBroadcast:61 Spark Context Cleaner 10046 - Unpersisting TorrentBroadcast 0
2021-09-29 11:18:11,423 DEBUG BlockManagerSlaveEndpoint:61 block-manager-slave-async-thread-pool-9 10046 - removing broadcast 0
2021-09-29 11:18:11,423 DEBUG BlockManager:61 block-manager-slave-async-thread-pool-9 10046 - Removing broadcast 0
2021-09-29 11:18:11,423 DEBUG BlockManager:61 block-manager-slave-async-thread-pool-9 10046 - Removing block broadcast_0_piece0
2021-09-29 11:18:11,423 DEBUG MemoryStore:61 block-manager-slave-async-thread-pool-9 10046 - Block broadcast_0_piece0 of size 40552 dropped from memory (free 380968049)
2021-09-29 11:18:11,423 INFO BlockManagerInfo:57 dispatcher-event-loop-2 10046 - Removed broadcast_0_piece0 on 10.211.55.3:64053 in memory (size: 39.6 KB, free: 364.1 MB)
2021-09-29 11:18:11,423 DEBUG BlockManagerMaster:61 block-manager-slave-async-thread-pool-9 10046 - Updated info of block broadcast_0_piece0
2021-09-29 11:18:11,423 DEBUG BlockManager:61 block-manager-slave-async-thread-pool-9 10046 - Told master about block broadcast_0_piece0
2021-09-29 11:18:11,423 DEBUG BlockManager:61 block-manager-slave-async-thread-pool-9 10046 - Removing block broadcast_0
2021-09-29 11:18:11,423 DEBUG MemoryStore:61 block-manager-slave-async-thread-pool-9 10046 - Block broadcast_0 of size 400984 dropped from memory (free 381369033)
2021-09-29 11:18:11,423 DEBUG BlockManagerSlaveEndpoint:61 block-manager-slave-async-thread-pool-11 10046 - Done removing broadcast 0, response is 0
2021-09-29 11:18:11,423 DEBUG BlockManagerSlaveEndpoint:61 block-manager-slave-async-thread-pool-11 10046 - Sent response: 0 to 10.211.55.3:64022
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Cleaned broadcast 0
2021-09-29 11:18:16,721 WARN ProcfsMetricsGetter:69 driver-heartbeater 15344 - Exception when trying to compute pagesize, as a result reporting of ProcessTree metrics is stopped