lnglat2geo's People

Contributors

deng0515001

lnglat2geo's Issues

Exception when trying to compute pagesize, as a result reporting of ProcessTree metrics is stopped

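After switching to a different computer, I read rows from a DataFrame with map and apply determineAdmin to each row's longitude and latitude fields to get the province and city. When I use limit to take fewer than 8 rows from the DataFrame, the result comes back quickly. With limit(8), limit(10), or any larger number, the job hangs somewhere and the log prints: Exception when trying to compute pagesize, as a result reporting of ProcessTree metrics is stopped

On my local machine the SparkConf is set with .setMaster("local[*]").

I have not yet tried submitting the job to a server to see whether the same problem occurs there.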

2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Got cleaning task CleanAccum(0)
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Cleaning accumulator 0
2021-09-29 11:18:11,423 INFO ContextCleaner:57 Spark Context Cleaner 10046 - Cleaned accumulator 0
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Got cleaning task CleanAccum(35)
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Cleaning accumulator 35
2021-09-29 11:18:11,423 INFO ContextCleaner:57 Spark Context Cleaner 10046 - Cleaned accumulator 35
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Got cleaning task CleanAccum(32)
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Cleaning accumulator 32
2021-09-29 11:18:11,423 INFO ContextCleaner:57 Spark Context Cleaner 10046 - Cleaned accumulator 32
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Got cleaning task CleanAccum(33)
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Cleaning accumulator 33
2021-09-29 11:18:11,423 INFO ContextCleaner:57 Spark Context Cleaner 10046 - Cleaned accumulator 33
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Got cleaning task CleanAccum(3)
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Cleaning accumulator 3
2021-09-29 11:18:11,423 INFO ContextCleaner:57 Spark Context Cleaner 10046 - Cleaned accumulator 3
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Got cleaning task CleanBroadcast(0)
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Cleaning broadcast 0
2021-09-29 11:18:11,423 DEBUG TorrentBroadcast:61 Spark Context Cleaner 10046 - Unpersisting TorrentBroadcast 0
2021-09-29 11:18:11,423 DEBUG BlockManagerSlaveEndpoint:61 block-manager-slave-async-thread-pool-9 10046 - removing broadcast 0
2021-09-29 11:18:11,423 DEBUG BlockManager:61 block-manager-slave-async-thread-pool-9 10046 - Removing broadcast 0
2021-09-29 11:18:11,423 DEBUG BlockManager:61 block-manager-slave-async-thread-pool-9 10046 - Removing block broadcast_0_piece0
2021-09-29 11:18:11,423 DEBUG MemoryStore:61 block-manager-slave-async-thread-pool-9 10046 - Block broadcast_0_piece0 of size 40552 dropped from memory (free 380968049)
2021-09-29 11:18:11,423 INFO BlockManagerInfo:57 dispatcher-event-loop-2 10046 - Removed broadcast_0_piece0 on 10.211.55.3:64053 in memory (size: 39.6 KB, free: 364.1 MB)
2021-09-29 11:18:11,423 DEBUG BlockManagerMaster:61 block-manager-slave-async-thread-pool-9 10046 - Updated info of block broadcast_0_piece0
2021-09-29 11:18:11,423 DEBUG BlockManager:61 block-manager-slave-async-thread-pool-9 10046 - Told master about block broadcast_0_piece0
2021-09-29 11:18:11,423 DEBUG BlockManager:61 block-manager-slave-async-thread-pool-9 10046 - Removing block broadcast_0
2021-09-29 11:18:11,423 DEBUG MemoryStore:61 block-manager-slave-async-thread-pool-9 10046 - Block broadcast_0 of size 400984 dropped from memory (free 381369033)
2021-09-29 11:18:11,423 DEBUG BlockManagerSlaveEndpoint:61 block-manager-slave-async-thread-pool-11 10046 - Done removing broadcast 0, response is 0
2021-09-29 11:18:11,423 DEBUG BlockManagerSlaveEndpoint:61 block-manager-slave-async-thread-pool-11 10046 - Sent response: 0 to 10.211.55.3:64022
2021-09-29 11:18:11,423 DEBUG ContextCleaner:61 Spark Context Cleaner 10046 - Cleaned broadcast 0
2021-09-29 11:18:16,721 WARN ProcfsMetricsGetter:69 driver-heartbeater 15344 - Exception when trying to compute pagesize, as a result reporting of ProcessTree metrics is stopped

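Java implementation

Hi, thanks to the author for this implementation. To make the project easier for pure-Java users to use and customize, I have ported it to pure Java code.

On top of the port, I made the following optimizations:

  1. Serialization and deserialization
  2. Leb128
  3. Two-level cache (faster initialization: 6.5 s on subsequent starts, 20 s on the first start)
  4. Easier editing and upgrading of the offline data package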

https://github.com/virjar/geoLibChina
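
For readers unfamiliar with the Leb128 item in the list above, here is a minimal sketch of unsigned LEB128 encoding in Java. It only illustrates the general variable-length integer format; the class name `Leb128` and its methods are hypothetical, and how the Java port actually applies LEB128 to its data files is not stated in this thread.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper, not taken from the linked project.
final class Leb128 {
    // Encode a non-negative int as unsigned LEB128: 7 payload bits per byte,
    // with the high bit set on every byte except the last one.
    static byte[] encode(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("value must be non-negative");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        do {
            int b = value & 0x7F;      // take the low 7 bits
            value >>>= 7;
            if (value != 0) {
                b |= 0x80;             // more bytes follow
            }
            out.write(b);
        } while (value != 0);
        return out.toByteArray();
    }

    // Decode a single unsigned LEB128 value from the start of the array.
    static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;         // last byte reached
            }
            shift += 7;
        }
        throw new IllegalArgumentException("truncated LEB128 input");
    }

    public static void main(String[] args) {
        byte[] encoded = encode(624485);
        System.out.println(encoded.length);   // 3 bytes instead of 4
        System.out.println(decode(encoded));  // 624485
    }
}
```

Small values take one byte and larger ones grow as needed, which is why the format is often used to shrink serialized integer-heavy data.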

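Calling the Scala jar from Java takes 400-800 ms

init time:46567 (initialization)
trans time:467 (lookup time)
Admin(**,北京市,北京城区,海淀区,清河街道,street,CN,110000,110100,110108,11010812,Location(116.298262,39.95993))

I don't know how the 50,000/s figure on the home page was obtained. Does calling from Java add overhead?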
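
One way to compare a single-call latency with a throughput figure is to time the first call separately from a batch of repeated calls, since a first call can include one-off costs such as lazy initialization or JIT warm-up. The sketch below does that in Java. `LookupTiming`, `measure`, and the `Runnable` standing in for the real lookup are hypothetical names, and nothing here reproduces how the project's published number was measured.

```java
// Hypothetical timing harness; the Runnable stands in for a real lookup call.
final class LookupTiming {
    // Returns {nanoseconds for the first call, average nanoseconds per later call}.
    static long[] measure(Runnable lookup, int iterations) {
        if (iterations <= 0) {
            throw new IllegalArgumentException("iterations must be positive");
        }
        long t0 = System.nanoTime();
        lookup.run();                               // first (cold) call
        long cold = System.nanoTime() - t0;

        long t1 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            lookup.run();                           // repeated (warm) calls
        }
        long warmAvg = (System.nanoTime() - t1) / iterations;
        return new long[] {cold, warmAvg};
    }

    public static void main(String[] args) {
        // Placeholder workload; replace with the actual lookup under test.
        long[] r = measure(() -> Math.sqrt(116.298262 * 39.95993), 100_000);
        System.out.println("first call ns: " + r[0]);
        System.out.println("avg warm call ns: " + r[1]);
    }
}
```

If the warm average is far below the first-call time, the 400-800 ms observation and a high calls-per-second figure are not necessarily in conflict.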

Exception in thread "main" java.io.InvalidClassException: scala.collection.immutable.List$SerializationProxy; local class incompatible: stream classdesc serialVersionUID = 1, local class serialVersionUID = -7905219378619747021

Hello, I created test.scala in IDEA on Windows 10 and ran:
val admin=determineAdmin(115.11632,40.604412,CoordinateSystem.WGS84,true)
println(admin.toString)
println(admin.city)
println(admin.cityCode)
println(admin.province)
println(admin.provinceCode)

and got the following exception:

Exception in thread "main" java.io.InvalidClassException: scala.collection.immutable.List$SerializationProxy; local class incompatible: stream classdesc serialVersionUID = 1, local class serialVersionUID = -7905219378619747021
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2001)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1848)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2158)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1665)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2403)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2327)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2185)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1665)
at java.io.ObjectInputStream.readArray(ObjectInputStream.java:2091)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1653)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:501)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:459)
at com.dengxq.lnglat2Geo.utils.ObjectSerializer$.deserialize(ObjectSerializer.scala:21)
at com.dengxq.lnglat2Geo.build.AdminDataProvider$AdminLoader$.loadAdminData(AdminDataProvider.scala:149)
at com.dengxq.lnglat2Geo.GeoTransImpl$.adminData$lzycompute(GeoTransImpl.scala:18)
at com.dengxq.lnglat2Geo.GeoTransImpl$.adminData(GeoTransImpl.scala:18)
at com.dengxq.lnglat2Geo.GeoTransImpl$.determineAdmin(GeoTransImpl.scala:54)
at com.dengxq.lnglat2Geo.GeoTrans$.determineAdmin(GeoTrans.scala:34)
at myspark.core.test$.main(test.scala:55)
at myspark.core.test.main(test.scala)
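
The exception says the serialVersionUID stored in the data stream differs from the one computed for the class on the local classpath. One plausible cause, which is an assumption and not confirmed in this thread, is that the scala-library version on the classpath differs from the one used to write the data. A small Java sketch for printing the local value so it can be compared with the one in the message follows; `SerialUidCheck` and `localSerialVersionUid` are hypothetical names, while `ObjectStreamClass.lookup` and `getSerialVersionUID` are standard JDK calls.

```java
import java.io.ObjectStreamClass;

// Hypothetical diagnostic helper, not part of lnglat2Geo.
final class SerialUidCheck {
    // Returns the serialVersionUID the local JVM uses for the named class,
    // or null if the class is not on the classpath or is not serializable.
    static Long localSerialVersionUid(String className) {
        Class<?> cls;
        try {
            cls = Class.forName(className, false, SerialUidCheck.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            return null;
        }
        ObjectStreamClass desc = ObjectStreamClass.lookup(cls);
        if (desc == null) {
            return null;   // lookup returns null for non-serializable classes
        }
        return desc.getSerialVersionUID();
    }

    public static void main(String[] args) {
        // Defaults to the class named in the exception; pass another name to override.
        String name = args.length > 0
                ? args[0]
                : "scala.collection.immutable.List$SerializationProxy";
        System.out.println(name + " -> " + localSerialVersionUid(name));
    }
}
```

Running this with the same classpath as the failing program shows which value the local scala-library provides, which can then be compared against the stream value reported in the exception.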

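About an alternative approach

Would it be possible to build S2 cell IDs at level 11 and create the corresponding S2 cells? Each such cell covers roughly 90 square kilometers, so about 6,000 effective S2 cells would be built for China. Then compute, for each cell, which provinces, cities, and counties it intersects.

Use a map from s2cellId to a List of those regions.

At query time, convert the longitude and latitude to an s2cellId, look it up in the map, and iterate over the List for the exact check. I think this way at most a few S2Polygons actually need to be compared. On my side, with 30-odd province-level S2Polygons, one contain check while iterating should average around 0.003 ms.

But is there a good way to build the s2cellIds?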
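
The index described above can be sketched without the S2 library. In this simplified Java version a fixed-size longitude/latitude grid stands in for S2 cells and axis-aligned boxes stand in for S2Polygons, so it shows only the cell-to-candidates map and the exact check that follows. All names (`GridRegionIndex`, `Region`, the sample regions) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in: grid cells instead of S2 cells, boxes instead of polygons.
final class GridRegionIndex {
    static final class Region {
        final String name;
        final double minLng, minLat, maxLng, maxLat;

        Region(String name, double minLng, double minLat, double maxLng, double maxLat) {
            this.name = name;
            this.minLng = minLng;
            this.minLat = minLat;
            this.maxLng = maxLng;
            this.maxLat = maxLat;
        }

        // Exact containment test (a real version would call the polygon's contains).
        boolean contains(double lng, double lat) {
            return lng >= minLng && lng < maxLng && lat >= minLat && lat < maxLat;
        }
    }

    private final double cellDeg;
    private final Map<Long, List<Region>> cells = new HashMap<>();

    GridRegionIndex(double cellDeg) {
        this.cellDeg = cellDeg;
    }

    private long col(double lng) { return (long) Math.floor((lng + 180.0) / cellDeg); }
    private long row(double lat) { return (long) Math.floor((lat + 90.0) / cellDeg); }
    private long cellId(long row, long col) { return row * 1_000_000L + col; }

    // Build step: register the region in every grid cell its box touches.
    void add(Region r) {
        for (long row = row(r.minLat); row <= row(r.maxLat); row++) {
            for (long col = col(r.minLng); col <= col(r.maxLng); col++) {
                cells.computeIfAbsent(cellId(row, col), k -> new ArrayList<>()).add(r);
            }
        }
    }

    // Query step: find the point's cell, then test only that cell's candidates.
    String lookup(double lng, double lat) {
        List<Region> candidates =
                cells.getOrDefault(cellId(row(lat), col(lng)), Collections.emptyList());
        for (Region r : candidates) {
            if (r.contains(lng, lat)) {
                return r.name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        GridRegionIndex index = new GridRegionIndex(0.5);
        index.add(new Region("region-A", 116.0, 39.0, 117.0, 40.0));
        index.add(new Region("region-B", 117.0, 39.0, 118.0, 40.0));
        System.out.println(index.lookup(116.298262, 39.95993)); // region-A
    }
}
```

The build step here simply enumerates the cells overlapping each region's bounding box. With S2, the equivalent step would typically be a covering of each polygon at the chosen level, which is the part the question above asks about.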
