marook / osm-read
An OpenStreetMap XML and PBF data parser for Node.js and the browser.
License: GNU Lesser General Public License v3.0
Has anyone been able to use osm-read-pbf from node.js? Using the same code I was running successfully in the browser, I get a "TypeError: Invalid non-string/buffer chunk" before any nodes, ways or relations are read (i.e. probably right after it starts reading the pbf file). If needed I can upload the pbf file I'm using, but I'm guessing the problem already occurs with example/test.pbf.
Hi,
I've been playing with the PBF parser for reading OSM data into an (experimental) web app. However, reading large files results in a "too much recursion" error:
too much recursion osm-read-pbf.js:2269
This happens, for example, with the South Yorkshire PBF available here.
Is it possible to fix this, or is it an intrinsic problem with trying to read PBF in JavaScript? :)
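A common workaround for unbounded recursion in JavaScript is to trampoline the recursive step: each call returns a thunk instead of recursing directly, and a driver loop runs the thunks so stack depth stays constant. The sketch below only illustrates the idea; visitNextBlock here is a hypothetical stand-in, not osm-read's actual internals.

```javascript
// Driver loop: keeps calling the returned thunk until a non-function
// value (the final result) comes back. Stack depth never grows.
function trampoline(step) {
    while (typeof step === 'function') {
        step = step();
    }
    return step;
}

// Hypothetical stand-in for a per-block visitor: counts blocks down
// instead of reading real PBF blocks. Instead of calling itself, it
// returns a thunk describing the next step.
function visitNextBlock(remaining, visited) {
    if (remaining === 0) {
        return visited; // done: return the result, not a thunk
    }
    return function () {
        return visitNextBlock(remaining - 1, visited + 1);
    };
}

// One million "blocks" without blowing the stack:
const visited = trampoline(function () {
    return visitNextBlock(1000000, 0);
});
console.log(visited); // 1000000
```

The same transformation applied inside the parser's block-visiting loop would avoid the "too much recursion" limit regardless of file size.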
Hi,
I want to parse a pbf generated by my geoserver and get an "Invalid typed array length" error in arrayBufferReader, line 12; the size variable in line 11 is 445776650.
I tried the same with a Mapbox pbf -> same issue.
Any ideas?
As discussed in #3, support for using pbfParser.js in the browser will be added by separating Node.js dependencies into an abstraction layer. This will be done in step-by-step pull requests, collected in this issue:
This seems to be already work in progress given a1530bf. Would you please post an update here when the fixed version gets published to npm?
Hi,
I'd like to use this module to gather data from the PBF extract for Europe (http://download.geofabrik.de/europe-latest.osm.pbf, 20 GB), but the process quickly runs out of memory and fails.
I tried increasing the memory available to Node.js to 8 GB (--max-old-space-size=8192), but it fails anyway. Is there a way to parse large files with this module?
Here is the output:
<--- Last few GCs --->
264433 ms: Scavenge 8195.2 (8358.6) -> 8195.2 (8358.6) MB, 8.8 / 0 ms (+ 1.8 ms in 1 steps since last GC) [allocation failure] [incremental marking delaying mark-sweep].
270716 ms: Mark-sweep 8195.2 (8358.6) -> 8189.8 (8354.6) MB, 6283.4 / 0 ms (+ 2.5 ms in 2 steps since start of marking, biggest step 1.8 ms) [last resort gc].
276850 ms: Mark-sweep 8189.8 (8354.6) -> 8190.2 (8358.6) MB, 6134.3 / 0 ms [last resort gc].
<--- JS stacktrace --->
==== JS stack trace =========================================
Security context: 0x12b629ab4629 <JS Object>
1: node [/osmread-js/pbfTest.js:~24] [pc=0x17d45f2c517c] (this=0x2360fd08bf11 <an Object with map 0xbe1c1a7d6d9>,node=0x179a95efcaf9 <an Object with map 0xbe1c1a6a8b1>)
2: visitPrimitiveGroup(aka visitPrimitiveGroup) [/osmread-js/lib/pbfParser.js:~158] [pc=0x17d45f2f64d7] (this=0x12b629a041b9 <undefined>,pg=0x2355483166a1 <JS Object>,o...
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory
Abandon (core dumped)
nodejs version: 4.2.6
cmd: nodejs --max-old-space-size=8192 ./pbfTest.js
Thanks.
Hey Markus,
As we discussed briefly, it would be advantageous for me if we could introduce a pause/resume feature to osm-read.
The use-case for this is (for instance) when a consuming service (such as a database) is being flooded by requests: you can either buffer those requests in memory or ask the parser to slow down or stop for a short while.
Buffering in memory can be problematic when the dataset is very large (i.e. the planet file), and flow control mechanisms are very important for streaming interfaces.
The way I see it, this can be achieved in a couple of different ways:
1. Since visitNextBlock is called recursively: when pause() is called, recursion is stopped; when resume() is called, recursion is started again.
2. A next() callback: the consuming service must call next(), otherwise the iterator will not advance.
node: function(node, next){
    console.log('node: ' + JSON.stringify(node));
    next(); // this triggers the next recursion
}
Either way we will need to add pause() and resume() methods to the public API.
I'll leave this issue open so we can discuss it further.
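The next()-based variant can be sketched with a tiny self-contained driver. Everything below is hypothetical illustration, not osm-read's real code: it shows how delivering each item together with a next() callback lets the consumer control when the iteration advances.

```javascript
// Hypothetical next()-style driver: each item is delivered to the
// visitor together with a next() callback, and the driver only
// proceeds to the following item once next() is invoked.
function iterate(items, visitor, done) {
    let i = 0;
    function step() {
        if (i >= items.length) {
            return done();
        }
        const item = items[i++];
        visitor(item, step); // visitor must call step() (aka next())
    }
    step();
}

// Usage sketch: here next() is called immediately; a real consumer
// could defer the call (e.g. until a database write completes) to
// apply back-pressure and slow the parser down arbitrarily.
const seen = [];
iterate(['a', 'b', 'c'], function (item, next) {
    seen.push(item);
    next();
}, function () {
    console.log('seen', seen); // seen [ 'a', 'b', 'c' ]
});
```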
I have the following result after the call to overpass-api (for 'way' output I use the 'out center;' command):
<way id="43989209">
<center lat="68.9280397" lon="33.1139458"/>
<nd ref="559363044"/>
<nd ref="559362513"/>
<nd ref="559362515"/>
<nd ref="559362512"/>
<nd ref="559363044"/>
<tag k="addr:city" v="Мурманск"/>
<tag k="addr:housenumber" v="110"/>
<tag k="addr:street" v="Кольский проспект"/>
<tag k="building" v="yes"/>
<tag k="name" v="Олимп Авто"/>
<tag k="shop" v="car"/>
<tag k="website" v="http://olimp-avto.lada.ru/"/>
</way>
Unfortunately, the 'way' callback is not receiving these tags. Could you suggest a solution to this problem?
It is not possible to run the example/pbf.html.
I followed the instructions and ran npm run browserify to create the file osm-read-pbf.js. When opening the example HTML it throws the following errors:
Uncaught Error: Cannot find module 'bytebuffer' osm-read-pbf.js:1
Uncaught ReferenceError: pbfParser is not defined pbf.html:18
GET http://localhost:8080/inflate.min.js.map 404 (Not Found)
Is there a reason why id, uid and way node/relation member id refs are returned as string instead of number?
The PBF Format defines them all as numbers:
int64 id
int32 uid
sint64 refs
sint64 memids
I think memory usage could be reduced by using numbers instead of strings.
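One likely reason for the strings: JavaScript Numbers are IEEE-754 doubles and represent integers exactly only up to 2^53 - 1, while the PBF format allows full 64-bit ids. The snippet below demonstrates the precision issue; whether this is actually osm-read's rationale is an assumption.

```javascript
// A valid int64 value just above Number.MAX_SAFE_INTEGER (2^53 - 1):
const big = '9007199254740993'; // 2^53 + 1
const asNumber = Number(big);   // silently rounds to 9007199254740992
const asBigInt = BigInt(big);   // exact
console.log(asNumber);          // 9007199254740992
console.log(asBigInt);          // 9007199254740993n

// Current OSM ids are still far below 2^53, so converting today's
// string ids to Number would be safe in practice:
const safeToday = Number.isSafeInteger(Number('148133746'));
console.log(safeToday); // true
```

So parsing ids into plain Numbers would indeed save memory today, at the cost of silent corruption if ids ever exceed 2^53; BigInt avoids both problems.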
I have tried to run the simple code below on a file stored here: http://planet.openstreetmap.nl/benelux/
I run the app using 'node app.js' and, after about 2.5 seconds, the following error message appears.
Do you have any idea what I should do?
Thanks in advance for your help.
Error message:
node_modules/osm-read/lib/pbfParser.js:419
return readPBFElement(fd, fileBlock.blobHeader.position, fileBlock
^
TypeError: Cannot read property 'position' of undefined
at readBlob (/Users/aboujraf/git/bitbucket/openstreetmap/node_modules/osm-read/lib/pbfParser.js:419:59)
at Object.readBlock (/Users/aboujraf/git/bitbucket/openstreetmap/node_modules/osm-read/lib/pbfParser.js:423:20)
at Object.osmread.createPbfParser.callback (/Users/aboujraf/git/bitbucket/openstreetmap/app/app.js:13:16)
at /Users/aboujraf/git/bitbucket/openstreetmap/node_modules/osm-read/lib/pbfParser.js:463:25
at /Users/aboujraf/git/bitbucket/openstreetmap/node_modules/osm-read/lib/pbfParser.js:442:16
at /Users/aboujraf/git/bitbucket/openstreetmap/node_modules/osm-read/lib/pbfParser.js:132:28
at /Users/aboujraf/git/bitbucket/openstreetmap/node_modules/osm-read/lib/pbfParser.js:101:20
at readPBFElementFromBuffer (/Users/aboujraf/git/bitbucket/openstreetmap/node_modules/osm-read/lib/pbfParser.js:63:12)
at /Users/aboujraf/git/bitbucket/openstreetmap/node_modules/osm-read/lib/pbfParser.js:80:16
at Object.wrapper [as oncomplete] (fs.js:454:17)
[1]+ Done clear
source code: app.js
'use strict';

var osmread = require('osm-read');

osmread.createPbfParser({
    filePath: '/planet-benelux-131006.osm.pbf',
    callback: function(err, parser){
        if(err){
            // TODO handle error
        }
        parser.readBlock(parser.findFileBlocksByBlobType('OSMHeader'), function(err, block){
            console.log('header block');
            console.log(block);
            parser.close(function(err){
                if(err){
                    // TODO handle error
                }
            });
        });
    }
});
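Judging from the stack trace (fileBlock.blobHeader is undefined inside readBlock), one plausible cause is that findFileBlocksByBlobType returns an array of file blocks while readBlock expects a single block. This is an unverified hypothesis; check the library source to confirm. A sketch of the corresponding fix, demonstrated against a mock parser:

```javascript
// Hypothesis (unverified): findFileBlocksByBlobType('OSMHeader')
// returns an ARRAY of file blocks, but readBlock expects a single
// block object -- so fileBlock.blobHeader is undefined. The fix would
// be to index into the array and guard against an empty result:
function pickHeaderBlock(parser) {
    const blocks = parser.findFileBlocksByBlobType('OSMHeader');
    if (!blocks || blocks.length === 0) {
        throw new Error('no OSMHeader block found in file');
    }
    return blocks[0]; // pass this single block to parser.readBlock()
}

// Mock parser illustrating the hypothesized return shape:
const mockParser = {
    findFileBlocksByBlobType: function (type) {
        return [{ blobHeader: { position: 42, type: type } }];
    }
};
const block = pickHeaderBlock(mockParser);
console.log(block.blobHeader.position); // 42 with the mock
```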
I can't find a way to output progress information like 'processed 34534 of 998798 blocks (15%)'.
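One possible approach: the random-access API shown elsewhere on this page (createPbfParser / findFileBlocksByBlobType / readBlock) exposes the file's block list, whose length could serve as the total. The formatter below is self-contained; the commented usage against osm-read's API is an assumption to be verified against the library.

```javascript
// Self-contained progress formatter in the style the question asks for.
function formatProgress(done, total) {
    const pct = total > 0 ? Math.floor(done / total * 100) : 0;
    return 'processed ' + done + ' of ' + total + ' blocks (' + pct + '%)';
}

console.log(formatProgress(34534, 998798)); // processed 34534 of 998798 blocks (3%)

// Hedged usage sketch (assumes findFileBlocksByBlobType returns an
// array of data blocks -- verify against the library source):
//   const total = parser.findFileBlocksByBlobType('OSMData').length;
//   let done = 0;
//   // ...inside the per-block callback:
//   console.log(formatProgress(++done, total));
```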
Hi,
Is there a reason why you don't handle OSM relations?
It's a big issue, since some buildings are created with ways and some others with relations.
The relation members format introduced in #17 uses a separate array for each member type:
relationsMembers = { nodes: [], ways: [] };
The problem is that the members of a relation form an ordered list, regardless of type, which cannot be reconstructed from this output. That is, this output format loses information.
Therefore I suggest using a single members array where each element contains a type property with one of node, way or relation as its value. This is what other libs use as well: openstreetmap-json-schema (example), Overpass API JSON, osmtogeojson.
What do you think?
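Concretely, the proposed single-array format could look like the sketch below (the specific ids and roles are made up for illustration; the field names follow the Overpass-style convention mentioned above):

```javascript
// One ordered members array; each entry carries its own type. Member
// order is preserved exactly as in the relation, which the split
// { nodes: [], ways: [] } format cannot guarantee.
const relation = {
    id: '58437', // hypothetical relation id
    tags: { type: 'multipolygon' },
    members: [
        { type: 'way',  ref: '43989209',  role: 'outer' },
        { type: 'node', ref: '559363044', role: '' },
        { type: 'way',  ref: '43989210',  role: 'inner' }
    ]
};

// Reconstructing the ordered sequence of member types is now trivial:
const order = relation.members.map(function (m) { return m.type; });
console.log(order); // [ 'way', 'node', 'way' ]
```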
e.g.
node: {"id":"148133746","lat":0.0038205,"lon":-0.0039445,"tags":{},"version":4,"changeset":19425811,"uid":"741163","user":"JaLooNz"}
Once I obtain the records containing string ids (including the referenced nodes in the ways) I create new BigInt objects to replace their string representations.
Has any consideration been made of parsing them into BigInt values within osm-read?
Perhaps it would be a useful option to have if it were not to be done by default. Making it an option would avoid breaking changes for those who expect string values.
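In the meantime, the conversion described above can live entirely outside osm-read. A minimal sketch of such a wrapping layer, applied to a record shaped like the example node (the function name and which fields get converted are the consumer's choice, not part of osm-read's API):

```javascript
// Opt-in string -> BigInt conversion applied to a parsed record.
// Only the id fields are converted; coordinates stay as Numbers.
function withBigIntIds(node) {
    return Object.assign({}, node, {
        id: BigInt(node.id),
        uid: BigInt(node.uid)
    });
}

const raw = { id: '148133746', uid: '741163', lat: 0.0038205 };
const converted = withBigIntIds(raw);
console.log(typeof converted.id); // 'bigint'
```

Built into osm-read behind an option flag, the same conversion would avoid a second pass over every record while keeping string ids as the non-breaking default.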
I have used the random access features in osm-read to build a different way to iterate through records in the osm.pbf file. The problem I was having before was that, since reading the OSM data was faster than inserting it into SQLite, reading got ahead of writing, and pause was not working as I expected (whereby it would immediately pause the output). I decided to implement an asynchronous iterator interface, not within the codebase of osm-read but on top of its API.
The code I use to iterate objects is as follows:
let c = 0;
let type_counts = {};
for await (const item of reader.objects) {
    //console.log(item);
    const {type} = item;
    if (!type_counts[type]) {
        type_counts[type] = 1;
    } else {
        type_counts[type]++;
    }
    c++;
}
console.log('c', c);
console.log('type_counts', type_counts);
For guernsey-and-jersey I get the following output:
c 513686
type_counts { node: 461240, way: 51971, relation: 475 }
The code I have here is concise, runs quickly when results are consumed quickly, but also runs as slowly as needed when result processing takes more time.
Is this a feature that's worth incorporating into the library?
I would like to coordinate with @marook regarding including this and possibly other features in osm-read.
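For reference, a push-style callback source can be bridged into a pull-style async iterator roughly as sketched below. This is a self-contained illustration with a mock source; the real wrapper described above would hook osm-read's node/way/relation callbacks instead, and is not shown here.

```javascript
// Bridge a push-style source into an async iterable. Items are
// buffered while the consumer lags; when the consumer waits, the
// pending pull is resolved as soon as the next item is pushed.
function makeAsyncIterable(start) {
    const buffer = [];
    let resolveNext = null;
    let finished = false;

    start(function push(item) {
        if (resolveNext) {
            const r = resolveNext;
            resolveNext = null;
            r({ value: item, done: false });
        } else {
            buffer.push(item);
        }
    }, function end() {
        finished = true;
        if (resolveNext) resolveNext({ value: undefined, done: true });
    });

    return {
        [Symbol.asyncIterator]() {
            return {
                next() {
                    if (buffer.length) {
                        return Promise.resolve({ value: buffer.shift(), done: false });
                    }
                    if (finished) {
                        return Promise.resolve({ value: undefined, done: true });
                    }
                    return new Promise(function (res) { resolveNext = res; });
                }
            };
        }
    };
}

// Mock source standing in for osm-read's callbacks:
async function demo() {
    const objects = makeAsyncIterable(function (push, end) {
        ['node', 'node', 'way'].forEach(push);
        end();
    });
    const types = [];
    for await (const t of objects) types.push(t);
    return types;
}
```

The design choice here is that back-pressure falls out naturally: the source only outruns the consumer by the size of the in-memory buffer, and a bounded-buffer variant could additionally pause the underlying parser when the buffer fills.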
So far I have found primitivegroup to always be an array with a single item. Is this always the case?
I'm plagued by this error when dealing with large pbf files.
I tried using the browser zlib and buffer but got a similar error.
The pbf files I'm using are the OSM planet file (~25 GB) and the Geofabrik continent files (~15 GB).
I don't get any errors when using the mapzen metro extracts.
Any idea what might be causing this @marook?
path/osm-read/lib/nodejs/buffer.js:38
for(offset = 0; offset < from.byteLength - 1; ++offset){
^
TypeError: Cannot read property 'byteLength' of undefined
at Object.blobDataToBuffer (path/osm-read/lib/nodejs/buffer.js:38:34)
at Object.inflateBlob (path/osm-read/lib/nodejs/zlib.js:5:22)
at path/osm-read/lib/pbfParser.js:480:22
at Object.readPBFElementFromBuffer (path/osm-read/lib/nodejs/buffer.js:15:12)
at path/osm-read/lib/nodejs/fsReader.js:36:20
at Object.wrapper [as oncomplete] (fs.js:454:17)
How do I fix this issue? I want to use this lib in Angular via webpack.
When I run the example/pbf.html, the tags of objects are not decoded; they seem to remain as raw byte values.
It seems to be the same for the user field:
{
"id": "275452090",
"lat": 51.5075933,
"lon": -0.1076186,
"tags": {
"110,97,109,101": "74,97,109,39,115,32,83,97,110,100,119,105,99,104,32,66,97,114",
"97,109,101,110,105,116,121": "99,97,102,101"
},
"version": 3,
"timestamp": 1256818475000,
"changeset": 2980587,
"uid": "1697",
"user": "110,105,99,107,98"
}
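Those keys and values look like comma-separated UTF-8 byte values: '110,97,109,101' decodes to 'name'. Until the browser build decodes them itself, a consumer-side workaround could look like this sketch (assumes the values really are byte lists in this comma-separated form):

```javascript
// Decode a "110,97,109,101"-style byte string into UTF-8 text.
function decodeBytes(s) {
    const bytes = Uint8Array.from(s.split(',').map(Number));
    return new TextDecoder('utf-8').decode(bytes);
}

// Decode every key and value in a tags object:
function decodeTags(tags) {
    const out = {};
    for (const k of Object.keys(tags)) {
        out[decodeBytes(k)] = decodeBytes(tags[k]);
    }
    return out;
}

const decoded = decodeTags({
    '110,97,109,101': '74,97,109,39,115,32,83,97,110,100,119,105,99,104,32,66,97,114',
    '97,109,101,110,105,116,121': '99,97,102,101'
});
console.log(decoded); // { name: "Jam's Sandwich Bar", amenity: 'cafe' }
```

TextDecoder is available in browsers and in Node.js, so the same helper works in both environments.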