marook / osm-read
An OpenStreetMap XML and PBF data parser for Node.js and the browser.
License: GNU Lesser General Public License v3.0
Has anyone been able to use osm-read-pbf from node.js? Using the same code I was running successfully in the browser, I get a "TypeError: Invalid non-string/buffer chunk" before any nodes, ways or relations are read (i.e. probably right after it starts reading the pbf file). If needed I can upload the pbf file I'm using, but I'm guessing the problem already occurs with example/test.pbf.
Hi,
I've been playing with the PBF parser for reading OSM data into an (experimental) web app. However, reading large files results in a "too much recursion" error:
too much recursion osm-read-pbf.js:2269
This happens, for example, with the South Yorkshire PBF available here.
Is it possible to fix this, or is it an intrinsic problem with trying to read PBF in JavaScript? :)
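A common workaround for unbounded recursion in JavaScript is to trampoline the recursive step: each call returns a thunk instead of recursing directly, and a driver loop runs the thunks so stack depth stays constant. The sketch below only illustrates the idea; visitNextBlock here is a hypothetical stand-in, not osm-read's actual internals.

```javascript
// Driver loop: keeps calling the returned thunk until a non-function
// value (the final result) comes back. Stack depth never grows.
function trampoline(step) {
    while (typeof step === 'function') {
        step = step();
    }
    return step;
}

// Hypothetical stand-in for a per-block visitor: counts blocks down
// instead of reading real PBF blocks. Instead of calling itself, it
// returns a thunk describing the next step.
function visitNextBlock(remaining, visited) {
    if (remaining === 0) {
        return visited; // done: return the result, not a thunk
    }
    return function () {
        return visitNextBlock(remaining - 1, visited + 1);
    };
}

// One million "blocks" without blowing the stack:
const visited = trampoline(function () {
    return visitNextBlock(1000000, 0);
});
console.log(visited); // 1000000
```

The same transformation applied inside the parser's block-visiting loop would avoid the "too much recursion" limit regardless of file size.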
Hi,
I want to parse a pbf generated by my geoserver and get an "Invalid typed array length" error in arrayBufferReader, line 12; the size variable in line 11 is 445776650.
I tried the same with a Mapbox pbf -> same issue.
Any ideas?
As discussed in #3, support for using pbfParser.js in the browser will be added by separating Node.js dependencies into an abstraction layer. This will be done in step-by-step pull requests, collected in this issue:
This seems to be already work in progress given a1530bf. Would you please post an update here when the fixed version gets published to npm?
Hi,
I'd like to use this module to gather data from the PBF extract for Europe (http://download.geofabrik.de/europe-latest.osm.pbf, 20 GB), but the process quickly runs out of memory and fails.
I tried increasing the memory available to Node.js to 8 GB (--max-old-space-size=8192), but it fails anyway. Is there a way to parse large files with this module?
Here is the output:
<--- Last few GCs --->
264433 ms: Scavenge 8195.2 (8358.6) -> 8195.2 (8358.6) MB, 8.8 / 0 ms (+ 1.8 ms in 1 steps since last GC) [allocation failure] [incremental marking delaying mark-sweep].
270716 ms: Mark-sweep 8195.2 (8358.6) -> 8189.8 (8354.6) MB, 6283.4 / 0 ms (+ 2.5 ms in 2 steps since start of marking, biggest step 1.8 ms) [last resort gc].
276850 ms: Mark-sweep 8189.8 (8354.6) -> 8190.2 (8358.6) MB, 6134.3 / 0 ms [last resort gc].
<--- JS stacktrace --->
==== JS stack trace =========================================
Security context: 0x12b629ab4629 <JS Object>
1: node [/osmread-js/pbfTest.js:~24] [pc=0x17d45f2c517c] (this=0x2360fd08bf11 <an Object with map 0xbe1c1a7d6d9>,node=0x179a95efcaf9 <an Object with map 0xbe1c1a6a8b1>)
2: visitPrimitiveGroup(aka visitPrimitiveGroup) [/osmread-js/lib/pbfParser.js:~158] [pc=0x17d45f2f64d7] (this=0x12b629a041b9 <undefined>,pg=0x2355483166a1 <JS Object>,o...
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory
Abandon (core dumped)
nodejs version: 4.2.6
cmd: nodejs --max-old-space-size=8192 ./pbfTest.js
Thanks.
Hey Markus,
As we discussed briefly, it would be advantageous for me if we could introduce a pause/resume feature to osm-read.
The use-case for this is (for instance) when a consuming service (such as a database) is being flooded by requests: you can either buffer those requests in memory or ask the parser to slow down or stop for a short while.
Buffering in memory can be problematic when the dataset is very large (i.e. the planet file), and flow control mechanisms are very important for streaming interfaces.
The way I see it, this can be achieved in a couple of different ways:
1. Since visitNextBlock is called recursively: when pause() is called, recursion is stopped; when resume() is called, recursion is started again.
2. A next() callback: the consuming service must call next(), otherwise the iterator will not advance.
node: function(node, next){
    console.log('node: ' + JSON.stringify(node));
    next(); // this triggers the next recursion
}
Either way we will need to add pause() and resume() methods to the public API.
I'll leave this issue open so we can discuss it further.
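The next()-based variant can be sketched with a tiny self-contained driver. Everything below is hypothetical illustration, not osm-read's real code: it shows how delivering each item together with a next() callback lets the consumer control when the iteration advances.

```javascript
// Hypothetical next()-style driver: each item is delivered to the
// visitor together with a next() callback, and the driver only
// proceeds to the following item once next() is invoked.
function iterate(items, visitor, done) {
    let i = 0;
    function step() {
        if (i >= items.length) {
            return done();
        }
        const item = items[i++];
        visitor(item, step); // visitor must call step() (aka next())
    }
    step();
}

// Usage sketch: here next() is called immediately; a real consumer
// could defer the call (e.g. until a database write completes) to
// apply back-pressure and slow the parser down arbitrarily.
const seen = [];
iterate(['a', 'b', 'c'], function (item, next) {
    seen.push(item);
    next();
}, function () {
    console.log('seen', seen); // seen [ 'a', 'b', 'c' ]
});
```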
I have the following result after the call to overpass-api (for 'way' output I use the 'out center;' command):
<way id="43989209">
<center lat="68.9280397" lon="33.1139458"/>
<nd ref="559363044"/>
<nd ref="559362513"/>
<nd ref="559362515"/>
<nd ref="559362512"/>
<nd ref="559363044"/>
<tag k="addr:city" v="Мурманск"/>
<tag k="addr:housenumber" v="110"/>
<tag k="addr:street" v="Кольский проспект"/>
<tag k="building" v="yes"/>
<tag k="name" v="Олимп Авто"/>
<tag k="shop" v="car"/>
<tag k="website" v="http://olimp-avto.lada.ru/"/>
</way>
Unfortunately, the 'way' callback is not receiving these tags. Could you suggest a solution to this problem?
It is not possible to run the example/pbf.html.
I followed the instructions and ran npm run browserify to create the file osm-read-pbf.js. When opening the example HTML it throws the following errors:
Uncaught Error: Cannot find module 'bytebuffer' osm-read-pbf.js:1
Uncaught ReferenceError: pbfParser is not defined pbf.html:18
GET http://localhost:8080/inflate.min.js.map 404 (Not Found)
Is there a reason why id, uid and way node/relation member id refs are returned as string instead of number?
The PBF Format defines them all as numbers:
int64 id
int32 uid
sint64 refs
sint64 memids
I think memory usage could be reduced by using numbers instead of strings.
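One likely reason for the strings: JavaScript Numbers are IEEE-754 doubles and represent integers exactly only up to 2^53 - 1, while the PBF format allows full 64-bit ids. The snippet below demonstrates the precision issue; whether this is actually osm-read's rationale is an assumption.

```javascript
// A valid int64 value just above Number.MAX_SAFE_INTEGER (2^53 - 1):
const big = '9007199254740993'; // 2^53 + 1
const asNumber = Number(big);   // silently rounds to 9007199254740992
const asBigInt = BigInt(big);   // exact
console.log(asNumber);          // 9007199254740992
console.log(asBigInt);          // 9007199254740993n

// Current OSM ids are still far below 2^53, so converting today's
// string ids to Number would be safe in practice:
const safeToday = Number.isSafeInteger(Number('148133746'));
console.log(safeToday); // true
```

So parsing ids into plain Numbers would indeed save memory today, at the cost of silent corruption if ids ever exceed 2^53; BigInt avoids both problems.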
I have tried to run the simple code below on a file stored here: http://planet.openstreetmap.nl/benelux/
I run the app using 'node app.js' and, after about 2.5 seconds, the following error message appears.
Do you have any idea what I should do?
Thanks in advance for your help.
Error message:
node_modules/osm-read/lib/pbfParser.js:419
return readPBFElement(fd, fileBlock.blobHeader.position, fileBlock
^
TypeError: Cannot read property 'position' of undefined
at readBlob (/Users/aboujraf/git/bitbucket/openstreetmap/node_modules/osm-read/lib/pbfParser.js:419:59)
at Object.readBlock (/Users/aboujraf/git/bitbucket/openstreetmap/node_modules/osm-read/lib/pbfParser.js:423:20)
at Object.osmread.createPbfParser.callback (/Users/aboujraf/git/bitbucket/openstreetmap/app/app.js:13:16)
at /Users/aboujraf/git/bitbucket/openstreetmap/node_modules/osm-read/lib/pbfParser.js:463:25
at /Users/aboujraf/git/bitbucket/openstreetmap/node_modules/osm-read/lib/pbfParser.js:442:16
at /Users/aboujraf/git/bitbucket/openstreetmap/node_modules/osm-read/lib/pbfParser.js:132:28
at /Users/aboujraf/git/bitbucket/openstreetmap/node_modules/osm-read/lib/pbfParser.js:101:20
at readPBFElementFromBuffer (/Users/aboujraf/git/bitbucket/openstreetmap/node_modules/osm-read/lib/pbfParser.js:63:12)
at /Users/aboujraf/git/bitbucket/openstreetmap/node_modules/osm-read/lib/pbfParser.js:80:16
at Object.wrapper [as oncomplete] (fs.js:454:17)
[1]+ Done clear
source code: app.js
'use strict';

var osmread = require('osm-read');

osmread.createPbfParser({
    filePath: '/planet-benelux-131006.osm.pbf',
    callback: function(err, parser){
        if(err){
            // TODO handle error
        }
        parser.readBlock(parser.findFileBlocksByBlobType('OSMHeader'), function(err, block){
            console.log('header block');
            console.log(block);
            parser.close(function(err){
                if(err){
                    // TODO handle error
                }
            });
        });
    }
});
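Judging from the stack trace (fileBlock.blobHeader is undefined inside readBlock), one plausible cause is that findFileBlocksByBlobType returns an array of file blocks while readBlock expects a single block. This is an unverified hypothesis; check the library source to confirm. A sketch of the corresponding fix, demonstrated against a mock parser:

```javascript
// Hypothesis (unverified): findFileBlocksByBlobType('OSMHeader')
// returns an ARRAY of file blocks, but readBlock expects a single
// block object -- so fileBlock.blobHeader is undefined. The fix would
// be to index into the array and guard against an empty result:
function pickHeaderBlock(parser) {
    const blocks = parser.findFileBlocksByBlobType('OSMHeader');
    if (!blocks || blocks.length === 0) {
        throw new Error('no OSMHeader block found in file');
    }
    return blocks[0]; // pass this single block to parser.readBlock()
}

// Mock parser illustrating the hypothesized return shape:
const mockParser = {
    findFileBlocksByBlobType: function (type) {
        return [{ blobHeader: { position: 42, type: type } }];
    }
};
const block = pickHeaderBlock(mockParser);
console.log(block.blobHeader.position); // 42 with the mock
```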
I can't find a way to output progress information like 'processed 34534 of 998798 blocks (15%)'.
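One possible approach: the random-access API shown elsewhere on this page (createPbfParser / findFileBlocksByBlobType / readBlock) exposes the file's block list, whose length could serve as the total. The formatter below is self-contained; the commented usage against osm-read's API is an assumption to be verified against the library.

```javascript
// Self-contained progress formatter in the style the question asks for.
function formatProgress(done, total) {
    const pct = total > 0 ? Math.floor(done / total * 100) : 0;
    return 'processed ' + done + ' of ' + total + ' blocks (' + pct + '%)';
}

console.log(formatProgress(34534, 998798)); // processed 34534 of 998798 blocks (3%)

// Hedged usage sketch (assumes findFileBlocksByBlobType returns an
// array of data blocks -- verify against the library source):
//   const total = parser.findFileBlocksByBlobType('OSMData').length;
//   let done = 0;
//   // ...inside the per-block callback:
//   console.log(formatProgress(++done, total));
```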
Hi,
Is there a reason why you don't handle OSM relations?
It's a big issue, since some buildings are created with ways and some others with relations.
The relation members format introduced in #17 uses a separate array for each member type:
relationsMembers = { nodes: [], ways: [] };
The problem is that the members of a relation form an ordered list, regardless of type, which cannot be reconstructed from this output. That is, this output format loses information.
Therefore I suggest using a single members array where each element contains a type property with one of node, way or relation as its value. This is what other libs use as well: openstreetmap-json-schema (example), Overpass API JSON, osmtogeojson.
What do you think?
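Concretely, the proposed single-array format could look like the sketch below (the specific ids and roles are made up for illustration; the field names follow the Overpass-style convention mentioned above):

```javascript
// One ordered members array; each entry carries its own type. Member
// order is preserved exactly as in the relation, which the split
// { nodes: [], ways: [] } format cannot guarantee.
const relation = {
    id: '58437', // hypothetical relation id
    tags: { type: 'multipolygon' },
    members: [
        { type: 'way',  ref: '43989209',  role: 'outer' },
        { type: 'node', ref: '559363044', role: '' },
        { type: 'way',  ref: '43989210',  role: 'inner' }
    ]
};

// Reconstructing the ordered sequence of member types is now trivial:
const order = relation.members.map(function (m) { return m.type; });
console.log(order); // [ 'way', 'node', 'way' ]
```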
e.g.
node: {"id":"148133746","lat":0.0038205,"lon":-0.0039445,"tags":{},"version":4,"changeset":19425811,"uid":"741163","user":"JaLooNz"}
Once I obtain the records containing string ids (including the referenced nodes in the ways) I create new BigInt objects to replace their string representations.
Has any consideration been made of parsing them into BigInt values within osm-read?
Perhaps it would be a useful option to have if it were not to be done by default. Making it an option would avoid breaking changes for those who expect string values.
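In the meantime, the conversion described above can live entirely outside osm-read. A minimal sketch of such a wrapping layer, applied to a record shaped like the example node (the function name and which fields get converted are the consumer's choice, not part of osm-read's API):

```javascript
// Opt-in string -> BigInt conversion applied to a parsed record.
// Only the id fields are converted; coordinates stay as Numbers.
function withBigIntIds(node) {
    return Object.assign({}, node, {
        id: BigInt(node.id),
        uid: BigInt(node.uid)
    });
}

const raw = { id: '148133746', uid: '741163', lat: 0.0038205 };
const converted = withBigIntIds(raw);
console.log(typeof converted.id); // 'bigint'
```

Built into osm-read behind an option flag, the same conversion would avoid a second pass over every record while keeping string ids as the non-breaking default.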
I have used the random access features in osm-read to build a different way to iterate through records in the osm.pbf file. The problem I was having before was that, since reading the OSM data was faster than inserting it into SQLite, reading got ahead of writing, and pause was not working as I expected (whereby it would immediately pause the output). I decided to implement an asynchronous iterator interface, not within the codebase of osm-read but on top of its API.
The code I use to iterate objects is as follows:
let c = 0;
let type_counts = {};
for await (const item of reader.objects) {
    //console.log(item);
    const {type} = item;
    if (!type_counts[type]) {
        type_counts[type] = 1;
    } else {
        type_counts[type]++;
    }
    c++;
}
console.log('c', c);
console.log('type_counts', type_counts);
For guernsey-and-jersey I get the following output:
c 513686
type_counts { node: 461240, way: 51971, relation: 475 }
The code I have here is concise, runs quickly when results are consumed quickly, but also runs as slowly as needed when result processing takes more time.
Is this a feature that's worth incorporating into the library?
I would like to coordinate with @marook regarding including this and possibly other features in osm-read.
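For reference, a push-style callback source can be bridged into a pull-style async iterator roughly as sketched below. This is a self-contained illustration with a mock source; the real wrapper described above would hook osm-read's node/way/relation callbacks instead, and is not shown here.

```javascript
// Bridge a push-style source into an async iterable. Items are
// buffered while the consumer lags; when the consumer waits, the
// pending pull is resolved as soon as the next item is pushed.
function makeAsyncIterable(start) {
    const buffer = [];
    let resolveNext = null;
    let finished = false;

    start(function push(item) {
        if (resolveNext) {
            const r = resolveNext;
            resolveNext = null;
            r({ value: item, done: false });
        } else {
            buffer.push(item);
        }
    }, function end() {
        finished = true;
        if (resolveNext) resolveNext({ value: undefined, done: true });
    });

    return {
        [Symbol.asyncIterator]() {
            return {
                next() {
                    if (buffer.length) {
                        return Promise.resolve({ value: buffer.shift(), done: false });
                    }
                    if (finished) {
                        return Promise.resolve({ value: undefined, done: true });
                    }
                    return new Promise(function (res) { resolveNext = res; });
                }
            };
        }
    };
}

// Mock source standing in for osm-read's callbacks:
async function demo() {
    const objects = makeAsyncIterable(function (push, end) {
        ['node', 'node', 'way'].forEach(push);
        end();
    });
    const types = [];
    for await (const t of objects) types.push(t);
    return types;
}
```

The design choice here is that back-pressure falls out naturally: the source only outruns the consumer by the size of the in-memory buffer, and a bounded-buffer variant could additionally pause the underlying parser when the buffer fills.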
So far I have found primitivegroup to always be an array with a single item. Is this always the case?
I'm plagued by this error when dealing with large pbf files.
I tried using the browser zlib and buffer but got a similar error.
The pbf files I'm using are the OSM planet file (~25 GB) and the Geofabrik continent files (~15 GB).
I don't get any errors when using the mapzen metro extracts.
Any idea what might be causing this @marook?
path/osm-read/lib/nodejs/buffer.js:38
for(offset = 0; offset < from.byteLength - 1; ++offset){
^
TypeError: Cannot read property 'byteLength' of undefined
at Object.blobDataToBuffer (path/osm-read/lib/nodejs/buffer.js:38:34)
at Object.inflateBlob (path/osm-read/lib/nodejs/zlib.js:5:22)
at path/osm-read/lib/pbfParser.js:480:22
at Object.readPBFElementFromBuffer (path/osm-read/lib/nodejs/buffer.js:15:12)
at path/osm-read/lib/nodejs/fsReader.js:36:20
at Object.wrapper [as oncomplete] (fs.js:454:17)
How do I fix this issue? I want to use this lib in Angular via webpack.
When I run the example/pbf.html, the tags of objects are not decoded; they seem to remain as raw byte values.
It seems to be the same for the user field:
{
"id": "275452090",
"lat": 51.5075933,
"lon": -0.1076186,
"tags": {
"110,97,109,101": "74,97,109,39,115,32,83,97,110,100,119,105,99,104,32,66,97,114",
"97,109,101,110,105,116,121": "99,97,102,101"
},
"version": 3,
"timestamp": 1256818475000,
"changeset": 2980587,
"uid": "1697",
"user": "110,105,99,107,98"
}
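Those keys and values look like comma-separated UTF-8 byte values: '110,97,109,101' decodes to 'name'. Until the browser build decodes them itself, a consumer-side workaround could look like this sketch (assumes the values really are byte lists in this comma-separated form):

```javascript
// Decode a "110,97,109,101"-style byte string into UTF-8 text.
function decodeBytes(s) {
    const bytes = Uint8Array.from(s.split(',').map(Number));
    return new TextDecoder('utf-8').decode(bytes);
}

// Decode every key and value in a tags object:
function decodeTags(tags) {
    const out = {};
    for (const k of Object.keys(tags)) {
        out[decodeBytes(k)] = decodeBytes(tags[k]);
    }
    return out;
}

const decoded = decodeTags({
    '110,97,109,101': '74,97,109,39,115,32,83,97,110,100,119,105,99,104,32,66,97,114',
    '97,109,101,110,105,116,121': '99,97,102,101'
});
console.log(decoded); // { name: "Jam's Sandwich Bar", amenity: 'cafe' }
```

TextDecoder is available in browsers and in Node.js, so the same helper works in both environments.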