paulmach/osm
General purpose library for reading, writing and working with OpenStreetMap data
License: MIT License
OSM XML files have a <bounds> as one of the first elements. This is not currently exposed by the osmxml.Scanner. I think there are two ways of doing this:
1. Object returns an interface{} instead of osm.Object, so it can return *osm.Bounds.
2. A Bounds method, a bit like osmpbf.Scanner.Header. This might be nicer from an API point of view, but introduces significantly more complexity. E.g. the osmxml.Scanner now needs two states: the state when it is scanning for the <bounds> and the state where it is doing regular scanning with Scan. This is tricky to get right because it's not well-defined where in the document the <bounds> element should be. We could define it as "before an element recognised by the Scan method", but this would require some lookahead in case the element doesn't exist. (An option is to require all documents to contain a <bounds> element at a particular location.)
@paulmach have you thought about this at all?
For reference, an example OSM XML file with <bounds>:
<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6" generator="Overpass API 0.7.55.7 8b86ff77">
<note>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</note>
<meta osm_base="2019-09-12T17:16:02Z"/>
<bounds minlat="51.4072000" minlon="-0.2907000" maxlat="51.5913000" maxlon="0.1659000"/>
<node id="1" lat="51.4779481" lon="-0.0014863" version="20" timestamp="2018-10-31T10:20:19Z" changeset="64040630" uid="1778799" user="SomeoneElse_Revert"/>
<node id="78112" lat="51.5269760" lon="-0.1457924" version="3" timestamp="2018-10-15T14:49:44Z" changeset="63545352" uid="486343" user="peregrination"/>
...
Hello and thank you for this library.
In libosmium, osmium::io::Reader can be initialized with osmium::osm_entities::bits to specify which OSM entities should be parsed and passed to the handler (docs, example). In my experience with libosmium, specifying only the entities you want to handle significantly improves read performance.
I believe it can be done without changing the current behavior by adding a new initialization function, perhaps osmpbf.NewWithEntities.
I appreciate your consideration.
Decoding OSM PBF files that contain 'dense nodes' fails:
panic: dense node does not include all required fields
goroutine 1 [running]:
main.main()
/Users/[user]/repos/[repo]/cmd/importer/main.go:76 +0x468
exit status 2
The error occurs within the example code provided in the README.md of the osmpbf package.
The error does not occur with OSM PBF files that do not contain 'dense nodes', like Antarctica from Geofabrik.
Is this a bug or have I missed anything?
After some research, it seems to me that another library just ignores the missing fields and the decoding succeeds. Thus, the issue seems to be closely related to my OSM PBF file (which I unfortunately cannot share due to its size). It would be interesting to hear if your implementation supports a less strict mode, which also just ignores these fields.
Is there a specific reason why the osmpbf part of the project has n output channels that are fed in order into a serializer channel? (decode.go function line number
The upside is that it makes testing much easier, since we get predictable outputs. However, in my case I do not see a reason why my application needs a special ordering of the elements. Am I missing something important about why one should process objects in order? (In other words: why doesn't the decoder push directly to the serializer?)
This is just a question and not an issue.
I understand how OSM data is structured. I use the scanner to extract relations, ways and nodes:
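package main

import (
	"context"
	"os"

	"github.com/paulmach/osm"
	"github.com/paulmach/osm/osmpbf"
)

func main() {
	r, err := os.Open("/tmp/myarea.osm.pbf")
	if err != nil {
		panic(err)
	}
	defer r.Close()

	background := context.Background()
	scanner := osmpbf.New(background, r, 0)
	defer scanner.Close()

	for scanner.Scan() {
		o := scanner.Object()
		switch o.(type) {
		case *osm.Relation:
			// relation
		case *osm.Way:
			// way
		case *osm.Node:
			// node
		}
	}
	if err := scanner.Err(); err != nil {
		panic(err)
	}
}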
I thought about using maps to store IDs and so on, but I have the feeling that someone has already written a solution for this.
Are there any best practices, paths or library APIs to get an orb.Geometry struct of a relation read from a .pbf file?
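The map-based idea can be sketched as follows, under some assumptions: node coordinates are collected into a map keyed by node ID in a first pass, and way node references are resolved against that map afterwards. Point, Node, Way and lineString are simplified stand-ins and not the types of the osm or orb packages. Assembling relation (multipolygon) rings from member ways is left out.

```go
package main

import "fmt"

// Point, Node and Way are simplified stand-ins for the library types.
type Point struct{ Lon, Lat float64 }

type Node struct {
	ID       int64
	Lat, Lon float64
}

type Way struct {
	ID       int64
	NodeRefs []int64
}

// lineString resolves a way's node references against the index.
// It returns false if any referenced node is missing.
func lineString(w Way, index map[int64]Point) ([]Point, bool) {
	line := make([]Point, 0, len(w.NodeRefs))
	for _, ref := range w.NodeRefs {
		p, ok := index[ref]
		if !ok {
			return nil, false
		}
		line = append(line, p)
	}
	return line, true
}

func main() {
	// First pass: index node coordinates by ID.
	nodes := []Node{{ID: 1, Lat: 51.5, Lon: -0.1}, {ID: 2, Lat: 51.6, Lon: -0.2}}
	index := make(map[int64]Point, len(nodes))
	for _, n := range nodes {
		index[n.ID] = Point{Lon: n.Lon, Lat: n.Lat}
	}

	// Second pass: resolve the references of each way.
	line, ok := lineString(Way{ID: 10, NodeRefs: []int64{1, 2, 1}}, index)
	fmt.Println(line, ok)
}
```

For large extracts an in-memory map of every node may not fit, so this is only a starting point.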
The osmgeojson package currently doesn't support converting the output of Overpass queries (output as geom) to GeoJSON. An example:
If you use Overpass to request the geometry of Manchester as follows:
area["ISO3166-1:alpha2"~"^gb$",i]["admin_level"=2]->.a;
(
relation(area.a)[name~"^.*manchester.*$",i][admin_level=8];
);
out geom;
The result is an OSM XML document with 1 relation, Manchester, containing members (interesting nodes and all ways that are needed to create the geometry of Manchester).
Currently the library needs all ways and nodes that are members of a relation to be separately defined in the OSM XML.
If you find the time!
The version information is encoded as a JSON number instead of a string by the Overpass API. This causes unmarshaling to fail with the error: json: cannot unmarshal number into Go struct field .version of type string
Take, for example, the minimal JSON output of overpass (http://overpass-api.de/api/interpreter?data=[out:json];out;), e.g.
{
"version": 0.6,
"generator": "Overpass API 0.7.59 e21c39fe",
"osm3s": {
"timestamp_osm_base": "2022-11-17T17:14:39Z",
"copyright": "The data included in this document is from www.openstreetmap.org. The data is made available under ODbL."
},
"elements": []
}
Minimum viable example failure: https://go.dev/play/p/n7kBY3CCbHC
package main

import (
	"encoding/json"
	"fmt"

	"github.com/paulmach/osm"
)

func main() {
	var overpass osm.OSM
	buf := []byte(`{
"version": 0.6,
"generator": "Overpass API 0.7.59 e21c39fe",
"osm3s": {
"timestamp_osm_base": "2022-11-17T17:14:39Z",
"copyright": "The data included in this document is from www.openstreetmap.org. The data is made available under ODbL."
},
"elements": []
}`)
	err := json.Unmarshal(buf, &overpass)
	if err != nil {
		fmt.Println(err)
	}
}
The OSM standard (XML) format is described here: https://wiki.openstreetmap.org/wiki/OSM_XML
Example (OSM standard):
<way id="750000000000">
<nd ref="750000000000"/>
<nd ref="750000000001"/>
<nd ref="750000000002"/>
<nd ref="750000000003"/>
<nd ref="750000000000"/>
<tag k="ele" v="20"/>
<tag k="contour" v="elevation"/>
<tag k="contour_ext" v="elevation_minor"/>
<tag k="ed20" v="20"/>
<tag k="ed10" v="10"/>
</way>
The Golang standard XML encoder (xml.MarshalIndent(osmWay, " ", " ")) leads to this:
<way id="750000000000" user="" uid="0" visible="true" version="0" changeset="0" timestamp="1970-01-01T00:00:00Z">
<nd ref="750000000000"></nd>
<nd ref="750000000001"></nd>
<nd ref="750000000002"></nd>
<nd ref="750000000003"></nd>
<nd ref="750000000000"></nd>
<tag k="ele" v="20"></tag>
<tag k="contour" v="elevation"></tag>
<tag k="contour_ext" v="elevation_minor"></tag>
<tag k="ed20" v="20"></tag>
<tag k="ed10" v="10"></tag>
</way>
Some tools (e.g. osmconvert) do not work with the XML generated by Go.
Question: How can the 'problem' with the additional closing XML tags be solved?
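Since encoding/xml always writes an explicit end tag, one workaround is to post-process the marshaled bytes. The sketch below collapses the end tags of the childless nd, tag and member elements into self-closing form. The nd and way types are simplified stand-ins for the osm types, and selfClose is a hypothetical helper. It does not address the zero-valued attributes (user, uid, version and so on) shown above.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"regexp"
)

// nd and way are simplified stand-ins for the library types.
type nd struct {
	Ref int64 `xml:"ref,attr"`
}

type way struct {
	XMLName xml.Name `xml:"way"`
	ID      int64    `xml:"id,attr"`
	Nodes   []nd     `xml:"nd"`
}

// emptyClose matches the explicit end tag of OSM elements that never have children.
var emptyClose = regexp.MustCompile(`></(nd|tag|member)>`)

// selfClose rewrites <nd ...></nd> as <nd .../>, and likewise for tag and member.
func selfClose(data []byte) []byte {
	return emptyClose.ReplaceAll(data, []byte("/>"))
}

func main() {
	w := way{ID: 750000000000, Nodes: []nd{{Ref: 750000000000}, {Ref: 750000000001}}}
	data, err := xml.MarshalIndent(w, "", " ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(selfClose(data)))
}
```

This relies on nd, tag and member never having child elements or text, and on the encoder escaping ">" inside attribute values, so the pattern can only match an empty element.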
https://pkg.go.dev/github.com/paulmach/osm does not show version v0.2
I presume this is because v0.2 does not follow semantic versioning (the tag string needs to be v0.2.0, afaik).
Hi there. First off, thanks for this great library.
When trying to scan a 700 MB *.osm.pbf file, the memory usage spikes to over 145 GB with just a single goroutine. I know this because I am running this task in an AWS CodeBuild project with the maximum memory allowed (145 GB) and it still gets killed.
This is surprising considering that the Osmosis CLI tool can accomplish the same task (scanning the 700 MB pbf file and filtering by a bounding box) without exceeding 7 GB of memory.
Why does this tool take over 20 times more memory than Osmosis, and is there anything we can do to improve this? I'm willing to help if it's a big task.
Thanks.
I am trying to get the polygon coordinates from Osm Ways and Relations. Both seem to have the same behavior. Below I have posted my code. Is there anything that I am doing wrong or why would it not output the Lat and Lon?
Thanks
func main() {
	f, err := os.Open("latest.osm.pbf")
	if err != nil {
		panic(err)
	}
	defer f.Close()
	nodeChan := make(chan *osm.Node)
	go nodes(nodeChan)
	wayChan := make(chan *osm.Way)
	go ways(wayChan)
	relationChan := make(chan *osm.Relation)
	go relations(relationChan)
	scanner := osmpbf.New(context.Background(), f, 3)
	defer scanner.Close()
	for scanner.Scan() {
		switch e := scanner.Object().(type) {
		case *osm.Node:
			//fmt.Println(e.Point())
			//nodeChan <- e
			//fmt.Println(e.TagMap())
		case *osm.Way:
			//wayChan <- e
			fmt.Println(e.Nodes.Bound())
			//fmt.Println("way")
		case *osm.Relation:
			//relationChan <- e
			//fmt.Println("relation")
		}
	}
	scanErr := scanner.Err()
	if scanErr != nil {
		panic(scanErr)
	}
}
This returns...
{[0 0] [0 0]}
{[0 0] [0 0]}
{[0 0] [0 0]}
{[0 0] [0 0]}
{[0 0] [0 0]}
{[0 0] [0 0]}
{[0 0] [0 0]}
{[0 0] [0 0]}
{[0 0] [0 0]}
{[0 0] [0 0]}
Thanks for a great project.
There is a security issue detected by Dependabot gogo/protobuf#752.
All that is needed is to upgrade the gogo version to the latest one (v1.3.2 I believe) where this is fixed.
As a side note, the gogo project is not really maintained anymore. Has anyone tested the performance of using the standard proto package?
github.com/datadog/czlib should become github.com/DataDog/czlib.
Also the orb dependency could be updated
This package has a dependency (either directly, or somewhere upstream; I didn't investigate in detail) that leads to this error on a clean Go install on an M1 Mac:
# pkg-config --cflags -- zlib
The recommended solution seems to be to use Homebrew to install the pkg-config utility:
brew install pkg-config
I'm leaving this issue in case it helps future Mac users.
Perhaps add an entry to the README:
##INSTALLATION - Apple Silicon Notes
This package requires the "pkg-config" utility, which must be installed on Mac using Homebrew. Install Homebrew, then install the "pkg-config" package using this command:
brew install pkg-config
OSM nodes can contain tags which specify only a key but no value. For example "building=" is common. Tags.Find() should return [nil] if no key is found, since returning "" could mean that either the tag is missing or that it is present but empty. It is common to want to catch even empty tags.
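The distinction being asked for maps naturally onto Go's "comma ok" idiom. Here is a minimal sketch, assuming simplified stand-in types: Tag, Tags and the Lookup method are hypothetical names for illustration and are not the osm package's API.

```go
package main

import "fmt"

// Tag and Tags are simplified stand-ins for the library's tag types.
type Tag struct{ Key, Value string }

type Tags []Tag

// Lookup returns the value for key and whether the key is present at all,
// so an empty value can be told apart from a missing tag.
func (ts Tags) Lookup(key string) (string, bool) {
	for _, t := range ts {
		if t.Key == key {
			return t.Value, true
		}
	}
	return "", false
}

func main() {
	tags := Tags{{Key: "building", Value: ""}, {Key: "name", Value: "Shed"}}
	_, hasBuilding := tags.Lookup("building")
	_, hasHeight := tags.Lookup("height")
	fmt.Println(hasBuilding, hasHeight)
}
```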
When retrieving data for ways using the osmpbf package, the Way.Nodes array only contains node IDs, with all other fields such as Version, ChangesetID, Lat, and Lon being zero. This occurs despite expecting these fields to be populated with respective node data.
package main

import (
	"context"
	"fmt"
	"os"
	"time"

	"github.com/paulmach/osm"
	"github.com/paulmach/osm/osmpbf"
)

func main() {
	oshFile, err := os.Open("path/to/w128897900.osh.pbf")
	if err != nil {
		panic(err)
	}
	defer oshFile.Close()

	pbfScanner := osmpbf.New(context.Background(), oshFile, 16)
	defer pbfScanner.Close()

	timeStamp, err := time.Parse(time.RFC3339, "2011-12-20T10:59:34Z")
	if err != nil {
		panic(err)
	}

	for pbfScanner.Scan() {
		// Only ways are of interest; the checked assertion avoids a panic
		// if a node or relation happens to share the same ID.
		way, ok := pbfScanner.Object().(*osm.Way)
		if !ok || way.ObjectID().Ref() != 128897900 {
			continue
		}
		if way.Timestamp.Before(timeStamp) {
			continue
		}
		for _, node := range way.Nodes {
			fmt.Println(node.ID, node.Version, node.ChangesetID, node.Lat, node.Lon)
		}
		return
	}
}
Each node retrieved in the Way.Nodes slice should have complete information, including ID, Version, ChangesetID, Latitude, and Longitude.
The output only includes node IDs, with other fields showing zero values:
1423178779 0 0 0 0
1476261275 0 0 0 0
1466158512 0 0 0 0
1476261292 0 0 0 0
1423178791 0 0 0 0
1481172885 0 0 0 0
...
This enables us to reimplement this script in Go. Your library is a great candidate to use when working with OSM data.
https://github.com/openstreetmap/osm2pgsql/blob/master/scripts/osm2pgsql-replication
I want to duplicate some OSM nodes in my local environment. I'm able to find these nodes. Example:
node/42358880:6
map[string]string{
"expected_rhn_route_relations":"3",
"expected_rwn_route_relations":"3",
"network:type":"node_network",
"rhn_ref":"62",
"rwn_ref":"21"
}
I want to write a similar node with a new ID into a PBF or XML file (for a later merge). How can I achieve this with this library?
Hello, I have a problem unmarshalling one of the files that I previously entered in XML form in my program.
When I try to unmarshal an element that has a type tag, for example:
"id": 13190773,
"tags": {
"type": "route",
}
I encountered this error: unknown type of 'route'
Apparently there is a problem in recognizing types.
Thanks in advance
It'd be great to be able to write back into osm.pbf. This is mainly to be able to edit data, filter data or simply create diffs between two osm.pbf files. Thanks