paulmach / osm

General purpose library for reading, writing and working with OpenStreetMap data

License: MIT License

Go 100.00%

golang osm openstreetmap osmpbf xml

osm's Issues

Expose the <bounds> element in osmxml

OSM XML files have a <bounds> as one of the first elements. This is not currently exposed by the osmxml.Scanner. I think there are two ways of doing this:

1. Object returns an interface{} instead of osm.Object, so it can return *osm.Bounds.
2. Add a new Bounds method, a bit like osmpbf.Scanner.Header. This might be nicer from an API point of view, but introduces significantly more complexity. For example, the osmxml.Scanner would need two states: one where it is scanning for the <bounds> element and one where it is doing regular scanning with Scan. This is tricky to get right because it is not well defined where in the document the <bounds> element should be. We could define it as "before any element recognised by the Scan method", but this would require some lookahead in case the element doesn't exist. (An option is to require all documents to contain a <bounds> element at a particular location.)

@paulmach have you thought about this at all?

For reference, an example OSM XML file with <bounds>:

<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6" generator="Overpass API 0.7.55.7 8b86ff77">
<note>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</note>
<meta osm_base="2019-09-12T17:16:02Z"/>

  <bounds minlat="51.4072000" minlon="-0.2907000" maxlat="51.5913000" maxlon="0.1659000"/>

  <node id="1" lat="51.4779481" lon="-0.0014863" version="20" timestamp="2018-10-31T10:20:19Z" changeset="64040630" uid="1778799" user="SomeoneElse_Revert"/>
  <node id="78112" lat="51.5269760" lon="-0.1457924" version="3" timestamp="2018-10-15T14:49:44Z" changeset="63545352" uid="486343" user="peregrination"/>
...
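
The second option can be prototyped outside the library with the standard encoding/xml token API. The following is a minimal sketch, not osmxml's implementation: the Bounds struct and the readBounds and boundsFromString helpers are hypothetical names, and the sketch assumes that <bounds>, when present, appears before the first node, way or relation.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// Bounds mirrors the attributes of the <bounds> element.
// Hypothetical stand-in type, not the library's osm.Bounds.
type Bounds struct {
	MinLat float64 `xml:"minlat,attr"`
	MinLon float64 `xml:"minlon,attr"`
	MaxLat float64 `xml:"maxlat,attr"`
	MaxLon float64 `xml:"maxlon,attr"`
}

// readBounds reads tokens until it finds <bounds> or the first
// node/way/relation. It returns nil if no <bounds> was seen first.
func readBounds(r io.Reader) (*Bounds, error) {
	dec := xml.NewDecoder(r)
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return nil, nil
		}
		if err != nil {
			return nil, err
		}
		start, ok := tok.(xml.StartElement)
		if !ok {
			continue
		}
		switch start.Name.Local {
		case "bounds":
			var b Bounds
			if err := dec.DecodeElement(&b, &start); err != nil {
				return nil, err
			}
			return &b, nil
		case "node", "way", "relation":
			// Regular data started; this document has no leading <bounds>.
			return nil, nil
		}
	}
}

// boundsFromString is a convenience wrapper for in-memory documents.
func boundsFromString(doc string) (*Bounds, error) {
	return readBounds(strings.NewReader(doc))
}

func main() {
	doc := `<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6">
  <bounds minlat="51.4072000" minlon="-0.2907000" maxlat="51.5913000" maxlon="0.1659000"/>
  <node id="1" lat="51.4779481" lon="-0.0014863"/>
</osm>`
	b, err := boundsFromString(doc)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", b)
}
```

When no <bounds> is present, this sketch has already consumed the start token of the first node by the time it gives up, which is the lookahead problem described above. A real scanner would have to buffer that token and hand it to Scan.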

Feature request: Specify which OSM entities should be read from the file

Hello and thank you for this library.

In libosmium, osmium::io::Reader can be initialized with osmium::osm_entities::bits to specify which OSM entities should be parsed and passed to the handler (docs, example). Specifying only the entities you want to handle does significantly improve read performance in my experience with libosmium.

I believe it can be done without changing the current behavior by adding new initialization function, perhaps osmpbf.NewWithEntities.

I appreciate your consideration.
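
To make the request concrete, here is a sketch of what an entity bit set modelled on osmium::osm_entities::bits could look like in Go. Everything in it (the Entities type, the constants, kindOf and filter) is hypothetical and not part of osmpbf; it only illustrates the selection logic.

```go
package main

import "fmt"

// Entities is a hypothetical bit set of OSM entity kinds.
type Entities uint8

const (
	Nodes Entities = 1 << iota
	Ways
	Relations
)

// All selects every entity kind.
const All = Nodes | Ways | Relations

// Has reports whether any of the given kinds is selected.
func (e Entities) Has(kind Entities) bool { return e&kind != 0 }

// kindOf maps an element name to its flag; unknown names map to 0.
func kindOf(name string) Entities {
	switch name {
	case "node":
		return Nodes
	case "way":
		return Ways
	case "relation":
		return Relations
	}
	return 0
}

// filter keeps only the element names whose kind is selected by mask.
func filter(mask Entities, names []string) []string {
	var kept []string
	for _, n := range names {
		if mask.Has(kindOf(n)) {
			kept = append(kept, n)
		}
	}
	return kept
}

func main() {
	stream := []string{"node", "node", "way", "relation"}
	fmt.Println(filter(Ways|Relations, stream))
}
```

Filtering after decoding, as in this sketch, only saves work in the caller. The speed-up reported for libosmium presumably comes from not decoding unwanted entities at all, which would have to happen inside the decoder.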

How to handle dense nodes?

Decoding OSM PBF files that contain 'dense nodes' fails:

panic: dense node does not include all required fields

goroutine 1 [running]:
main.main()
        /Users/[user]/repos/[repo]/cmd/importer/main.go:76 +0x468
exit status 2

The error occurs within the example code provided in the README.md of the osmpbf package.

The error does not occur with OSM PBF files that do not contain 'dense nodes', like Antarctica from Geofabrik.

Is this a bug or have I missed anything?

After some research, it seems to me that another library just ignores the missing fields and the decoding succeeds. Thus, the issue seems to be closely related to my OSM PBF file (which I unfortunately cannot share due to its size). It would be interesting to hear whether your implementation supports a less strict mode that also just ignores these fields.

Question: Why output and serializer channel

Is there a specific reason why the osmpbf part of the project has n output channels and feeds them in order to a serializer channel? (decode.go function line number

The upside is that it makes testing much easier, since we get predictable output. However, in my case I do not see a reason why my application needs a special ordering of the elements. Am I missing something important about why one should process objects in order? (In other words: why doesn't the decoder push directly to the serializer?)
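
For readers unfamiliar with the pattern being asked about, here is a generic sketch of an ordered fan-in: each worker writes to its own output channel, and a single reader drains those channels round-robin so results come out in input order. This is not the code from decode.go; orderedFanIn and its parameters are hypothetical names used for illustration.

```go
package main

import "fmt"

// orderedFanIn applies f to every input using the given number of workers
// and returns the results in input order.
func orderedFanIn(inputs []int, workers int, f func(int) int) []int {
	if workers < 1 {
		workers = 1
	}
	// One output channel per worker.
	outs := make([]chan int, workers)
	for i := range outs {
		outs[i] = make(chan int, 1)
	}
	// Worker w handles items w, w+workers, w+2*workers, ...
	for w := 0; w < workers; w++ {
		go func(w int) {
			for i := w; i < len(inputs); i += workers {
				outs[w] <- f(inputs[i])
			}
			close(outs[w])
		}(w)
	}
	// The serializer reads item i from the channel of the worker that
	// owns it, which restores the original order.
	result := make([]int, 0, len(inputs))
	for i := range inputs {
		result = append(result, <-outs[i%workers])
	}
	return result
}

func main() {
	square := func(x int) int { return x * x }
	fmt.Println(orderedFanIn([]int{1, 2, 3, 4, 5}, 3, square)) // prints [1 4 9 16 25]
}
```

The cost of this design is that a slow item holds back results that are already finished on other workers; a single shared output channel would avoid that at the price of nondeterministic order.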

Question: How to get orb.Geometry of a relation from .pbf file?

This is just a question and not an issue.

I understand how OSM data is structured. I use the scanner to extract relations, ways and nodes:

func main() {
	r, err := os.Open("/tmp/myarea.osm.pbf")
	if err != nil {
		panic(err)
	}
	background := context.Background()
	scanner := osmpbf.New(background, r, 0)
	for scanner.Scan() {
		o := scanner.Object()
		switch o.(type) {
		case *osm.Relation:
			// relation
		case *osm.Way:
			// way
		case *osm.Node:
			// node
		}
	}
}

I thought using maps to store IDs and so on. But I had the feeling that someone already wrote a solution for this.
Are there any best practices, paths or library APIs to get an orb.Geometry-struct of a relation read from a .pbf file?

osmgeojson support overpass geometry output conversion

The osmgeojson package currently doesn't support converting the output of Overpass queries that use "out geom" to GeoJSON. An example:

If you use Overpass to request the geometry of Manchester as follows:

area["ISO3166-1:alpha2"~"^gb$",i]["admin_level"=2]->.a;
(
  relation(area.a)[name~"^.*manchester.*$",i][admin_level=8];
);
out geom;

The result is an OSM XML document with one relation, Manchester, containing its members (interesting nodes and all ways that are needed to create the geometry of Manchester).

Currently the library needs all ways and nodes that are members of a relation to be defined separately in the OSM XML.

If you find the time!

Unmarshaling of Overpass JSON fails on version field

The Overpass API encodes the version information as a JSON number instead of a string. This causes unmarshaling to fail with the error json: cannot unmarshal number into Go struct field .version of type string

Take, for example, the minimal JSON output of overpass (http://overpass-api.de/api/interpreter?data=[out:json];out;), e.g.

{
  "version": 0.6,
  "generator": "Overpass API 0.7.59 e21c39fe",
  "osm3s": {
    "timestamp_osm_base": "2022-11-17T17:14:39Z",
    "copyright": "The data included in this document is from www.openstreetmap.org. The data is made available under ODbL."
  },
  "elements": []
}

Minimum viable example failure: https://go.dev/play/p/n7kBY3CCbHC

package main

import (
	"encoding/json"
	"fmt"

	"github.com/paulmach/osm"
)

func main() {
	var overpass osm.OSM
	buf := []byte(`{
  "version": 0.6,
  "generator": "Overpass API 0.7.59 e21c39fe",
  "osm3s": {
    "timestamp_osm_base": "2022-11-17T17:14:39Z",
    "copyright": "The data included in this document is from www.openstreetmap.org. The data is made available under ODbL."
  },
  "elements": []
}`)
	err := json.Unmarshal(buf, &overpass)
	if err != nil {
		fmt.Println(err)
	}
}
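
One way to tolerate both encodings is a field type with a custom UnmarshalJSON that accepts either a JSON string or a JSON number. The sketch below uses only encoding/json; flexString, overpassDoc and parseVersion are hypothetical names, and this is not how the library itself resolves the issue.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// flexString accepts a JSON string or a JSON number and keeps its text form.
type flexString string

// UnmarshalJSON tries a string first and falls back to a number literal.
func (f *flexString) UnmarshalJSON(data []byte) error {
	var s string
	if err := json.Unmarshal(data, &s); err == nil {
		*f = flexString(s)
		return nil
	}
	var n json.Number
	if err := json.Unmarshal(data, &n); err != nil {
		return err
	}
	*f = flexString(n.String())
	return nil
}

// overpassDoc holds only the fields needed for this illustration.
type overpassDoc struct {
	Version   flexString `json:"version"`
	Generator string     `json:"generator"`
}

// parseVersion returns the version of a document as text.
func parseVersion(doc string) (string, error) {
	var d overpassDoc
	if err := json.Unmarshal([]byte(doc), &d); err != nil {
		return "", err
	}
	return string(d.Version), nil
}

func main() {
	for _, doc := range []string{
		`{"version": 0.6, "generator": "Overpass API 0.7.59 e21c39fe", "elements": []}`,
		`{"version": "0.6", "elements": []}`,
	} {
		v, err := parseVersion(doc)
		fmt.Println(v, err)
	}
}
```

Going through json.Number keeps the literal exactly as written, so 0.6 stays "0.6" instead of passing through a float64.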

How to create the OSM standard (XML) format?

The OSM standard (XML) format is described here: https://wiki.openstreetmap.org/wiki/OSM_XML

Example (OSM standard):

<way id="750000000000">
 <nd ref="750000000000"/>
 <nd ref="750000000001"/>
 <nd ref="750000000002"/>
 <nd ref="750000000003"/>
 <nd ref="750000000000"/>
 <tag k="ele" v="20"/>
 <tag k="contour" v="elevation"/>
 <tag k="contour_ext" v="elevation_minor"/>
 <tag k="ed20" v="20"/>
 <tag k="ed10" v="10"/>
</way>

The Golang standard XML encoder (xml.MarshalIndent(osmWay, " ", " ")) produces this:

<way id="750000000000" user="" uid="0" visible="true" version="0" changeset="0" timestamp="1970-01-01T00:00:00Z">
 <nd ref="750000000000"></nd>
 <nd ref="750000000001"></nd>
 <nd ref="750000000002"></nd>
 <nd ref="750000000003"></nd>
 <nd ref="750000000000"></nd>
 <tag k="ele" v="20"></tag>
 <tag k="contour" v="elevation"></tag>
 <tag k="contour_ext" v="elevation_minor"></tag>
 <tag k="ed20" v="20"></tag>
 <tag k="ed10" v="10"></tag>
</way>

Some tools (e.g. osmconvert) are not working with the Golang generated XML format.

Question: How can the 'problem' with the additional closing XML tags be solved?
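
Go's encoding/xml always writes an explicit end tag, so one workaround is to post-process the encoder output. The sketch below is a stopgap under stated assumptions: it uses local stand-in structs (nd, tag, way) instead of the library's types, and it assumes that nd, tag and member elements never have children, so the sequence "></nd>" can only mean an empty element. It does not address the extra attributes (user, uid and so on) shown above.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// Local stand-in types for illustration only.
type nd struct {
	Ref int64 `xml:"ref,attr"`
}

type tag struct {
	Key   string `xml:"k,attr"`
	Value string `xml:"v,attr"`
}

type way struct {
	XMLName xml.Name `xml:"way"`
	ID      int64    `xml:"id,attr"`
	Nodes   []nd     `xml:"nd"`
	Tags    []tag    `xml:"tag"`
}

// selfClose rewrites empty child elements to the self-closing form.
// The encoder escapes '<' and '>' inside attribute values, so these
// sequences cannot occur inside an attribute.
var selfClose = strings.NewReplacer(
	"></nd>", "/>",
	"></tag>", "/>",
	"></member>", "/>",
)

// marshalCompact encodes w and collapses its empty child elements.
func marshalCompact(w way) (string, error) {
	data, err := xml.MarshalIndent(w, "", " ")
	if err != nil {
		return "", err
	}
	return selfClose.Replace(string(data)), nil
}

func main() {
	w := way{
		ID:    750000000000,
		Nodes: []nd{{Ref: 750000000000}, {Ref: 750000000001}},
		Tags:  []tag{{Key: "ele", Value: "20"}},
	}
	out, err := marshalCompact(w)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Both forms are equivalent XML, so a conforming parser should accept either; the rewrite only helps with tools that are stricter than the XML specification requires.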

Module not visible on pkg.go.dev

https://pkg.go.dev/github.com/paulmach/osm does not show version v0.2

I presume this is because v0.2 does not follow semantic versioning (the tag string needs to be v0.2.0, afaik).

Huge Memory Usage

Hi there. First off, thanks for this great library.

When trying to scan a 700 MB *.osm.pbf file, the memory usage spikes to over 145 GB with just a single goroutine. I know this because I am running this task in an AWS CodeBuild project with the maximum memory allowed (145 GB) and it still gets killed.

This is surprising considering that the Osmosis CLI tool can accomplish the same task (scanning the 700 MB pbf file and filtering by a bounding box) without exceeding 7 GB of memory.

Why does this tool take over 20 times more memory than Osmosis, and is there anything we can do to improve this? I'm willing to help if it's a big task.

Thanks.

Can't extract Lat and Lon from WayNodes

I am trying to get the polygon coordinates from OSM ways and relations. Both seem to have the same behavior. I have posted my code below. Is there anything I am doing wrong, or is there a reason it does not output the Lat and Lon?

Thanks

func main() {
	f, err := os.Open("latest.osm.pbf")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	nodeChan := make(chan *osm.Node)
	go nodes(nodeChan)

	wayChan := make(chan *osm.Way)
	go ways(wayChan)

	relationChan := make(chan *osm.Relation)
	go relations(relationChan)

	scanner := osmpbf.New(context.Background(), f, 3)
	defer scanner.Close()

	for scanner.Scan() {
		switch e := scanner.Object().(type) {
		case *osm.Node:
			//fmt.Println(e.Point())
			//nodeChan <- e
			//fmt.Println(e.TagMap())
		case *osm.Way:
			//wayChan <- e
			fmt.Println(e.Nodes.Bound())
			//fmt.Println("way")
		case *osm.Relation:
			//relationChan <- e
			//fmt.Println("relation")
		}
	}

	scanErr := scanner.Err()
	if scanErr != nil {
		panic(scanErr)
	}
}

This returns...

{[0 0] [0 0]}
{[0 0] [0 0]}
{[0 0] [0 0]}
{[0 0] [0 0]}
{[0 0] [0 0]}
{[0 0] [0 0]}
{[0 0] [0 0]}
{[0 0] [0 0]}
{[0 0] [0 0]}
{[0 0] [0 0]}
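
In the OSM data model a way stores only references to its nodes, so coordinates have to be joined in from the node objects. The following is a minimal sketch of that join using local stand-in types (LatLon, WayNode, Way) rather than the library's osm.Node and osm.Way. It assumes the nodes are read before the ways that use them, which is the usual ordering in extracts but not guaranteed.

```go
package main

import "fmt"

// LatLon is a node position.
type LatLon struct{ Lat, Lon float64 }

// WayNode and Way are local stand-ins for illustration only.
type WayNode struct {
	ID       int64
	Lat, Lon float64
}

type Way struct {
	ID    int64
	Nodes []WayNode
}

// resolve fills in coordinates for every way node found in index and
// returns how many node IDs could not be found.
func resolve(w *Way, index map[int64]LatLon) int {
	missing := 0
	for i := range w.Nodes {
		p, ok := index[w.Nodes[i].ID]
		if !ok {
			missing++
			continue
		}
		w.Nodes[i].Lat, w.Nodes[i].Lon = p.Lat, p.Lon
	}
	return missing
}

func main() {
	// While scanning, every node would be recorded in the index...
	index := map[int64]LatLon{
		1: {Lat: 51.4779481, Lon: -0.0014863},
		2: {Lat: 51.5269760, Lon: -0.1457924},
	}
	// ...and every way resolved against it when it is encountered.
	w := &Way{ID: 10, Nodes: []WayNode{{ID: 1}, {ID: 2}, {ID: 3}}}
	missing := resolve(w, index)
	fmt.Println(w.Nodes, "missing:", missing)
}
```

An in-memory map like this grows with the number of nodes in the file, so for large extracts a more compact index or an on-disk store may be needed.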

gogo/protobuf security issue

Thanks for a great project.

There is a security issue detected by Dependabot gogo/protobuf#752.

All that is needed is to upgrade the gogo version to the latest one (v1.3.2 I believe) where this is fixed.

As a side note, the gogo project is not really maintained anymore. Has anyone tested the performance of using the standard proto package?

Use correct capitalisation for DataDog dependency

github.com/datadog/czlib should become github.com/DataDog/czlib.

Also the orb dependency could be updated

pkg-config: exec: "pkg-config": executable file not found in $PATH. on M1 mac

This package has a dependency (either directly or somewhere upstream; I didn't investigate in detail) that leads to this error on a clean Go install on an M1 Mac:
# pkg-config --cflags -- zlib

The recommended solution seems to be to use Homebrew to install the pkg-config utility:
brew install pkg-config

I'm leaving this issue in case it helps future Mac users.
Perhaps add an entry to the README:

##INSTALLATION - Apple Silicon Notes
This package requires the "pkg-config" utility, which must be installed on Mac using Homebrew. Install Homebrew, then install the "pkg-config" package using this command:
  brew install pkg-config

Tags.Find() should return nil not ""

OSM nodes can contain tags that specify only a key but no value. For example, "building=" is common. Tags.Find() should return nil if no key is found, since returning "" could mean either that the tag is missing or that it is present but empty. It is common to want to catch even empty tags.
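
A Go string cannot be nil, so the idiomatic way to separate "missing" from "present but empty" is a second boolean return value, as with map lookups. The sketch below uses local stand-in types and a hypothetical lookup function; it is not the library's Tags API.

```go
package main

import "fmt"

// Tag and Tags are local stand-ins for illustration only.
type Tag struct {
	Key, Value string
}

type Tags []Tag

// lookup returns the value for key and whether the key is present at all.
func lookup(ts Tags, key string) (string, bool) {
	for _, t := range ts {
		if t.Key == key {
			return t.Value, true
		}
	}
	return "", false
}

func main() {
	ts := Tags{{Key: "building", Value: ""}, {Key: "name", Value: "Town Hall"}}

	for _, key := range []string{"building", "name", "height"} {
		v, ok := lookup(ts, key)
		fmt.Printf("%s: value=%q present=%v\n", key, v, ok)
	}
}
```

Returning a *string would also work, but it pushes a nil check onto every caller, whereas the two-value form matches how Go maps already behave.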

Way.Nodes only contains node IDs with no additional information

Description

When retrieving data for ways using the osmpbf package, the Way.Nodes array only contains node IDs, with all other fields such as Version, ChangesetID, Lat, and Lon being zero. This occurs despite expecting these fields to be populated with respective node data.

Steps to Reproduce

Use the following code snippet to extract way and node information from w128897900.osh.pbf.gz:

package main

import (
	"context"
	"fmt"
	"os"
	"time"

	"github.com/paulmach/osm"
	"github.com/paulmach/osm/osmpbf"
)

func main() {
	oshFile, err := os.Open("path/to/w128897900.osh.pbf")
	if err != nil {
		panic(err)
	}
	defer oshFile.Close()

	pbfScanner := osmpbf.New(context.Background(), oshFile, 16)
	defer pbfScanner.Close()

	for pbfScanner.Scan() {
		osmItem := pbfScanner.Object()

		if osmItem.ObjectID().Ref() != 128897900 {
			continue
		}

		// A node or relation can share the same numeric ID, so check the type.
		way, ok := osmItem.(*osm.Way)
		if !ok {
			continue
		}
		var timeStamp, _ = time.Parse(time.RFC3339, "2011-12-20T10:59:34Z")

		if way.Timestamp.Before(timeStamp) {
			continue
		}

		for _, node := range way.Nodes {
			fmt.Println(node.ID, node.Version, node.ChangesetID, node.Lat, node.Lon)
		}

		return
	}
}

Expected Behavior

Each node retrieved in the Way.Nodes slice should have complete information, including ID, Version, ChangesetID, Latitude, and Longitude.

Actual Behavior

The output only includes node IDs, with other fields showing zero values:

1423178779 0 0 0 0
1476261275 0 0 0 0
1466158512 0 0 0 0
1476261292 0 0 0 0
1423178791 0 0 0 0
1481172885 0 0 0 0
...

Environment

Library Version: v0.8.0
Go Version: 1.22.2
Operating System: Debian GNU/Linux 12 (bookworm) Kernel: 6.1.0-20-amd64, Architecture: x86-64

[Feature Request] Provide timestamp to replication sequence number function

This would enable us to reimplement the following script in Go. Your library is a great candidate to use when working with OSM data.

https://github.com/openstreetmap/osm2pgsql/blob/master/scripts/osm2pgsql-replication
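
As a rough illustration of what such a function involves, the sketch below estimates a sequence number from a timestamp, given one known (sequence, timestamp) pair and an assumed fixed publishing interval. The names estimateSeq and seqPath are hypothetical and not part of the library. Real servers do not publish at perfectly regular intervals, so an estimate like this is only a starting point that would need to be checked against the server's state files. The path layout in seqPath (three groups of three digits) is the one used by the public replication directories as far as I know.

```go
package main

import "fmt"

// estimateSeq guesses the sequence number that covers targetUnix, given a
// known sequence number with its timestamp and the publishing interval.
// All times are Unix seconds. The guess errs on the early side.
func estimateSeq(knownSeq, knownUnix, targetUnix, intervalSec int64) int64 {
	if targetUnix >= knownUnix {
		return knownSeq
	}
	behind := knownUnix - targetUnix
	// Round up so that the estimate never starts after the target time.
	steps := (behind + intervalSec - 1) / intervalSec
	est := knownSeq - steps
	if est < 0 {
		return 0
	}
	return est
}

// seqPath formats a sequence number as a three-level directory path.
func seqPath(seq int64) string {
	return fmt.Sprintf("%03d/%03d/%03d", seq/1000000, (seq/1000)%1000, seq%1000)
}

func main() {
	// Hypothetical values: sequence 4123456 was published at t=6000,
	// and we want the diff covering t=5400 with minutely diffs.
	seq := estimateSeq(4123456, 6000, 5400, 60)
	fmt.Println(seq, seqPath(seq)+".state.txt")
}
```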

How to write PBF or XML data?

I want to duplicate some OSM nodes in my local environment. I'm able to find these nodes. Example:

node/42358880:6
map[string]string{
  "expected_rhn_route_relations":"3",
  "expected_rwn_route_relations":"3",
  "network:type":"node_network",
  "rhn_ref":"62",
  "rwn_ref":"21"
}

I want to write a similar node with a new ID into a PBF or XML file (for later merge). How to achieve this with this library?

Problem in unmarshalling the json file

Hello, I have a problem unmarshalling one of the files that I previously loaded into my program as XML.
When I try to unmarshal an element that has a tag named type, for example:
"id": 13190773,
"tags": {
"type": "route",
}
I encountered this error: unknown type of 'route'
Apparently there is a problem in recognizing types.
Thanks in advance

FEATURE REQUEST: Write to osm.pbf

It'd be great to be able to write back into osm.pbf. This is mainly to be able to edit data, filter data or simply create diffs between two osm.pbf files. Thanks