
tilenol's People

Contributors

brian-dellabetta, ccma14, dependabot[bot], jerluc, madhulikajc


tilenol's Issues

[RFC] Allow for partial responses

A few scenarios have come up recently (e.g. #16) that bring up an important decision we need to make regarding the resiliency of the tilenol API, and whether or not to adopt a "best-effort" policy:

  1. Should we return any features in a layer when one fails to be retrieved/encoded?
  2. Should we return any layers in a tile when one fails to be retrieved/encoded?
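As a hypothetical sketch of the "best-effort" option, each layer could be fetched concurrently and failures collected rather than aborting the whole tile. The function and layer names below are illustrative, not tilenol's actual API:

```go
package main

import (
	"fmt"
	"sync"
)

// fetchLayer is a stand-in for real layer retrieval; it fails for "bad".
func fetchLayer(name string) (string, error) {
	if name == "bad" {
		return "", fmt.Errorf("layer %q failed", name)
	}
	return name + "-data", nil
}

// bestEffort fetches all layers concurrently, returning whatever succeeded
// plus the per-layer errors, instead of failing the whole tile.
func bestEffort(layers []string) (map[string]string, []error) {
	var mu sync.Mutex
	var wg sync.WaitGroup
	results := make(map[string]string)
	var errs []error
	for _, l := range layers {
		wg.Add(1)
		go func(l string) {
			defer wg.Done()
			data, err := fetchLayer(l)
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				errs = append(errs, err)
				return
			}
			results[l] = data
		}(l)
	}
	wg.Wait()
	return results, errs
}

func main() {
	ok, errs := bestEffort([]string{"roads", "bad", "buildings"})
	fmt.Println(len(ok), len(errs))
}
```

Under this policy, the response could still encode the two successful layers and report the failed one separately (e.g. via a header or log), rather than returning an error for the whole tile.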

Tilenol crashes with a nil pointer dereference

Describe the bug
Tilenol crashes with a nil pointer dereference; however, this bug is not reliably reproducible.
The main problem with this bug is the uncertainty of how and when it will occur, and that it causes tilenol to stop running entirely.

To Reproduce
Command run immediately prior to the error (in tilenol/examples/postgis):

../../target/tilenol run -x -d

When rendering the locally hosted tilenol via a local HTML file, tilenol works as expected, but at some point crashes with a nil pointer dereference.

Hypothesized causes:

  • Idle session timeout
  • Refreshing page
  • Panning across tiles too rapidly
  • Underlying TxDatabase library being the source of the error

Stack trace:

...
INFO[0018] [...] "GET http://localhost:3000/list-locations/11/620/758.mvt?q=list_id%20%3D%20%2701EG4NCCHP5PJF4NHE2P4MV7V8%27 HTTP/1.1" from [::1]:58250 - 200 47B in 10.037610166s
panic: runtime error: invalid memory address or nil pointer dereference
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1569646]

goroutine 947 [running]:
github.com/doug-martin/goqu/v9.(*TxDatabase).Trace(0x0, 0x18f5626, 0x8, 0x0, 0x0, 0x0, 0x0, 0x0)
	/.../go/pkg/mod/github.com/doug-martin/goqu/[email protected]/database.go:493 +0x26
github.com/doug-martin/goqu/v9.(*TxDatabase).Rollback(0x0, 0x9, 0x0)
	/.../go/pkg/mod/github.com/doug-martin/goqu/[email protected]/database.go:605 +0x56
panic(0x1806040, 0x1e601d0)
	/usr/local/go/src/runtime/panic.go:969 +0x1b9
github.com/doug-martin/goqu/v9.(*TxDatabase).Trace(0x0, 0x18f2154, 0x5, 0xc000554000, 0x6c8, 0x0, 0x0, 0x0)
	/.../go/pkg/mod/github.com/doug-martin/goqu/[email protected]/database.go:493 +0x26
github.com/doug-martin/goqu/v9.(*TxDatabase).QueryContext(0x0, 0x19d53a0, 0xc0000b4008, 0xc000554000, 0x6c8, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/.../go/pkg/mod/github.com/doug-martin/goqu/[email protected]/database.go:535 +0x85
github.com/doug-martin/goqu/v9.(*TxDatabase).Query(...)
	/.../go/pkg/mod/github.com/doug-martin/goqu/[email protected]/database.go:530
github.com/stationa/tilenol.(*PostGISSource).runQuery(0xc0001cfa70, 0x19d5360, 0xc00020e600, 0xc000554000, 0x6c8, 0x0, 0x0, 0x0, 0x0, 0x0)
	/.../dev/src/github.com/stationa/tilenol/postgis_source.go:206 +0x21a
github.com/stationa/tilenol.(*PostGISSource).GetFeatures(0xc0001cfa70, 0x19d5360, 0xc00020e600, 0xc00060c060, 0xc00027ff28, 0x4, 0x4)
	/.../dev/src/github.com/stationa/tilenol/postgis_source.go:250 +0x325
github.com/stationa/tilenol.(*Server).getVectorTile.func1(0x0, 0x0)
	/.../dev/src/github.com/stationa/tilenol/server.go:270 +0x223
golang.org/x/sync/errgroup.(*Group).Go.func1(0xc0004f4240, 0xc00041c0a0)
	/.../go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:57 +0x59
created by golang.org/x/sync/errgroup.(*Group).Go
	/.../go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:54 +0x66
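The receiver address 0x0 in the (*TxDatabase).Trace frames above suggests the transaction handle itself is nil by the time the query (and the subsequent rollback) runs. One hedged, illustrative mitigation, not tilenol's actual code, is to guard the handle before use so the crash becomes an ordinary, reportable error:

```go
package main

import (
	"errors"
	"fmt"
)

// tx is a stand-in for a *goqu.TxDatabase-like handle that may end up nil
// if transaction setup failed earlier without being noticed.
type tx struct{}

func (t *tx) Query(sql string) (string, error) {
	return "rows for " + sql, nil
}

// runQuery guards against a nil transaction instead of dereferencing it,
// turning the segfault into an error the server can log and return.
func runQuery(t *tx, sql string) (string, error) {
	if t == nil {
		return "", errors.New("runQuery: nil transaction handle")
	}
	return t.Query(sql)
}

func main() {
	_, err := runQuery(nil, "SELECT 1")
	fmt.Println(err != nil)
}
```

This would not fix the root cause (why the handle is nil in the first place), but it would keep the server running while that is investigated.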

Note: When running the following script to re-run all the requests from the stack trace above (sequentially), there were no errors.

cat tilenol_logs.txt | grep OPTIONS | cut -d" " -f 4 | xargs curl --compressed -v

Expected behavior
Tilenol runs without crashing.

Desktop (please complete the following information):

  • OS: macOS Big Sur
  • Browser: Chrome
  • Version: 11.1

Upgrade Elasticsearch client libraries

Currently, we rely on olivere/elastic for interacting with backend Elasticsearch clusters, but this library has been deprecated by its maintainers for about a year now; instead, they recommend we use the official Go client.

As a further benefit to this upgrade, we may get better compatibility for supporting multiple
Elasticsearch versions (possibly related to #51).

Tilenol should verify arguments and return a BadRequest error rather than an empty response

Description

Requesting a map tile from tilenol with invalid arguments currently results in a response with HTTP status code 200 (OK) but an empty body. Instead, an HTTP 400 (Bad Request) error should be returned in such a case.

Example:

Requesting a URL such as

http://tilenol.foo.com/_all/1000/1000/1000.mvt

causes such an empty response.

tilenol has a max zoom level specified in its configuration. Also, the number of tiles depends on the zoom level specified, as outlined here.
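A minimal validation sketch, assuming an illustrative server-wide max zoom: at zoom z there are only 2^z tiles per axis, so any coordinate outside that range can be rejected up front with a 400.

```go
package main

import (
	"fmt"
	"net/http"
)

const maxZoom = 22 // illustrative server-wide limit, not tilenol's actual value

// validateTile returns a non-nil error (to be mapped to HTTP 400) when the
// requested tile coordinates are impossible: at zoom z there are only
// 2^z tiles per axis.
func validateTile(z, x, y int) error {
	if z < 0 || z > maxZoom {
		return fmt.Errorf("zoom %d out of range [0, %d]", z, maxZoom)
	}
	n := 1 << uint(z)
	if x < 0 || x >= n || y < 0 || y >= n {
		return fmt.Errorf("tile (%d, %d) out of range for zoom %d", x, y, z)
	}
	return nil
}

func main() {
	if err := validateTile(1000, 1000, 1000); err != nil {
		fmt.Println(http.StatusBadRequest, err) // 400 instead of an empty 200
	}
}
```

The /_all/1000/1000/1000.mvt example above would fail this check on the zoom bound alone.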

Add Docker image usage documentation

Currently, the README only spells out how to install/use tilenol from source, but ideally most users would only be using tagged binaries. IMO we should update our README to:

  • Add a new "Docker" section to "Usage"
  • Add descriptions of Docker image tag conventions (e.g. devel vs. latest)
  • Add actual usage examples for the Docker image
  • Add Docker Hub link(s): https://hub.docker.com/repository/docker/stationa/tilenol
    • Update Docker Hub's "Readme" section with image usage

Moved the below to #32:

Optionally, we could also go a bit further by:

  • Connecting Docker Hub and Github to sync Docker Hub's "Readme" section
  • Using Github's new "package" feature to publish Docker images
  • Implementing #32 to auto-publish new images and sync to our repo somehow
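As a starting point for the usage examples, something like the following could go in the new "Docker" section. The mounted config path and port are illustrative assumptions (the port matches the examples elsewhere in this repo), not documented defaults:

```shell
# Pull the latest stable image
docker pull stationa/tilenol:latest

# Run tilenol with a local config file mounted into the container
docker run --rm \
  -v "$(pwd)/tilenol.yml:/etc/tilenol/tilenol.yml" \
  -p 3000:3000 \
  stationa/tilenol:latest run

# Or track the bleeding edge
docker pull stationa/tilenol:devel
```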

Layer configuration should have some semantic checks for min and max zooms

Describe the bug
When layers are configured in Tilenol, we should do some simple semantic checks to at least ensure that the layers are actually accessible (likely around here somewhere), e.g. checking that the min and max zoom numbers fall within the absolute min and max zooms acceptable by the server itself.

To Reproduce
Steps to reproduce the behavior:

  1. Configure Tilenol with a layer whose max zoom is insanely large, e.g. 1000
  2. Run the server
  3. Try to request at z = 1000
  4. See that you can't access it
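A sketch of what that semantic check might look like at configuration-load time, with illustrative (not actual) server-wide bounds:

```go
package main

import "fmt"

// Server-wide absolute zoom bounds; illustrative values, not tilenol's.
const (
	absMinZoom = 0
	absMaxZoom = 22
)

// checkLayerZooms sketches the semantic check: a layer whose zoom range
// falls outside the server's absolute bounds (or is inverted) can never
// be served, so configuration loading should fail fast with a clear error.
func checkLayerZooms(name string, minZoom, maxZoom int) error {
	if minZoom > maxZoom {
		return fmt.Errorf("layer %q: minzoom %d > maxzoom %d", name, minZoom, maxZoom)
	}
	if minZoom < absMinZoom || maxZoom > absMaxZoom {
		return fmt.Errorf("layer %q: zoom range [%d, %d] outside server bounds [%d, %d]",
			name, minZoom, maxZoom, absMinZoom, absMaxZoom)
	}
	return nil
}

func main() {
	// The repro above: a layer with max zoom 1000 would be rejected at boot.
	fmt.Println(checkLayerZooms("buildings", 0, 1000))
}
```

Failing at boot gives operators an immediate, actionable error instead of a layer that silently can never be requested.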

Improve test coverage

Unit test coverage could be greatly improved, and may help restructure the code to be slightly less monolithic.

Example not working for Elasticsearch version 7.4

The only difference I could find is in the mapping config.

Version 7.1:

{
	"buildings": {
		"aliases": {},
		"mappings": {
			"properties": {
				"geometry": {
					"type": "geo_shape"
				},
				"height": {
					"type": "long"
				},
				"name": {
					"type": "text"
				},
				"stories": {
					"type": "integer"
				}
			}
		},
		"settings": {
			"index": {
				"routing": {
					"allocation": {
						"include": {
							"_tier_preference": "data_content"
						}
					}
				},
				"number_of_shards": "1",
				"provided_name": "buildings",
				"creation_date": "1629479814250",
				"number_of_replicas": "1",
				"uuid": "Nq2jkkSFQbOfzRWZtfHu-Q",
				"version": {
					"created": "7140099"
				}
			}
		}
	}
}

Version 6.5.1 has an extra _doc mapping type, which is deprecated in the new 7.x:

{
	"buildings": {
		"aliases": {},
		"mappings": {
			"_doc": {
				"properties": {
					"geometry": {
						"type": "geo_shape"
					},
					"height": {
						"type": "long"
					},
					"name": {
						"type": "text"
					},
					"stories": {
						"type": "integer"
					}
				}
			}
		},
		"settings": {
			"index": {
				"creation_date": "1629478310030",
				"number_of_shards": "5",
				"number_of_replicas": "1",
				"uuid": "ycsiK_gcRDGt8zZfEjw2mg",
				"version": {
					"created": "6050199"
				},
				"provided_name": "buildings"
			}
		}
	}
}

Is there any workaround for this?

Add support for additional ES layer filtering

We should consider supporting the ability to configure an arbitrary (but fixed?) Elasticsearch query to be appended to the backend tile bounds query, e.g. sub-filtering a tile on a given property.

Add support for ES aggregate layers

Elasticsearch has a ton of useful features for producing aggregations from search queries.

One idea was that we could allow tilenol to take advantage of this functionality by adding a way to configure aggregate queries in ES source configurations.

[RFC] Consider upgrade to Go 1.18 generics

Since Go 1.18, we finally get generics, among other improvements. This is being written as an RFC since, although this change would vastly clean up some of our code (I'm looking at you, ES/PostGIS source and Redis cache), it would also make the source code incompatible with prior versions of Go.

[RFC] Consider caching data per-layer rather than per-request

Currently, we cache data using the request path as a key and store the full response body as the value. This has the nice side effect of being very simple to implement and maintain, but comes with its drawbacks:

  1. When a request to /_all/{z}/{x}/{y}.mvt is made, another call to /layer1/{z}/{x}/{y}.mvt will miss the cache, because the cache key is based purely on the request path
  2. When a request to /_all/{z}/{x}/{y}.mvt is made, and a partial failure occurs, not only does the entire request fail, but none of the successful layer responses are cached, meaning a subsequent call would have to recompute the entire response, rather than only the failed responses

To fix these problems, we should consider using something like {layer}/{z}/{x}/{y} as a cache key, and caching individual feature collections per layer response. Then, in the above two scenarios:

  1. When a request to /_all/{z}/{x}/{y}.mvt is made, all layer responses get cached, and another call to /layer1/{z}/{x}/{y}.mvt will hit the cache, because the cache key is based on the layer name
  2. When a request to /_all/{z}/{x}/{y}.mvt is made, and a partial failure occurs, the successful layer responses are cached, meaning a subsequent call would only have to recompute the failed layers
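The proposed key scheme is simple enough to sketch directly; the point is that both /_all and single-layer requests for the same tile resolve to the same per-layer entries:

```go
package main

import "fmt"

// layerCacheKey builds the proposed per-layer cache key. Unlike a
// request-path key, /_all/... and /layer1/... requests for the same
// tile share the same per-layer entries.
func layerCacheKey(layer string, z, x, y int) string {
	return fmt.Sprintf("%s/%d/%d/%d", layer, z, x, y)
}

func main() {
	// An /_all request would populate one entry per configured layer...
	for _, layer := range []string{"layer1", "layer2"} {
		fmt.Println(layerCacheKey(layer, 11, 620, 758))
	}
	// ...and a later /layer1 request for the same tile hits "layer1/11/620/758".
}
```

One design consideration: the cached value would need to be the layer's feature collection (pre-encoding) rather than final MVT bytes, so that /_all responses can be assembled from multiple cached layers.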

Investigate feasibility of integrating with Sentry

Description

In order to facilitate monitoring of a running tilenol integration, we should investigate the feasibility of integrating with Sentry. This should be optional, such that if tilenol is started with a Sentry DSN specified as an optional parameter, then error reporting to Sentry is enabled.

Add support for PostGIS as a layer source

PostGIS is a pretty common data storage option working with geo data, so it would make sense to support this backend.

At a high level, these are the pieces I'm hoping to implement:

  • PostGISConfig
    • Database connection parameters (i.e. DSN)
    • Database relation spec (i.e. schema + table name)
    • Default source fields or column expression mappings
    • Geometry field or column expression mapping
    • Accept an optional custom query in place of the database relation spec (e.g. to support "view" functionality)
  • PostGISSource
    • Translate tile request into a SQL query
    • Construct a feature collection from the resulting SQL rows
    • Accept an s query parameter to specify additional source fields / column expressions
    • Accept a q query parameter to specify additional source filtering / WHERE expressions
      • Run a runtime check on boot that determines whether the database user is read-only
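A hypothetical sketch of the "translate tile request into a SQL query" piece: filter rows whose geometry intersects the tile's envelope (here in Web Mercator, SRID 3857). The table and column names are placeholders, and a real implementation would use the SQL builder with parameterized values rather than string formatting:

```go
package main

import "fmt"

// tileQuery sketches translating a tile's bounding box into a PostGIS
// query: select rows whose geometry intersects the tile envelope.
// Table and column names are illustrative placeholders.
func tileQuery(table, geomCol string, xmin, ymin, xmax, ymax float64) string {
	return fmt.Sprintf(
		"SELECT * FROM %s WHERE ST_Intersects(%s, ST_MakeEnvelope(%f, %f, %f, %f, 3857))",
		table, geomCol, xmin, ymin, xmax, ymax)
}

func main() {
	fmt.Println(tileQuery("public.buildings", "geom", -1e6, 4e6, -0.9e6, 4.1e6))
}
```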

Auto-publish new build artifacts

Back in #8 we had decided to defer the automatic publishing of tilenol builds/Docker images to both Github and Dockerhub. Since this is the last remaining manual process in tilenol development, and given the new support available today for automating these tasks (e.g. Travis CI, Github Actions, etc.), I think we should just make this happen to simplify things.

A couple of specifics to work through:

  • Changes to the main branch should be auto-published (for Docker images, this should correspond simply to the devel tag)
  • Tagged builds should be auto-published (for Docker images, the Git tag should correspond to the image tag, e.g. v1.1.0)
  • Forks/PR builds should not be published (even though they already get built and run through tests in Travis CI)

We should also consider:

  • Connecting Docker Hub and Github to sync Docker Hub's "Readme" section
  • Using Github's new "package" feature to publish Docker images

Release v1.0.0

  • Create Git tag
  • Build and publish StationA/tilenol:v1.0.0
  • Build and publish StationA/tilenol:latest
  • Build and publish StationA/tilenol:devel

Support GeoJSON representation

We should consider adding support for non-MVT representations of tile data, e.g. GeoJSON. This might help to improve compatibility with various mapping clients, at the cost of performance on larger datasets.

[Deps] Consider migrating to Golang modules

Context

The Golang dependency story continues to grow more "official" solutions, which means it's time to re-evaluate whether or not dep is still the best available option.

go mod

In the most recent stable Go version (v1.13+), Go modules have finally hit stable and are now the default way Go resolves dependencies (replacing $GOPATH-based resolution). This also provides a basic mechanism for declaring code dependencies and their versions (accessible via the go mod subcommand, or by editing the go.mod file), without the need to vendorize 3rd-party code (as dep and glide both do). This should help in a couple of important ways:

  • No more dependency upgrade changesets that obscure actual 1st-party source changes; with Go modules, a dependency upgrade should only manifest itself as a change to the go.mod file
  • Improved support for alternative source code repository structures (though being honest, I'm quite happy with the "legacy" style file system layout constraints)
  • Improved support for module forks and import paths for forks of dependencies; this has been a long-running problem with vendorizing dependency management systems, as these typically encode the fully-qualified repository name (e.g. github.com/[OWNER]/[REPO]) in the directory structure itself, causing issues when a new [OWNER] forks the repository

Add an option to auto-reload configuration file on change

There are some use cases in which it would make sense for tilenol to be able to auto-reload its configuration.

For example, when tilenol's configuration is mounted as a file from a ConfigMap in a Kubernetes environment, the ConfigMap may change when someone applies a new configuration via kubectl, but tilenol does not update since it only ever reads the configuration from disk on boot. Notably there are some temporary workarounds for this particular scenario, e.g. Reloader, but it might make more sense to simply add a new CLI flag, e.g. --reload, to support automatically watching for configuration file changes.

Consider simplifying shapes in parallel

Currently, after the Elasticsearch documents are deserialized into orb.Geometry values, the features all get compiled into a single layer and bulk simplified. This makes life easier; however, the layer/feature collection bulk simplification process runs serially, which may have some effects on performance.

We should consider instead using the DouglasPeuckerSimplifier.Simplify(orb.Geometry) function to allow us to simplify each geometry potentially in parallel, by using a pool of goroutines.
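A self-contained sketch of that goroutine pool, with a stub geometry type and simplifier standing in for orb.Geometry and the per-geometry DouglasPeuckerSimplifier call:

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// geometry and simplify are stubs standing in for orb.Geometry and the
// per-geometry DouglasPeuckerSimplifier.Simplify call, to keep the
// sketch self-contained.
type geometry = string

func simplify(g geometry) geometry {
	return strings.ToUpper(g) // placeholder for actual simplification
}

// simplifyAll runs the per-geometry simplifier over a bounded pool of
// goroutines instead of one serial bulk pass, preserving input order.
func simplifyAll(geoms []geometry, workers int) []geometry {
	out := make([]geometry, len(geoms))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				out[i] = simplify(geoms[i])
			}
		}()
	}
	for i := range geoms {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return out
}

func main() {
	fmt.Println(simplifyAll([]geometry{"a", "b", "c"}, 2))
}
```

Since each worker writes only to its own index of the output slice, no locking is needed around the results.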

[RFC] Support for TileJSON as a backend source

tilenol could greatly benefit from interoperating with the vast majority of tile data servers that are currently unsupported (e.g. Mapbox Tiling Service), and likely will not be supported for a while. In order to provide a simple means of bridging this gap, we could potentially implement a new backend data source which allows tilenol to operate as a sort of "frontend" to both raw backend data sources and other tile servers. For example:

layers:
  - name: trees
    source:
      tilejson:
        # The backend TileJSON URL
        url: https://my.tile.server/tilejson
        # The backend layer we want to pull out
        layer: trees
  - name: buildings
    source:
      postgis:
        ...

In this example, tilenol would act as a common "frontend" to both the trees layer (coming from a backend tile server), and the buildings layer (coming directly from PostGIS).

Another interesting thought is that if we were to implement #6 , we could also benefit from the fact that a tilenol server could act as that unified "frontend" for even other tilenol servers, and form a kind of "federated" service.

[RFC] Add support for lightweight post-processing of feature data

I'd like to propose that we provide a mechanism for allowing some amount of post-processing of feature data to support some common tasks, e.g.:

  • Shape simplification or other geometry edits (note that currently simplification is enabled as a server-wide flag with no additional options)
  • Property data edits or potentially conditional adding/removing of data
    • String formatting
    • Numerical conversions
    • Adding style information to feature properties
  • Creating new features based on tile-wide data
    • Aggregated grid cells
    • Tile-local clusters

The reason to consider this a post-processing step is so that a singular representation of this logic can apply to feature data coming from multiple backend sources, especially for the scenario of computing some tile-wide aggregations.

One thought would be to introduce support for some basic, embeddable scripting language (e.g. Lua, JS, etc.) that can be provided either statically in the tilenol configuration, or potentially in the request URL itself (e.g. script=<BLAH>). By using a simple scripting language (especially a familiar one), users would be able to quickly pick up the scripting capabilities. Furthermore, by providing the option to support request-time scripting, this could be used for very powerful, expressive, and interactive geospatial analyses over mixed datasets.

Fix Travis CI configs

The project's Travis CI config should be updated to:

  • Actually build the software
  • Run unit tests
  • Publish built releases to Github (deferred)
  • Build and push new Docker images (deferred)

PostGIS source returns a string-only error on request cancellation

When a request is cancelled for a PostGIS-backed layer, this unexpectedly results in a 500 error being returned:

ERRO[0017] Tile request failed: pq: canceling statement due to user request (HTTP error 500)

Upon further investigation, it appears that the PostgreSQL library we use returns the opaque error string pq: canceling statement due to user request when a request is cancelled (e.g. when panning around quickly on the map), which is incorrectly being handled as an unexpected server error (500 error code).

While this is currently not impacting end use in map interfaces (since when a browser cancels a request, it discards/closes its receive socket), this is clearly bypassing some code that was written specifically to handle request cancellations which in turn may impact:

  • Request teardown/cleanup code
  • Error reporting integration (e.g. #26)

Implement caching support

Add support for an optional caching layer implemented with Redis. This would significantly improve performance by adding server-side caching of map tiles.
