config's Introduction

A modular, open-source search engine for our world.

Pelias is a geocoder powered completely by open data, available freely to everyone.

Local Installation · Cloud Webservice · Documentation · Community Chat

What is Pelias?
Pelias is a search engine for places worldwide, powered by open data. It turns addresses and place names into geographic coordinates, and turns geographic coordinates into places and addresses. With Pelias, you’re able to turn your users’ place searches into actionable geodata and transform your geodata into real places.

We think open data, open source, and open strategy win over proprietary solutions at any part of the stack and we want to ensure the services we offer are in line with that vision. We believe that an open geocoder improves over the long-term only if the community can incorporate truly representative local knowledge.

Pelias

A modular, open-source geocoder built on top of Elasticsearch for fast and accurate global search.

What's a geocoder do anyway?

Geocoding is the process of taking input text, such as an address or the name of a place, and returning a latitude/longitude location on the Earth's surface for that place.

... and a reverse geocoder, what's that?

Reverse geocoding is the opposite: returning a list of places near a given latitude/longitude point.

What are the most interesting features of Pelias?

  • Completely open-source and MIT licensed
  • A powerful data import architecture: Pelias supports many open-data projects out of the box but also works great with private data
  • Support for searching and displaying results in many languages
  • Fast and accurate autocomplete for user-facing geocoding
  • Support for many result types: addresses, venues, cities, countries, and more
  • Modular design, so you don't need to be an expert in everything to make changes
  • Easy installation with minimal external dependencies

What are the main goals of the Pelias project?

  • Provide accurate search results
  • Work equally well for a small city and the entire planet
  • Be highly configurable, so different use cases can be handled easily and efficiently
  • Provide a friendly, welcoming, helpful community that takes input from people all over the world

Where did Pelias come from?

Pelias was created in 2014 as an early project at Mapzen. After Mapzen's shutdown in 2018, Pelias is now part of the Linux Foundation.

How does it work?

Magic! (Just kidding) Like any geocoder, Pelias combines full text search techniques with knowledge of geography to quickly search over many millions of records, each representing some sort of location on Earth.

The Pelias architecture has three main components and several smaller pieces.

A diagram of the Pelias architecture.

Data importers

The importers filter, normalize, and ingest geographic datasets into the Pelias database. Currently there are six officially supported importers:

We are always discussing supporting additional datasets. Pelias users can also write their own importers, for example to import proprietary data into your own instance of Pelias.

Database

The underlying datastore that does most of the query heavy-lifting and powers our search results. We use Elasticsearch. Currently versions 7 and 8 are supported.

We've built a tool called pelias-schema that sets up Elasticsearch indices properly for Pelias.

Frontend services

This is where the actual geocoding process happens, and includes the components that users interact with when performing geocoding queries. The services are:

  • API: The API service defines the Pelias API, and talks to Elasticsearch or other services as needed to perform queries.
  • Placeholder: A service built specifically to capture the relationship between administrative areas (a catch-all term meaning anything like a city, state, country, etc). Elasticsearch does not handle relational data very well, so we built Placeholder specifically to manage this piece.
  • PIP: For reverse geocoding, it's important to be able to perform point-in-polygon (PIP) calculations quickly. The PIP service is very good at quickly determining which admin area polygons a given point lies in.
  • Libpostal: Pelias uses the libpostal project for parsing addresses using the power of machine learning. We use a Go service built by the Who's on First team to make this happen quickly and efficiently.
  • Interpolation: This service knows all about addresses and streets. With that knowledge, it is able to supplement the known addresses that are stored directly in Elasticsearch and return fairly accurate estimated address results for many more queries than would otherwise be possible.
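
To make the point-in-polygon idea concrete, here is a minimal sketch of the classic ray-casting test. It is an illustration only and is not taken from the PIP service; the function name and the square polygon are hypothetical.

```javascript
// Ray-casting test: cast a horizontal ray from the point and count how many
// polygon edges it crosses. An odd number of crossings means "inside".
// `ring` is an array of [lon, lat] pairs describing a simple polygon.
function pointInPolygon(point, ring) {
  const [x, y] = point;
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    // The edge counts only if it straddles the ray's latitude and the
    // crossing lies to the east of the point.
    const crosses = (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Hypothetical square "admin area", for illustration only.
const square = [[-74.02, 40.70], [-73.97, 40.70], [-73.97, 40.76], [-74.02, 40.76]];

console.log(pointInPolygon([-73.9905, 40.7436], square)); // point within the square
console.log(pointInPolygon([-73.90, 40.7436], square));   // point east of the square
```

A production service would also need to handle polygon holes, multipolygons, and spatial indexing so that only candidate polygons are tested.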

Dependencies

These are software projects that are not used directly but are used by other components of Pelias.

There are lots of these, but here are some important ones:

  • model: Provides a single library for creating documents that fit the Pelias Elasticsearch schema. This is a core component of our flexible importer architecture.
  • wof-admin-lookup: A library for performing administrative lookup using point-in-polygon math. Previously included in each of the importers but now only used by the PIP service.
  • query: This is where most of our actual Elasticsearch query generation happens.
  • config: Pelias is very configurable, and all of it is driven from a single JSON file which we call pelias.json. This package provides a library for reading, validating, and working with this configuration. It is used by almost every other Pelias component.
  • dbclient: A Node.js stream library for quickly and efficiently importing records into Elasticsearch

Helpful tools

Finally, while not part of Pelias proper, we have built several useful tools for working with and testing Pelias.

Notable examples include:

  • acceptance-tests: A Node.js command line tool for testing a full planet build of Pelias and ensuring everything works. Familiarity with this tool is very important for ensuring Pelias is working. It supports all Pelias features and has special facilities for testing autocomplete queries.
  • compare: A web-based tool for comparing different instances of Pelias (for example a production and staging environment). We have a reference instance at pelias.github.io/compare/
  • dashboard: Another web-based tool for providing statistics about the contents of a Pelias Elasticsearch index such as import speed, number of total records, and a breakdown of records of various types.

Documentation

The main documentation lives in the pelias/documentation repository.

Additionally, the README file in each of the component repositories listed above provides more detail on that piece.

Here's an example API response for a reverse geocoding query:
$ curl -s "search.mapzen.com/v1/reverse?size=1&point.lat=40.74358294846026&point.lon=-73.99047374725342&api_key={YOUR_API_KEY}" | json
{
    "geocoding": {
        "attribution": "https://search.mapzen.com/v1/attribution",
        "engine": {
            "author": "Mapzen",
            "name": "Pelias",
            "version": "1.0"
        },
        "query": {
            "boundary.circle.lat": 40.74358294846026,
            "boundary.circle.lon": -73.99047374725342,
            "boundary.circle.radius": 500,
            "point.lat": 40.74358294846026,
            "point.lon": -73.99047374725342,
            "private": false,
            "querySize": 1,
            "size": 1
        },
        "timestamp": 1460736907438,
        "version": "0.1"
    },
    "type": "FeatureCollection",
    "features": [
        {
            "geometry": {
                "coordinates": [
                    -73.99051,
                    40.74361
                ],
                "type": "Point"
            },
            "properties": {
                "borough": "Manhattan",
                "borough_gid": "whosonfirst:borough:421205771",
                "confidence": 0.9,
                "country": "United States",
                "country_a": "USA",
                "country_gid": "whosonfirst:country:85633793",
                "county": "New York County",
                "county_gid": "whosonfirst:county:102081863",
                "distance": 0.004,
                "gid": "geonames:venue:9851011",
                "id": "9851011",
                "label": "Arlington, Manhattan, NY, USA",
                "layer": "venue",
                "locality": "New York",
                "locality_gid": "whosonfirst:locality:85977539",
                "name": "Arlington",
                "neighbourhood": "Flatiron District",
                "neighbourhood_gid": "whosonfirst:neighbourhood:85869245",
                "region": "New York",
                "region_a": "NY",
                "region_gid": "whosonfirst:region:85688543",
                "source": "geonames"
            },
            "type": "Feature"
        }
    ],
    "bbox": [
        -73.99051,
        40.74361,
        -73.99051,
        40.74361
    ]
}
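
As a rough illustration of consuming a response like the one above, the sketch below pulls the label and coordinates out of each feature. The `summarize` helper is hypothetical, and the sample object is a trimmed copy of the response with only the fields used here.

```javascript
// Trimmed copy of the response shown above; only the fields used here are kept.
const response = {
  type: 'FeatureCollection',
  features: [
    {
      type: 'Feature',
      geometry: { type: 'Point', coordinates: [-73.99051, 40.74361] },
      properties: { label: 'Arlington, Manhattan, NY, USA', layer: 'venue', confidence: 0.9 }
    }
  ]
};

// Summarize each feature. GeoJSON stores coordinates as [longitude, latitude],
// so the order is swapped relative to the point.lat / point.lon query parameters.
function summarize(featureCollection) {
  return featureCollection.features.map((feature) => {
    const [lon, lat] = feature.geometry.coordinates;
    const { label, layer, confidence } = feature.properties;
    return { label, layer, confidence, lat, lon };
  });
}

console.log(summarize(response));
```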

How can I install my own instance of Pelias?

To try out Pelias quickly, use our Docker setup. It uses Docker and docker-compose to allow you to quickly set up a Pelias instance for a small area (by default Portland, Oregon) in under 30 minutes.

Do you offer a free geocoding API?

You can sign up for a trial API key at Geocode Earth. A commercial service has been operated by the core development team behind Pelias since 2014 (previously at search.mapzen.com). Discounts and free plans are available for free and open-source software projects.

What's it built with?

Pelias itself (the import pipelines and API) is written in Node.js, which makes it highly accessible for other developers and performant under heavy I/O. It aims to be modular and is distributed across a number of Node packages, each with its own repository under the Pelias GitHub organization.

For a select few components that have performance requirements that Node.js cannot meet, we prefer to write things in Go. A good example of this is the pbf2json tool that quickly converts OSM PBF files to JSON for our OSM importer.

Elasticsearch is our datastore of choice because of its unparalleled full text search functionality, scalability, and sufficiently robust geospatial support.

Contributing

We built Pelias as an open source project not just because we believe that users should be able to view and play with the source code of tools they use, but to get the community involved in the project itself.

Especially with a geocoder with global coverage, it's just not possible for a small team to do it alone. We need you.

Anything that we can do to make contributing easier, we want to know about. Feel free to reach out to us via GitHub, Gitter, email, or Twitter. We'd love to help people get started working on Pelias, especially if you're new to open source or programming in general.

We have a list of Good First Issues for new contributors.

Both this meta-repo and the API service repo are worth looking at, as they're where most issues live. We also welcome reporting issues or suggesting improvements to our documentation.

The current Pelias team can be found on GitHub as missinglink and orangejulius.

Members emeritus include:

config's Issues

Version 10 of node.js has been released

Version 10 of Node.js (code name Dubnium) has been released! 🎊

To see what happens to your code in Node.js 10, Greenkeeper has created a branch with the following changes:

  • Added the new Node.js version to your .travis.yml
  • The new Node.js version is in-range for the engines in 1 of your package.json files, so that was left alone

If you’re interested in upgrading this repo to Node.js 10, you can open a PR with these changes. Please note that this issue is just intended as a friendly reminder and the PR as a possible starting point for getting your code running on Node.js 10.

More information on this issue

Greenkeeper has checked the engines key in any package.json file, the .nvmrc file, and the .travis.yml file, if present.

  • engines was only updated if it defined a single version, not a range.
  • .nvmrc was updated to Node.js 10
  • .travis.yml was only changed if there was a root-level node_js that didn’t already include Node.js 10, such as node or lts/*. In this case, the new version was appended to the list. We didn’t touch job or matrix configurations because these tend to be quite specific and complex, and it’s difficult to infer what the intentions were.

For many simpler .travis.yml configurations, this PR should suffice as-is, but depending on what you’re doing it may require additional work or may not be applicable at all. We’re also aware that you may have good reasons to not update to Node.js 10, which is why this was sent as an issue and not a pull request. Feel free to delete it without comment, I’m a humble robot and won’t feel rejected 🤖


FAQ and help

There is a collection of frequently asked questions. If those don’t help, you can always ask the humans behind Greenkeeper.


Your Greenkeeper Bot 🌴

fail on npm install pelias-config

I am totally lost at the beginning. I am trying to follow the README to create ~/pelias.json, but I failed on the first step. These are my steps:
$ git clone https://github.com/pelias/config.git
$ cd config
$ npm install pelias-config
Then I get the following npm error:
npm ERR! code ENOSELF
npm ERR! Refusing to install package with name "pelias-config" under a package
npm ERR! also called "pelias-config". Did you name your project the same
npm ERR! as the dependency you're installing?
npm ERR!
npm ERR! For more information, see:
npm ERR! https://docs.npmjs.com/cli/install#limitations-of-npms-install-algorithm

npm ERR! A complete log of this run can be found in:
npm ERR! /home/hupo/.npm/_logs/2018-01-18T06_55_54_793Z-debug.log

Can anyone tell me what to do?

An in-range update of tap-spec is breaking the build 🚨

Version 4.1.2 of tap-spec was just published.

Branch Build failing 🚨
Dependency tap-spec
Current Version 4.1.1
Type devDependency

This version is covered by your current version range and after updating it in your project the build failed.

tap-spec is a devDependency of this project. It might not break your production code or affect downstream projects, but probably breaks your build or test tools, which may prevent deploying or publishing.

Status Details
  • continuous-integration/travis-ci/push The Travis CI build could not complete due to an error Details

Commits

The new version differs by 6 commits.

  • 0b3f873 Release 4.1.2
  • 810d5ae Merge pull request #60 from maxlutay/check-asserts-length
  • 484f694 also check for asserts.length
  • 90cb1b8 Merge pull request #51 from vassiliy/feat/mocha_pending_appearance
  • 9160ea1 Format pending specs the same way as Mocha
  • 8ce610d Create LICENSE

See the full diff

provide a lint tool

I've seen users post their pelias.json file in comments, and there are fields I've never heard of before; some settings can also be old and deprecated.

We could ship a 'lint tool' as part of this repo which would scan the user's local pelias.json and alert the user to errors, warnings, etc.

We could then advise users to run the linter periodically to ensure their config is up-to-date and valid.
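
A minimal sketch of what such a linter could look like, assuming it is driven by a list of known top-level keys and a list of deprecated setting paths. Both lists, the `lintConfig` and `hasPath` names, and the sample config are hypothetical; a real tool would derive its rules from the package's defaults and schema.

```javascript
// Report whether a dotted path such as "a.b.c" exists in obj.
function hasPath(obj, path) {
  let node = obj;
  for (const key of path.split('.')) {
    if (node === null || typeof node !== 'object' || !(key in node)) return false;
    node = node[key];
  }
  return true;
}

// Return a list of findings; an empty list means nothing was flagged.
function lintConfig(config, rules) {
  const findings = [];
  for (const key of Object.keys(config)) {
    if (!rules.knownTopLevelKeys.includes(key)) {
      findings.push({ level: 'warning', message: `unknown top-level key "${key}"` });
    }
  }
  for (const path of rules.deprecatedPaths) {
    if (hasPath(config, path)) {
      findings.push({ level: 'warning', message: `deprecated setting "${path}"` });
    }
  }
  return findings;
}

// Illustrative rules and config only; these key lists are assumptions.
const rules = {
  knownTopLevelKeys: ['imports', 'elasticsearch'],
  deprecatedPaths: ['elasticsearch.settings.index.index_concurrency']
};
const userConfig = {
  imports: {},
  elasticsearch: { settings: { index: { index_concurrency: 10 } } },
  importz: {} // deliberate typo, to show the unknown-key warning
};

for (const finding of lintConfig(userConfig, rules)) {
  console.log(`${finding.level}: ${finding.message}`);
}
```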

remove Joi?

I noticed that removing this line doesn't seem to affect the tests or functionality of this module.

I'm not super familiar with Joi, so I wasn't able to establish whether it's safe to remove it.
There doesn't seem to be any Joi config in this module.

@orangejulius do you remember how this works?

README does not match code behavior

See this: 5646ef6

Specifically the change that reads:

generate: generate.bind( null, true )

and from the README:

Note: by default the merge is shallow

Elasticsearch 5.x for example does not support the index_concurrency setting, so it would be nice to at least have the option to do a shallow merge. bind removes that option, so now, no matter what, the exported generate will always perform a deep merge.
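
The difference between the two merge styles can be sketched in a few lines. This is a standalone illustration, not the module's own merge code; the helper names and sample values are hypothetical.

```javascript
// Shallow merge: a top-level key in `overrides` replaces the default wholesale.
function shallowMerge(defaults, overrides) {
  return Object.assign({}, defaults, overrides);
}

function isPlainObject(value) {
  return value !== null && typeof value === 'object' && !Array.isArray(value);
}

// Deep merge: nested objects are merged key by key; other values are replaced.
function deepMerge(defaults, overrides) {
  const result = Object.assign({}, defaults);
  for (const [key, value] of Object.entries(overrides)) {
    result[key] = isPlainObject(value) && isPlainObject(result[key])
      ? deepMerge(result[key], value)
      : value;
  }
  return result;
}

// Hypothetical defaults and user config, for illustration only.
const defaults = { settings: { index: { index_concurrency: 10, refresh_interval: '10s' } } };
const user = { settings: { index: { refresh_interval: '30s' } } };

console.log(shallowMerge(defaults, user).settings.index); // the user's object replaces the default
console.log(deepMerge(defaults, user).settings.index);    // index_concurrency is kept from the defaults
```

With a deep merge, a user config cannot drop a default key simply by leaving it out, which is the situation described above for index_concurrency.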

Default whosonfirst.importPostalcodes to true?

This option has been something we've had enabled in Mapzen Search for quite some time, and feels pretty solid. Postalcodes are not that much data (3 million records), and many people like to search on them.

It might be reasonable to consider switching them on by default.

Thoughts on using nconf

Have we thought about using something like nconf for this? I really like being able to override config options from the command line or ENV vars when testing and whatnot. This is obviously very low priority, but was just curious. It also allows you to merge various config files, as we are doing here already.

Invalid apiVersion "1.7"

it appears that newer versions use "1.x" instead of "1.7".

version number here needs to be changed and published.

    x TypeError: Invalid apiVersion "1.7", expected a function or one of master, 1.x, 1.3, 1.2, 1.1, 1.0, 0.90
    x TypeError: Invalid apiVersion "1.7", expected a function or one of master, 1.x, 1.3, 1.2, 1.1, 1.0, 0.90

I wished it was easier to set hostname of ES...

I can't really find an easy solution for my scenario. I'm planning on using Pelias with an external Elasticsearch instance provisioned by AWS, with Pelias running in a Docker container. Normally this would be as easy as setting the hostname as an environment variable. But with the current config, the hostname is only defined in a JSON file, which doesn't leave many options:

  • pointing to a different .json file at run time: not useful in scenarios where ES is provisioned automatically by CloudFormation and you don't have the hostname at deployment time
  • using add-host with a fixed name: not friendly, since it requires an IP rather than a host name
  • the only solution seems to be patching the JSON at run time, replacing some "tag" string

My proposed solution would be to have the configuration use a search sequence, .json first and environment variables second, to support these scenarios. Any thoughts for or against?
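
One possible reading of this proposal is sketched below: load the values from the JSON file first, then let an environment variable replace the Elasticsearch host when it is set. The variable name PELIAS_ES_HOST, the `applyEnvOverrides` helper, and the `esclient.hosts[0].host` path are all assumptions made for the sketch, not existing behavior.

```javascript
// Return a copy of `config` with the Elasticsearch host replaced when the
// (hypothetical) PELIAS_ES_HOST variable is present in `env`.
function applyEnvOverrides(config, env) {
  const result = JSON.parse(JSON.stringify(config)); // simple deep copy for JSON data
  const host = env.PELIAS_ES_HOST;
  if (host) {
    result.esclient = result.esclient || {};
    const firstHost = (result.esclient.hosts || [])[0];
    // Keep other fields (such as the port) and swap only the host name.
    result.esclient.hosts = [Object.assign({}, firstHost, { host })];
  }
  return result;
}

// Values as they might be read from the JSON file (assumed shape).
const fromFile = { esclient: { hosts: [{ host: 'localhost', port: 9200 }] } };

// In a real process the second argument would be process.env.
const merged = applyEnvOverrides(fromFile, { PELIAS_ES_HOST: 'es.example.internal' });
console.log(merged.esclient.hosts[0]);
```

Passing the environment in as an argument keeps the function easy to test and avoids reading global state in more than one place.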

How to disable data sources?

Is it possible to choose only one data source? When I edit pelias.json to configure just the OSM importer, it still tries to import data from Who's on First.

Elasticsearch 5.x compatibility

So I have managed to get Pelias working with ES 5 by removing the following parameters:

  • index_concurrency and refresh_interval from settings.index
  • everything but 'type': 'geo_point' from the centroid.js partial (this is in the schema repo)

Check it out if you will.

Improve errors when JSON is invalid

Currently any issues with the pelias.json file cause the config code to output a very unclear error message that looks like this:

throw new Error( 'failed to merge config from path:' + path );

It doesn't give any indication that the problem is with the config file's syntax and definitely doesn't provide hints at the part of the file that has the syntax issue. There are many ways to validate JSON, surely we can bring one in and use it to create friendly error messages.
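
As a minimal sketch of the idea, the built-in JSON.parse already reports where parsing failed, so surfacing its message next to the file path would be a first step. The `parseConfig` helper and the sample path are hypothetical, and the exact wording of the parser message varies between Node.js versions.

```javascript
// Parse config text and, on failure, rethrow with the file path and the
// parser's own message, which usually points at the offending token.
function parseConfig(text, path) {
  try {
    return JSON.parse(text);
  } catch (err) {
    throw new Error(`failed to parse config from path: ${path}: ${err.message}`);
  }
}

try {
  // The trailing comma makes this invalid JSON.
  parseConfig('{ "imports": {}, }', '/hypothetical/pelias.json');
} catch (err) {
  console.error(err.message);
}
```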

An in-range update of jshint is breaking the build 🚨

Version 2.9.6 of jshint was just published.

Branch Build failing 🚨
Dependency jshint
Current Version 2.9.5
Type devDependency

This version is covered by your current version range and after updating it in your project the build failed.

jshint is a devDependency of this project. It might not break your production code or affect downstream projects, but probably breaks your build or test tools, which may prevent deploying or publishing.

Status Details
  • continuous-integration/travis-ci/push: The Travis CI build could not complete due to an error (Details).

Release Notes JSHint 2.9.6

2.9.6 (2018-07-30)

Bug Fixes

  • Add missing global objects for browser env (badc7a4)
  • Add other Fetch spec globals (07bb596), closes #2582
  • Allow closing over immutable bindings (7091685)
  • Allow computed method names in obj literal (a5ff715)
  • Allow empty export and trailing comma (631327e), closes #2567
  • Avoid infinite loop on invalid for stmt (56a4379)
  • Consistently ignore dot-prefixed dirs (8d4317e)
  • Correct impl of built-in bindings (a11d631)
  • Correct interpretation of whitespace (dd06eea)
  • Correct location of reported error (1c434a3)
  • Correct location reported for W043 (1d04868)
  • Correct reporting of var name in list comprehensions (0ff6644)
  • Correct restriction on function name (55aa54e)
  • Correct spelling of Uint8ClampedArray (8df4a32)
  • Create block scope for switch statements (aa2be10)
  • Disallow default values in rest parameters (b420aed)
  • Do not create binding for illegal syntax (9fe8c94)
  • Do not warn about non-ambiguous linebreaks (ab3ab85)
  • Fix "is is" message typos (7993101)
  • Preserve functionality in "legacy" Node.js (2f6ac13)
  • recognize Jasmine global spyOnProperty (827237f), closes #3183
  • Relax restriction on asgnmnt to arguments (0a66710)
  • Remove warning W100 (ff71d3c)
  • Report error for duplicate arrow params (506c7d5)
  • Report error for redeclared generator fns (8896fa3)
  • Restrict "name" of strict mode functions (a554c89)
  • Restrict super usage to valid forms (8f3f880)
  • Restrict IdentifierNames in ES5 code (5995a9f)
  • Tolerate division following closing brace (3aa02db)
  • Tolerate RegExp as void operand (3f920b5)
  • Tolerate whitespace in inline directives (efeb0f8)

Features

  • List outer scoped variables of W083 (d03662c), closes #3211

Commits

The new version differs by 113 commits.

  • d5c1a00 v2.9.6
  • ab3ab85 [[FIX]] Do not warn about non-ambiguous linebreaks
  • eaca85b [[CHORE]] Improve test coverage for ASI warning
  • 0a66710 [[FIX]] Relax restriction on asgnmnt to arguments
  • 3aa02db [[FIX]] Tolerate division following closing brace
  • 55aa54e [[FIX]] Correct restriction on function name
  • ff71d3c [[FIX]] Remove warning W100
  • bcb3b23 [[CHORE]] Complete Lodash update (#3283)
  • 030713d [[DOCS]] Introduce administration e-mail address
  • 7993101 [[FIX]] Fix "is is" message typos
  • 578575d Merge pull request #3254 from mathiasbynens/unicode-10
  • d763e70 Use old Unicode version for ES5 identifiers
  • 77414e8 Update to Unicode v11
  • 5995a9f [[FIX]] Restrict IdentifierNames in ES5 code
  • f2ce8fe [[TEST]] Add regression test

There are 113 commits in total.

See the full diff

Create an example starter pelias.json

Currently, our README here advises people to copy our full default.json file and use it as a starter pelias.json. This is not really good because it locks people into all the defaults, and forces them to start with a super long config.

While it would be a little more work, ideally we would have a "starter" pelias.json explicitly for people to copy/paste as their initial pelias.json. This would contain only a general framework of the full config and only the parameters that we expect people to override (like paths, etc). This would ensure that as we change the defaults, they will be propagated to most users' configs automatically over time.

An in-range update of joi is breaking the build 🚨

Version 13.5.0 of joi was just published.

Branch Build failing 🚨
Dependency joi
Current Version 13.4.0
Type dependency

This version is covered by your current version range and after updating it in your project the build failed.

joi is a direct dependency of this project, and it is very likely causing it to break. If other packages depend on yours, this update is probably also breaking those in turn.

Status Details
  • continuous-integration/travis-ci/push: The Travis CI build could not complete due to an error (Details).

Commits

The new version differs by 21 commits.

  • 63492d4 13.5.0
  • 334c1e3 Cleanup for #1532.
  • 3372df0 Merge pull request #1532 from rokoroku/patch-1
  • 3414eb7 Update documentation for string.trim([enabled])
  • 0a82b61 Add assertion for string.trim()
  • bcc5f12 Cleanup for #1510.
  • 8b39221 Merge pull request #1510 from Shudrum/dataUri
  • 2391f72 Cleanup for #1487.
  • 7aa0df0 Merge pull request #1487 from BolajiOlajide/ft-allow-square-brackets-param-url-validator
  • 37d3588 Add createError documentation. Fixes #999.
  • 77012b2 Add enabled flag to string.trim()
  • 8eefd0d Don't initialize options uselessly
  • 52fd99b Padding option added to dataUri like base64
  • 840eaad Move the dataUri tests after the base64 one
  • 83eb8eb Merge pull request #1511 from WesTyler/unique_ignoreUndefined_#1498

There are 21 commits in total.

See the full diff

config in YML format

good idea considering the size of config in pelias.json file!
I think that can be simply refact

Provide a simple API for retrieving properties from config

Code which consumes config has the responsibility of first checking properties exist in order to avoid crashes/errors.

Some scripts use lodash to do things like _.isEmpty(config.imports.openaddresses.files).

Other scripts are much more verbose, with a lot of vanillajs ™️ if/else checks such as:

  // check pelias-config for a list of blacklist files to load
  const settings = config.generate();

  // config does not contain the relevant properties
  // return a no-op passthrough stream
  if( !settings.imports || !settings.imports.blacklist ){
    return through.obj();
  }

  // config does not contain a valid list of files
  // return a no-op passthrough stream
  const bl = settings.imports.blacklist;
  if( !Array.isArray( bl.files ) || bl.files.length === 0 ){
    return through.obj();
  }

What would be cool is if pelias/config offered an API which could be used to do checks similar to _.isEmpty and retrieve properties as per _.get.

It would probably be easiest to just include lodash in this repo so we don't have to reinvent the wheel.

It would then be nice to go through our repos and clean up the if/else checks and make them prettier.
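
To show the shape such an API could take, here is a dependency-free sketch of a `get` helper in the spirit of _.get, applied to the blacklist check above. The helper is a hypothetical stand-in written for this example; using lodash itself, as suggested, would avoid maintaining it.

```javascript
// Read a nested value by dotted path, returning `fallback` when any step is missing.
function get(obj, path, fallback) {
  let node = obj;
  for (const key of path.split('.')) {
    if (node === null || typeof node !== 'object' || !(key in node)) {
      return fallback;
    }
    node = node[key];
  }
  return node === undefined ? fallback : node;
}

// Stand-in for the object returned by config.generate().
const settings = { imports: { blacklist: { files: [] } } };

// The two if/else blocks collapse into one lookup plus one check.
const files = get(settings, 'imports.blacklist.files', []);
const hasBlacklist = Array.isArray(files) && files.length > 0;

console.log(hasBlacklist);
```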
