CHF-TopoResolver

This repository is the implementation of the paper "A Coherent Unsupervised Model for Toponym Resolution".

Installation

You need Java 8 (or higher) to use the toponym resolver. All library dependencies are specified in a Maven pom file.

The GeoNames data are stored in SQLite and Redis.

We provide a Python 3 script for Linux machines that lays the groundwork. For a quick start, just run

   python3 install.py

The script downloads GeoNames data, Redis and Apache Maven. After extracting the downloaded files, it prepares a Redis instance by compiling and running it on port 6384 by default.

Once done, the installer runs an importer using Maven to build the SQLite database and initialize the Redis keys. Installing the requirements takes less than 5 minutes, and importing the databases takes roughly 30 minutes (tested on Ubuntu 14.04 with 4 CPU cores and 8 GB of memory).

If you already have a Redis instance and do not need a new one, pass the host and port of the existing instance using the following command:

   python3 install.py --no_redis --redis_port <PORT> --redis_host <HOST> 

This will bypass the Redis installation part.
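Before pointing the installer at an existing Redis instance, it can help to confirm that the instance is reachable. The following is a minimal sketch, not part of this repository: it opens a plain socket and sends Redis's inline `PING` command, so it needs no client library. The class name `RedisPing` and the timeout value are assumptions chosen for illustration; the default host and port mirror the installer defaults.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of CHF-TopoResolver): checks that a Redis
// server answers on the given host and port.
class RedisPing {

    /** Sends an inline PING and returns true only if the first reply line is +PONG. */
    static boolean ping(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            socket.setSoTimeout(timeoutMillis);

            OutputStream out = socket.getOutputStream();
            out.write("PING\r\n".getBytes(StandardCharsets.US_ASCII));
            out.flush();

            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));
            return "+PONG".equals(in.readLine());
        } catch (IOException e) {
            // Connection refused, timeout, or any other I/O failure.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults mirror the installer defaults: localhost and port 6384.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 6384;
        System.out.println(ping(host, port, 2000)
                ? "Redis is reachable"
                : "Redis is not reachable");
    }
}
```

A server that requires authentication replies with an error instead of `+PONG`, so this check reports it as not reachable; in that case a real Redis client would be the better tool.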

Here are more details about the arguments for the install script:

    --no_redis        Skip the Redis installation (not recommended)
    --redis_port      Redis port (default: 6384)
    --redis_host      Redis host (default: localhost)
    --redis_url       Redis URL to download
    --geonames_url    GeoNames data URL to download
    --maven_url       Apache Maven URL to download

Default values are provided for the above URLs. If any of the links is broken, you can pass a new URL using the corresponding argument.
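If the installer is launched from a Java-based setup tool, the documented flags can be assembled programmatically. The sketch below is only an illustration under that assumption: the class `InstallCommand` and the mirror URL are hypothetical, and only the flag names come from the list above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper (not part of CHF-TopoResolver): builds the argument
// list for "python3 install.py" from the documented flags.
class InstallCommand {

    // Flags that take a value, as listed in the installer's argument table.
    private static final Set<String> VALUE_FLAGS = new HashSet<>(Arrays.asList(
            "--redis_port", "--redis_host", "--redis_url", "--geonames_url", "--maven_url"));

    /** Returns the command as a list of tokens, rejecting flags that are not documented. */
    static List<String> build(boolean noRedis, Map<String, String> options) {
        List<String> command = new ArrayList<>(Arrays.asList("python3", "install.py"));
        if (noRedis) {
            command.add("--no_redis");
        }
        for (Map.Entry<String, String> option : options.entrySet()) {
            if (!VALUE_FLAGS.contains(option.getKey())) {
                throw new IllegalArgumentException("Unknown installer flag: " + option.getKey());
            }
            command.add(option.getKey());
            command.add(option.getValue());
        }
        return command;
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        // Placeholder mirror URL for illustration only; replace it with a working link.
        options.put("--maven_url", "https://mirror.example.org/apache-maven-bin.tar.gz");
        System.out.println(String.join(" ", build(false, options)));
    }
}
```

The resulting list can be handed to `new ProcessBuilder(command).inheritIO().start()` to run the installer from the directory that contains `install.py`.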

Getting Started

You can create a GeoTagger instance for toponym recognition and resolution. Here is the simplest way to extract toponyms:

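// Create a tagger with the default settings.
GeoTagger geoTagger = new GeoTagger();
// Pass the text to analyze as the argument.
List<Toponym> toponyms = geoTagger.extractToponyms("");
for (Toponym toponym : toponyms)
    // The two %.2f placeholders expect the latitude and longitude of the toponym.
    System.out.printf("%s located at (%.2f, %.2f)\n", toponym.getPhrase(), );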

By default, the Context Hierarchy Fusion (CHF) method is used to resolve toponyms; refer to the paper for more details.
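To illustrate the output format used in the loop above, here is a self-contained sketch that does not depend on the library. `ResolvedPlace` is a hypothetical stand-in, not the library's `Toponym` class, and its field names are assumptions; the coordinates are approximate values typed in by hand.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustration only: shows the "phrase located at (lat, lon)" format
// without calling the toponym resolver.
class ToponymReport {

    /** Hypothetical stand-in for a resolved toponym (not the library's Toponym class). */
    static final class ResolvedPlace {
        final String phrase;
        final double latitude;
        final double longitude;

        ResolvedPlace(String phrase, double latitude, double longitude) {
            this.phrase = phrase;
            this.latitude = latitude;
            this.longitude = longitude;
        }
    }

    /** Formats one place with two decimals; Locale.ROOT keeps the decimal point a dot. */
    static String describe(ResolvedPlace place) {
        return String.format(Locale.ROOT, "%s located at (%.2f, %.2f)",
                place.phrase, place.latitude, place.longitude);
    }

    public static void main(String[] args) {
        // Hand-written sample data, not output of the resolver.
        List<ResolvedPlace> places = Arrays.asList(
                new ResolvedPlace("Lyon", 45.764, 4.8357),
                new ResolvedPlace("Geneva", 46.2044, 6.1432));
        for (ResolvedPlace place : places) {
            System.out.println(describe(place));
        }
    }
}
```

Passing an explicit locale avoids a comma as the decimal separator on machines whose default locale uses one.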

Reference

If you find the code useful, please cite the following paper:

@inproceedings{Kamalloo2018,
 author = {Kamalloo, Ehsan and Rafiei, Davood},
 title = {A Coherent Unsupervised Model for Toponym Resolution},
 booktitle = {Proceedings of the 2018 World Wide Web Conference},
 series = {WWW '18},
 year = {2018},
 isbn = {978-1-4503-5639-8},
 location = {Lyon, France},
 pages = {1287--1296},
 numpages = {10},
 url = {https://doi.org/10.1145/3178876.3186027},
 doi = {10.1145/3178876.3186027},
 acmid = {3186027},
 publisher = {International World Wide Web Conferences Steering Committee},
 address = {Republic and Canton of Geneva, Switzerland},
 keywords = {context-bound hypotheses, geolocation extraction, spatial hierarchies, toponym resolution, unsupervised disambiguation},
}
