Git Product home page Git Product logo

lstm-trajgan's Introduction

LSTM-TrajGAN

LSTM-TrajGAN: A Deep Learning Approach to Trajectory Generation and Privacy Protection

Abstract

The prevalence of location-based services contributes to the explosive growth of individual-level location trajectory data and raises public concerns about privacy issues. In this research, we propose a novel LSTM-TrajGAN approach, which is an end-to-end deep learning model to generate privacy-preserving synthetic trajectory data for data sharing and publication. We design a loss metric function TrajLoss to measure the trajectory similarity losses for model training and optimization. The model is evaluated on the trajectory-user-linking task on a real-world semantic trajectory dataset. Compared with other common geomasking methods, our model can better prevent users from being re-identified, and it also preserves essential spatial, temporal, and thematic characteristics of the real trajectory data. The model better balances the effectiveness of trajectory privacy protection and the utility for spatial and temporal analyses, which offers new insights into the GeoAI-powered privacy protection for human mobility studies.

workflow

trajectory_example

Reference

If you find our code or ideas useful for your research, please cite our paper:

Rao, J., Gao, S.*, Kang, Y. and Huang, Q. (2020). LSTM-TrajGAN: A Deep Learning Approach to Trajectory Privacy Protection. In the Proceedings of the 11th International Conference on Geographic Information Science (GIScience 2021), 12:1--12:17.

@InProceedings{rao_et_al:LIPIcs:2020:13047,
  author =	{Jinmeng Rao and Song Gao and Yuhao Kang and Qunying Huang},
  title =	{{LSTM-TrajGAN: A Deep Learning Approach to Trajectory Privacy Protection}},
  booktitle =	{11th International Conference on Geographic Information Science (GIScience 2021) - Part I},
  pages =	{12:1--12:17},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-166-5},
  ISSN =	{1868-8969},
  year =	{2020},
  volume =	{177},
  editor =	{Krzysztof Janowicz and Judith A. Verstegen},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2020/13047},
  URN =		{urn:nbn:de:0030-drops-130471},
  doi =		{10.4230/LIPIcs.GIScience.2021.I.12},
  annote =	{Keywords: GeoAI, Deep Learning, Trajectory Privacy, Generative Adversarial Networks}
}

Requirements

LSTM-TrajGAN uses the following packages with Python 3.6.3

  • numpy==1.18.4
  • pandas==1.1.5
  • tensorflow-gpu==1.13.1
  • Keras==2.2.4
  • geohash2==1.1
  • scikit-learn==0.23.2

Usage

Data Encoding

Trajectory_Point_Encoding

Convert csv files to one-hot-encoded npy files.

python data/csv2npy.py --load_path dev_train_encoded_final.csv --save_path train_encoded.npy --tid_col tid

Where load_path is the path to csv file, save_path is the path to save npy file, tid_col is the column name of trajectory id.

Training

Train the LSTM-TrajGAN model using the preprocessed data.

python train.py 2000 256 100

Where 2000 is the total training epochs, 256 is the batch size, 100 is the parameter saving interval (i.e., save params every 100 epochs).

Prediction

Generate synthetic trajectory data based on the real test trajectory data and save them to results/syn_traj_test.csv.

python predict.py 1900

Where 1900 means we load the params file saved at the 1900th epoch to generate synthetic trajectory data.

Test

Evaluate the synthetic trajectory data on the Trajectory-User Linking task using MARC.

python TUL_test.py data/train_latlon.csv results/syn_traj_test.csv 100

Where data/train_latlon.csv is the training data, results/syn_traj_test.csv is the synthetic test data, 100 is the embedder size.

Dataset

The data we used in our paper originally come from the Foursquare NYC check-in dataset.

References

We mainly referred to these two works:

May Petry, L., Leite Da Silva, C., Esuli, A., Renso, C., and Bogorny, V. (2020). MARC: a robust method for multiple-aspect trajectory classification via space, time, and semantic embeddings. International Journal of Geographical Information Science, 34(7), 1428-1450. Github

Keras-GAN: Collection of Keras implementations of Generative Adversarial Networks (GANs). Github

lstm-trajgan's People

Contributors

gissong avatar phantask avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar

lstm-trajgan's Issues

I get this error when I run this line of code:python3 TUL_test.py data/train_latlon.csv results/syn_traj_test.csv 100

Traceback (most recent call last):
  File "TUL_test1.py", line 305, in <module>
    pred_y_test = np.array(classifier2.predict(cls_x_test))
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 1169, in predict
    steps=steps)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training_arrays.py", line 294, in predict_loop
    batch_outs = f(ins_batch)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1439, in __call__
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[28,126] = 16 is not in [0, 10)
	 [[{{node emb_category_1/embedding_lookup}}]]

Code problem

When I read your code, I noticed that you did not publish the preprocessing for the real track. Later in the paper, I saw that spatial embedding is realized by MLPs. But I still do not know how to deal with, I would like to have a look at the source code you wrote. I hope to get your help. Thank you!

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.