
DeepTTE's Introduction

This repository contains the code of the AAAI 2018 paper When Will You Arrive? Estimating Travel Time Based on Deep Neural Networks.

We provide the complete version of the code and part of the sample data from Chengdu. You can easily replace the sample data with your own data; see the samples in data/ for more details. The complete data can be downloaded at https://duke.box.com/s/ni5ca8iktneq828fk5cul8afwkvszkdr , which is provided by the following competition: http://www.dcjingsai.com/common/cmpt/%E4%BA%A4%E9%80%9A%E7%BA%BF%E8%B7%AF%E9%80%9A%E8%BE%BE%E6%97%B6%E9%97%B4%E9%A2%84%E6%B5%8B_%E8%B5%9B%E4%BD%93%E4%B8%8E%E6%95%B0%E6%8D%AE.html.

Usage:

Model Training

python main.py --task train

Parameters:

  • task: train/test
  • batch_size: the batch size used for training; default 400
  • epochs: the number of training epochs; default 100
  • kernel_size: the kernel size of the Geo-Conv layer; only used when the model contains the Geo-Conv part
  • pooling_method: attention/mean
  • alpha: the combination weight used in multi-task learning
  • log_file: the path of the log file
  • result_file: the path to save the prediction result; by default this is disabled during training

Example:

python main.py --task train  --batch_size 10  --result_file ./result/deeptte.res --pooling_method attention --kernel_size 3 --alpha 0.1 --log_file run_log

Model Evaluation

Parameters:

  • weight_file: the path of the saved model weights
  • result_file: the path to save the result

Example:

python main.py --task test --weight_file ./saved_weights/weight --batch_size 10  --result_file ./result/deeptte.res --pooling_method attention --kernel_size 3 --alpha 0.1

How to Use Your Own Data

In the data folder we provide some sample data. You can use your own data as long as it follows the same format as the data samples. The sample data contains 1,800 trajectories. To bring the model performance close to the results reported in the paper, make sure your dataset contains more than 5M trajectories.

Format Instructions

Each sample is a JSON string with the following keys (an illustrative example is shown after the format notes below):

  • driverID
  • dateID: the date within a month, from 0 to 30
  • weekID: the day of the week, from 0 to 6 (Mon to Sun)
  • timeID: the ID of the start time (in minutes), from 0 to 1439
  • dist: total distance of the path (km)
  • time: total travel time (min), i.e., the ground truth. You can set it to any value during the test phase
  • lngs: the sequence of longitudes of all sampled GPS points
  • lats: the sequence of latitudes of all sampled GPS points
  • states: the sequence of taxi states (available/unavailable). You can remove this attribute if it is not available in your dataset; see models/base/Attr.py for details
  • time_gap: the same length as lngs. Each value indicates the time gap from the current point to the first point (set it to arbitrary values during the test)
  • dist_gap: the same as time_gap, but for distance

The GPS points in a path should be resampled so that consecutive points are separated by nearly equal distances.
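For illustration, a single sample could be written from Python like this; the field names follow the list above, while every numeric value is made up and the output file name is hypothetical:

    import json

    # Illustrative sample only: field names follow the format description above;
    # all numeric values are invented.
    sample = {
        "driverID": 1,
        "dateID": 14,                      # day of the month, 0-30
        "weekID": 2,                       # day of the week, 0-6 (Mon-Sun)
        "timeID": 495,                     # start minute of the day, 0-1439
        "dist": 1.2,                       # total distance (km)
        "time": 6.5,                       # total travel time (the ground truth)
        "lngs": [104.066, 104.069, 104.072, 104.075],
        "lats": [30.659, 30.661, 30.663, 30.665],
        "states": [1.0, 1.0, 1.0, 1.0],    # optional; drop only if you adapt Attr.py
        "time_gap": [0.0, 2.1, 4.3, 6.5],  # elapsed time from the first point
        "dist_gap": [0.0, 0.4, 0.8, 1.2],  # cumulative distance from the first point
    }

    # Each line of a training/testing file holds one such JSON string.
    with open("my_train_00", "w") as f:    # hypothetical file name
        f.write(json.dumps(sample) + "\n")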

Furthermore, update the config file according to your own data, including dist_mean, time_mean, lngs_mean, etc.
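As a sketch (not code from the repository), the mean statistics could be recomputed from your own training files roughly like this; the *_std key names and the exact set of fields config.json expects are assumptions you should check against the shipped config:

    import json
    import numpy as np

    # Sketch: recompute normalization statistics from your own data files.
    # Check config.json in the repository for the exact field names it expects;
    # the *_std keys and the inclusion of lats here are assumptions.
    def compute_stats(files):
        dist, time, lngs, lats = [], [], [], []
        for path in files:
            with open(path) as f:
                for line in f:
                    s = json.loads(line)
                    dist.append(s["dist"])
                    time.append(s["time"])
                    lngs.extend(s["lngs"])
                    lats.extend(s["lats"])
        return {
            "dist_mean": float(np.mean(dist)), "dist_std": float(np.std(dist)),
            "time_mean": float(np.mean(time)), "time_std": float(np.std(time)),
            "lngs_mean": float(np.mean(lngs)), "lngs_std": float(np.std(lngs)),
            "lats_mean": float(np.mean(lats)), "lats_std": float(np.std(lats)),
        }

    print(compute_stats(["my_train_00"]))  # hypothetical file name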

DeepTTE's People

Contributors

caow13, dawn90

DeepTTE's Issues

TypeError: object of type 'map' has no len()

Hi everyone, I ran into this error when running main.py with the sample data files [train 00-04 and test]. Can anyone suggest how to fix this issue? Thanks in advance.

Traceback (most recent call last):
File "C:/....../DeepTTE-master-2/main.py", line 172, in
run()
File "C:/........./DeepTTE-master-2/main.py", line 168, in run
evaluate(model, elogger, config['test_set'], save_result=True)
File "C:/......./DeepTTE-master-2/main.py", line 119, in evaluate
data_iter = data_loader.get_loader(input_file, args.batch_size)
File "C:.........\DeepTTE-master-2\data_loader.py", line 105, in get_loader
batch_sampler = BatchSampler(dataset, batch_size)
File "C:.........\DeepTTE-master-2\data_loader.py", line 70, in init
self.count = len(dataset)
File "C:..........\DeepTTE-master-2\data_loader.py", line 29, in len
return len(self.content)
TypeError: object of type 'map' has no len()
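Assuming the sample data is being loaded with Python 3, this is the classic Python 2 to 3 difference: map() now returns a lazy iterator, which has no len(). A minimal self-contained illustration of the error and the fix:

    import json

    lines = ['{"dist": 1.2}', '{"dist": 0.8}']

    content = map(json.loads, lines)        # Python 3: a lazy map object
    # len(content)                          # TypeError: object of type 'map' has no len()

    content = list(map(json.loads, lines))  # materialize it into a list
    print(len(content))                     # 2

Wrapping the map(...) calls in data_loader.py (and anywhere else the code was written for Python 2) in list(...) should resolve the traceback.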

Two questions about the distance gap

Very nice work!

Two questions:

  1. You said that: "The GPS devices usually generate one record for every fixed length time gap ....... To avoid such case, we resample each historical trajectory such that the distance gap between two consecutive points are around 200 to 400 meters."

Additionally, you said that the distance gaps in the testing data are equal.
Are the distance gaps between two consecutive points in the training data equal? Or is each distance gap in the training data different, with each one randomly falling between 200 and 400 meters?

  2. You said that: "During the test phase, to make the testing data consistent
    with the training data, we convert a path P to a sequence of
    location points with equal distance gaps."

How do you convert the path P to a sequence of location points with equal distance gaps? Could you please publish the code that achieves this? You encourage users to use their own data, but we cannot do so if we don't know how to convert the path P into a sequence of location points with equal distance gaps.

Thank you for your answer.
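For what it is worth, the repository does not include the resampling code, but a rough sketch of one possible approach is below. It greedily keeps raw GPS fixes once they are at least a target distance apart (no interpolation), uses the great-circle (haversine) distance, and takes 300 m as an illustrative gap; none of these choices are confirmed to match the authors' preprocessing.

    import math

    def haversine_km(lng1, lat1, lng2, lat2):
        """Great-circle distance between two GPS fixes, in kilometres."""
        lng1, lat1, lng2, lat2 = map(math.radians, (lng1, lat1, lng2, lat2))
        a = (math.sin((lat2 - lat1) / 2) ** 2
             + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
        return 2 * 6371.0 * math.asin(math.sqrt(a))

    def resample(lngs, lats, gap_km=0.3):
        """Keep a fix only when it is at least gap_km from the last kept fix."""
        kept = [0]
        for i in range(1, len(lngs)):
            j = kept[-1]
            if haversine_km(lngs[j], lats[j], lngs[i], lats[i]) >= gap_km:
                kept.append(i)
        if kept[-1] != len(lngs) - 1:
            kept.append(len(lngs) - 1)  # always keep the end point of the path
        return [lngs[i] for i in kept], [lats[i] for i in kept]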

Normalisation of global and local distances

Thank you for the great work!

I would like to ask why we need to normalize local_dist in models.base.GeoConv.Net.forward() and dist in models.base.Attr.Net.forward(). Haven't we already done so in main.py when calling data_loader.get_loader?

Question about the data normalization

I found that the 'time_gap' and 'dist_gap' values in the sample data are measured relative to the first point, but the statistics for these features in config.json are computed over the intervals between consecutive points, and the features are normalized using the values in config.json. So why are 'time_gap' and 'dist_gap' measured from the first point rather than as intervals between consecutive sampled points?

Found an element in 'lengths' that is <=0 in evaluation

Hello everyone, I have an issue while running the code: at first everything seems fine, the training runs and loss values are printed. But after some time I get this error:

Progress 0.07%, average loss 0.07827574014663696 Traceback (most recent call last):
File "/home/s6haoeze/DeepTTE/main.py", line 329, in
run()
File "/home/s6haoeze/DeepTTE/main.py", line 319, in run
train(model, elogger, train_set = config['train_set'], eval_set = config['eval_set'])
File "/home/s6haoeze/DeepTTE/main.py", line 241, in train
_, loss = model.eval_on_batch(attr, traj, config)
File "/home/s6haoeze/DeepTTE/models/DeepTTE.py", line 137, in eval_on_batch
entire_out, (local_out, local_length) = self(attr, traj, config)
File "/home/s6haoeze/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/s6haoeze/DeepTTE/models/DeepTTE.py", line 124, in forward
sptm_s, sptm_l, sptm_t = self.spatio_temporal(traj, attr_t, config)
File "/home/s6haoeze/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/s6haoeze/DeepTTE/models/base/SpatioTemporal.py", line 92, in forward
packed_inputs = nn.utils.rnn.pack_padded_sequence(conv_locs, lens, batch_first = True)
File "/home/s6haoeze/anaconda3/lib/python3.9/site-packages/torch/nn/utils/rnn.py", line 262, in pack_padded_sequence
_VF._pack_padded_sequence(input, lengths, batch_first)
RuntimeError: Length of all samples has to be greater than 0, but found an element in 'lengths' that is <= 0

Although all trajectories in my dataset are longer than 18 points, it seems that somewhere in the code some length ends up being interpreted as too short (the line lens = [x - self.kernel_size + 1 for x in traj['lens']] in SpatioTemporal.py). What can be done? Thanks a lot!
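This error is raised by pack_padded_sequence whenever a sample's effective length (traj_len - kernel_size + 1) drops to zero or below, so one hedged pre-check, outside the repository code, is to filter out any sample whose GPS sequence is shorter than the kernel size before training (file names below are hypothetical):

    import json

    KERNEL_SIZE = 3            # must match the --kernel_size used for training

    def filter_file(src, dst, min_points=KERNEL_SIZE):
        """Drop samples whose GPS sequence is too short to survive the Geo-Conv."""
        with open(src) as fin, open(dst, "w") as fout:
            for line in fin:
                sample = json.loads(line)
                if len(sample["lngs"]) >= min_points:
                    fout.write(line)

    filter_file("my_train_00", "my_train_00.filtered")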

Training on data without 'state' attribute

If my dataset doesn't contain the state attribute, how can I still train the model? Attr.py doesn't seem to offer any clues. My data doesn't have a 'states' attribute; do I need to add one and set it to blanks? Right now, when I try to train the model, I get a KeyError in the data loader, and removing 'states' from the keys to use then yields an index-out-of-range error in Attr.py.
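One workaround that matches the format description above (an assumption, not an official recommendation): keep the 'states' key but fill it with a constant sequence of the same length as lngs, so the expected input shape is preserved even though the attribute carries no information. File names below are hypothetical.

    import json

    # Add a constant 'states' sequence to every sample, matching the length of
    # 'lngs'. This only preserves the expected input shape; it carries no signal.
    with open("my_train_00") as fin, open("my_train_00.with_states", "w") as fout:
        for line in fin:
            sample = json.loads(line)
            sample.setdefault("states", [0.0] * len(sample["lngs"]))
            fout.write(json.dumps(sample) + "\n")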

Can you add setup.py?

Would love to use this in a notebook environment via !pip install. Would you be able to add a setup.py?
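A minimal sketch of what such a file could look like; the package name, version, and dependency list are assumptions, and the repository layout would need to be packageable (an __init__.py per module directory) for find_packages() to pick everything up:

    # setup.py -- minimal sketch; name, version and install_requires are guesses.
    from setuptools import setup, find_packages

    setup(
        name="deeptte",
        version="0.1.0",
        description="DeepTTE: travel time estimation with deep neural networks",
        packages=find_packages(),
        install_requires=["torch", "numpy"],
    )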

No run.log File

When I run the training command, it reports an error about a missing run_log.log file. Can you help me solve this problem? Thanks.
Error details:
(torch2.0) nlg@ubuntu:/PycharmProj2/DeepTTE-master$ python main.py --task train --batch_size 10 --result_file ./result/deeptte.res --pooling_method attention --kernel_size 3 --alpha 0.1 --log_file run_log
Traceback (most recent call last):
File "main.py", line 161, in
run()
File "main.py", line 148, in run
elogger = logger.Logger(args.log_file)
File "/home/nlg/PycharmProj2/DeepTTE-master/logger.py", line 5, in init
self.file = open('./logs/{}.log'.format(exp_name), 'w')
IOError: [Errno 2] No such file or directory: './logs/run_log.log'
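logger.py opens ./logs/<log_file>.log for writing, so the error simply means the logs/ directory does not exist. One way to avoid this class of error (a small helper, not part of the repository) is to create the directories that the example commands and tracebacks reference before the first run:

    import os

    # The example commands write to these folders, but the repository may not
    # ship them as empty directories; create them once before training.
    for d in ("logs", "result", "saved_weights"):
        os.makedirs(d, exist_ok=True)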

dist normalize twice

I found that dist is normalized both in the data loader and in the Attr layer.
Attr layer:
dist = utils.normalize(attr['dist'], 'dist')
em_list.append(dist.view(-1, 1))

Data loader:
for key in stat_attrs:
x = torch.FloatTensor([item[key] for item in data])
attr[key] = utils.normalize(x, key)

which may lead to problems

GeoConv layer and Linear(4,16)

Hello, and great work. I was reading GeoConv.py and I am wondering why a Linear(4, 16) is used before the Conv1D.

Why would someone want to expand the dimensionality and transform the given vector (loc_i) into a 16-dimensional one? Is there an intuitive explanation for this?

Thanks for your time

Question about Distance Gap

Hi, thank you for the wonderful work. I think this is the first work I have seen that brings deep learning into the area of travel time estimation. I just have one question I would like to ask:

Regarding the resampling of the data, I am curious how the distance gap between two points is calculated. Is it the on-road distance between the two points, or the great-circle distance based on their GPS coordinates?

Thank you.

Weight File

What is the weight file?

FileNotFoundError: [Errno 2] No such file or directory: '/saved_weights'

Estimating travel time between a given start point and end point ( Long distance gap)

Hi Team,

First off, great work. This has been really helpful.
Would be great to get your inputs on a couple of points.

  1. While trying to estimate the travel time between a start point and an end point, is it necessary to provide
    the expected trajectory that the vehicle will follow?
    Is any workaround for that possible?

  2. I am trying this out for long distances.
    Say we have A as the start point and B as the end point, with a total distance of 500 km.
    At the start of the trip we provide the expected trajectory between A and B to estimate the travel time.
    Then the vehicle covers, say, 150 km and reaches an intermediate point C.
    Is it now recommended to compute a new ETA using C as the starting point?

Is there any way to also incorporate the observed driving pattern from A to C when predicting the driving pattern from C to B?

Can this model handle ETAs for roads with traffic jams?

If the distance between two consecutive track points must be around 200 to 400 meters, points recorded while moving very slowly, or stay points, are thrown away. Does that mean the model may lose the ability to predict travel times on roads that suffer from traffic jams, because many points are discarded?

The complete data

Thanks for your work on the paper!
I'm a college student studying travel time prediction. The complete-data link mentioned in the README no longer works, which is a pity.
Can you tell me how to download the complete data? Thanks a lot.

Problem about training on the GPU

I ran the code successfully on the pytorch-cpu-windows-py37 platform, but when trying to use the GPU I ran into a problem. During training, the first epoch was normal, but an error occurred when the second epoch started.
I don't understand why "cudnn RNN backward can only be called in training mode" appears; I'm sure I was training the model, because the "task" argument was "train". The error occurs only on the GPU with the same code.
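For what it's worth, this error usually means loss.backward() is reached while the model is still in eval mode (cuDNN RNNs only support backward in training mode), which can happen if the evaluation pass at the end of the first epoch leaves the model in eval(). A pattern sketch, with placeholder names rather than the exact code in main.py:

    # Placeholder names (model, optimizer, data_iter, config, num_epochs); the
    # point is the model.train() call at the top of every epoch, undoing the
    # eval() set by the evaluation pass. cuDNN RNNs refuse backward() in eval mode.
    for epoch in range(num_epochs):
        model.train()
        for attr, traj in data_iter:
            _, loss = model.eval_on_batch(attr, traj, config)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        model.eval()
        # ... evaluate on the validation set ...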

dataset

Hi,

Can you please release some test data?

Thanks.

Can the code run on Python 3.6?

I have tried for some time to run it on win10/cuda9.1.85/anaconda4.5.4/python3.6.5 but failed. After dealing with some small bugs caused by the Python 2 to 3 migration, I am stuck on the following problem:

E:\workbench\DeepTTE-master>python main.py --task train --batch_size 10 --result_file ./result/deeptte.res --pooling_method attention --kernel_size 3 --alpha 0.1 --log_file run_log
Training on epoch 0
Train on file train_00
Traceback (most recent call last):
File "main.py", line 161, in
run()
File "main.py", line 151, in run
train(model, elogger, train_set = config['train_set'], eval_set = config['eval_set'])
File "main.py", line 69, in train
for idx, (attr, traj) in enumerate(data_iter):
File "D:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 451, in iter
return _DataLoaderIter(self)
File "D:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 239, in init
w.start()
File "D:\Anaconda3\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "D:\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "D:\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "D:\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 65, in init
reduction.dump(process_obj, to_child)
File "D:\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'MySet.init..'

Can anybody give some suggestions?
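On Windows the DataLoader workers are started with spawn, so the whole dataset object gets pickled, and any lambda or locally defined function held by MySet cannot be pickled. A common workaround (placeholder names; not a verbatim patch to data_loader.py) is to load data in the main process instead of worker processes:

    from torch.utils.data import DataLoader

    # Workaround sketch for Windows: num_workers=0 keeps data loading in the
    # main process, so the dataset (and any lambdas it holds) is never pickled.
    # dataset / batch_sampler / collate_fn stand for whatever get_loader in
    # data_loader.py already constructs.
    loader = DataLoader(dataset,
                        batch_sampler=batch_sampler,
                        collate_fn=collate_fn,
                        num_workers=0)

Alternatively, replacing the lambdas in the dataset with module-level named functions also makes it picklable.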

backward_input can only be called in training mode

Hi, I ran the command in the README and got the following error; can you help me?

Traceback (most recent call last):
File "main.py", line 161, in
run()
File "main.py", line 151, in run
train(model, elogger, train_set = config['train_set'], eval_set = config['eval_set'])
File "main.py", line 77, in train
loss.backward()
File "/home/dsc/.local/lib/python2.7/site-packages/torch/tensor.py", line 93, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/home/dsc/.local/lib/python2.7/site-packages/torch/autograd/init.py", line 89, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: backward_input can only be called in training mode

No module named 'DeepTTE'

Hello All!

I am a complete beginner and a Windows user. When I try to run the code, either by running the main file directly or with
python main.py --task train --batch_size 10 --result_file ./result/deeptte.res --pooling_method attention --kernel_size 3 --alpha 0.1 --log_file run_log
as provided in the README.md, I get this error: "No module named 'DeepTTE'".

Can someone please comment on what I might be doing wrong?
