n-hits's People
Forkers
valeman statmixedml jingmouren weiforreference jchlapowski pugangqiang taenikim koszpe dhockaday outfox330 wxhxw vishalbelsare butterm-40 qilinwang sdoof xiaofeng12374 barryallen12 mivanovitch osmium18452 silvaac myheathcliff cyalcher cleanerleon milton-miranda onzo191 dan-seol
n-hits's Issues
what is the file 'hyperopt' ?
Clarification regarding implementation
Hi! I have some questions about the following line of code, which seems to make your implemented model somewhat different from what you specified in your paper:
n-hits/src/models/nhits/nhits.py
Line 330 in 7b12bd3
My understanding is that this line of code specifies that the final forecast also includes the first value of the lookback window, meaning you are predicting the "change in value" rather than the actual time series value. Is there any reason for doing this? Thank you for your time!
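The "change in value" reading described above can be sketched as follows. This is a minimal illustration, not the repository's code: the function name, the `anchor_index` parameter, and the block outputs are hypothetical, and which lookback value serves as the level is left as a parameter because the referenced line is not reproduced here.

```python
from typing import List, Sequence


def level_plus_changes(lookback: Sequence[float],
                       block_forecasts: List[Sequence[float]],
                       anchor_index: int = 0) -> List[float]:
    """Start from one lookback value (the "level") and add every block's output.

    anchor_index is an assumption of this sketch: 0 picks the first lookback
    value, -1 the most recent one.
    """
    horizon = len(block_forecasts[0])
    level = lookback[anchor_index]
    forecast = [level] * horizon
    for block in block_forecasts:
        if len(block) != horizon:
            raise ValueError("all block forecasts must share the same horizon")
        # Each block contributes a change relative to the running forecast.
        forecast = [f + b for f, b in zip(forecast, block)]
    return forecast


lookback = [10.0, 11.0, 12.0]          # made-up lookback window
blocks = [[0.5, 1.0], [-0.25, 0.25]]   # made-up per-block outputs
print(level_plus_changes(lookback, blocks))
```

Under this reading the blocks only have to model the deviation from the level, which is why the output looks like a "change in value" forecast.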
About your data processing + final result
In your code, it seems you normalized the original data. But when you calculated the MSE and MAE, you didn't transform the values back to the original scale? Are the MSE and MAE wrong in that case?
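For a single series normalized as y' = (y - mu) / sigma, errors on the two scales differ by fixed factors: MAE scales by sigma and MSE by sigma squared. The sketch below checks this numerically. The data, `mu`, and `sigma` are made-up values, not taken from the repository.

```python
import math


def mse(y, y_hat):
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)


def mae(y, y_hat):
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


mu, sigma = 5.0, 2.0                 # hypothetical train mean and std
y_true = [3.0, 5.0, 9.0, 7.0]        # made-up targets on the original scale
y_pred = [4.0, 4.0, 8.0, 9.0]        # made-up forecasts on the original scale


def normalize(values):
    return [(v - mu) / sigma for v in values]


mse_raw, mae_raw = mse(y_true, y_pred), mae(y_true, y_pred)
mse_norm = mse(normalize(y_true), normalize(y_pred))
mae_norm = mae(normalize(y_true), normalize(y_pred))

# Rescaling the normalized metrics recovers the original-scale metrics.
print(math.isclose(mse_norm * sigma ** 2, mse_raw))
print(math.isclose(mae_norm * sigma, mae_raw))
```

Metrics computed on normalized data are therefore only comparable with results computed on the same normalized scale. When several series with different sigma values are averaged, no single rescaling factor applies.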
different settings (nhits vs. autoformer)
Hi! Thank you for sharing your source code.
I have some questions about the settings of NHITS and Autoformer.
I think there might be some unfair comparisons in your Tab 2 because you compared the Autoformer's reported results but used different settings in the NHITS model.
Q1: the length of the history window
You use 5*args.horizon for NHITS. But for Autoformer, you use a shorter length (say, 1*args.horizon). Here args.horizon=96.
When using a history length of 5*96, your reported result of ECL-96 is 0.147 (I can reproduce this by re-running your released code). The Autoformer's reported result is 0.201 (using only a 96-length window).
I tried some experiments and got the following results:
Using the same setting for NHITS (96-length window), the result of ECL-96 is MSE: 0.1902 / MAE: 0.2739
It seems the length of the history window is an important hyperparameter.
By the way, using a 5*96-length window for the NBeats model, I get a much better result on ECL-96: MSE: 0.1340 / MAE: 0.2311
Q2: the split of train/val/test set
you use masks (train_mask_df, valid_mask_df, test_mask_df) to indicate the parts of train/valid/test.
However, in autoformer's setting (see https://github.com/thuml/Autoformer) the borders are
border1s = [0, num_train - self.seq_len, len(df_raw) - num_test - self.seq_len]
border2s = [num_train, num_train + num_vali, len(df_raw)]
Here, it seems you did not use the overlap part like [num_train - self.seq_len, num_train + num_vali]
So my question here is whether the same number of test samples are used for evaluation.
If not, I think it might be unfair to directly compare Autoformer's results in your Tab 2.
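The effect of the overlapping borders on the number of evaluation windows can be sketched as below. This assumes every sample needs `seq_len` input rows followed by `pred_len` target rows inside its split. The function names and the row counts are hypothetical, and whether the mask-based split in this repository matches either count is not established here.

```python
def split_borders(n_rows, num_train, num_vali, num_test, seq_len, overlap=True):
    """Return (start, end) row ranges for train, val, and test.

    With overlap=True the val and test ranges start seq_len rows early,
    as in the border1s/border2s lists quoted above.
    """
    shift = seq_len if overlap else 0
    border1s = [0, num_train - shift, n_rows - num_test - shift]
    border2s = [num_train, num_train + num_vali, n_rows]
    return list(zip(border1s, border2s))


def count_windows(start, end, seq_len, pred_len):
    """Number of sliding windows that fit entirely inside [start, end)."""
    return max(end - start - seq_len - pred_len + 1, 0)


# Made-up sizes, chosen only to make the arithmetic easy to follow.
n_rows, num_train, num_vali, num_test = 1000, 700, 100, 200
seq_len, pred_len = 96, 96

for overlap in (True, False):
    _, _, (start, end) = split_borders(
        n_rows, num_train, num_vali, num_test, seq_len, overlap)
    print(overlap, (start, end), count_windows(start, end, seq_len, pred_len))
```

With these made-up sizes the overlapping split yields 105 test windows and the non-overlapping one only 9, so the two conventions can evaluate on very different sample counts.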
Clarification regarding data pre-processing
Hello,
I was trying to run N-HiTS.
Can you please let me know the motivation behind this data preprocessing step:
https://github.com/cchallu/n-hits/blob/main/src/data/datasets/ett.py#L45
Could you also provide some insight on the usage of the above mentioned code in the program.
Thank you
Follow up to "change-in-level" forecast
Clarification regarding data normalization
Hello,
I was trying to run N-HiTS with my own data using the shared colab.
I tried to normalize the original ETTm2 dataset and compared it with the data used in your N-HiTS model.
The size of df_train is 46641, and I followed the information given in section 4.1: Each set is normalized with the train data mean and standard deviation.
def normalize(df_csv, df_train):
    result = df_csv.copy()
    columns_names = list(df_csv.columns)
    for feature_name in columns_names[1:]:
        result[feature_name] = (df_csv[feature_name] - df_train[feature_name].mean()) / df_train[feature_name].std()
    return result
My function returns different results compared to yours:
date HUFL
2016-07-01 00:00:00 0.126520
2016-07-01 00:15:00 -0.023339
2016-07-01 00:30:00 -0.098268
2016-07-01 00:45:00 -0.431177
2016-07-01 01:00:00 -0.231432
Name: HUFL, dtype: float64
and yours:
unique_id | ds | y
HUFL | 2016-07-01 00:00:00 | -0.041413
HUFL | 2016-07-01 00:15:00 | -0.185467
HUFL | 2016-07-01 00:30:00 | -0.257495
HUFL | 2016-07-01 00:45:00 | -0.577510
HUFL | 2016-07-01 01:00:00 | -0.385501
Can you please tell me more about the data normalization process?
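One way to narrow down a mismatch like this is to recover the mean and standard deviation implied by each normalized series and compare them with the statistics of the rows assumed to form the training split. The sketch below solves z = (x - mean) / std from two points. The function name and the sample values are invented for illustration and are not taken from the dataset.

```python
def implied_mean_std(raw, normalized):
    """Recover (mean, std) such that normalized = (raw - mean) / std.

    Uses the first two points, which must have different normalized values.
    """
    x1, x2 = raw[0], raw[1]
    z1, z2 = normalized[0], normalized[1]
    if z1 == z2:
        raise ValueError("need two points with different normalized values")
    std = (x1 - x2) / (z1 - z2)
    mean = x1 - std * z1
    return mean, std


raw = [10.0, 14.0, 8.0]          # made-up raw values
normalized = [-0.5, 0.5, -1.0]   # the same values scaled with mean 12, std 4
print(implied_mean_std(raw, normalized))
```

If the two normalized series imply different statistics, the difference most likely comes from which rows were treated as training data or from how the statistics were computed.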
Thanks and regards,
Sophie
Reproducing Results
Hello,
I downloaded the repository to my computer and tried to reproduce the results that were published in the paper for the traffic dataset with a prediction window of length 96. I ran the code with the following args:
--hyperopt_max_evals
10
--experiment_id
run_1
But the results were 0.504 for the MSE and 0.311 for the MAE which is significantly worse than what I was expecting to achieve. Is there anything else that needs to be done before running the code and training the model in order to reproduce the results?
Thanks in advance!
I can't see the model's detail in the code
I want to see the model's details in the code, but I found that with PyTorch Lightning I can't step through it in the PyCharm debugger; the code just runs. How can I see how the training data flows through the model? That would help me understand the model better. Thank you.
Question on n_time_in
Hi,
Thank you for publishing your code and also thanks for your interesting paper. I am now trying to use your code, but I am not sure if I need to update the hyperopt space for n_time_in.
The current setting in nhits_multivariate is 'n_time_in': hp.choice('n_time_in', [5*args.horizon]), which results in 960 inputs for a horizon length of 192. I was wondering whether this is the one used in your experiments, or should I change it to 96?
Thanks
How to make multivariate forecasting?
Hi, I read the code in n_hits_multivariate.py, but got confused by the way the datasets are loaded. In tsdataset.py, the DataFrame Y_df is defined as 'Target time series with columns ['unique_id', 'ds', 'y']'. Taking the ETT dataset as an example, there is one 'date' column and 7 other columns holding the variables of different nodes. Currently I view the 'unique_id' part as the default index of pandas.DataFrame, so what are 'ds' and 'y'? What's more, it seems that N-Hits works in a univariate way in the following line:
n-hits/src/models/nhits/nhits.py
Line 330 in d882ee6
That confuses me: how does N-Hits make multivariate predictions? By individually producing predictions for each univariate variable?
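The normalized sample quoted in another issue on this page lists rows such as `HUFL | 2016-07-01 00:00:00 | ...` under `unique_id | ds | y`, which suggests that `unique_id` names the variable, `ds` is the timestamp, and `y` is the value. Assuming that reading, a wide table with one column per variable maps to the long layout as sketched below. The function name and the numeric values are hypothetical.

```python
from typing import Dict, List


def wide_to_long(rows: List[Dict], time_col: str = "date") -> List[Dict]:
    """Turn one-column-per-series rows into unique_id/ds/y records."""
    series_names = [col for col in rows[0] if col != time_col]
    long_rows = []
    for name in series_names:          # one block of records per series
        for row in rows:
            long_rows.append(
                {"unique_id": name, "ds": row[time_col], "y": row[name]})
    return long_rows


wide = [  # made-up values in an ETT-like wide layout
    {"date": "2016-07-01 00:00:00", "HUFL": 1.0, "HULL": 2.0},
    {"date": "2016-07-01 00:15:00", "HUFL": 1.5, "HULL": 2.5},
]
long_format = wide_to_long(wide)
for record in long_format:
    print(record)
```

In this layout each variable is its own univariate series, which would be consistent with a model that forecasts every series independently.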
About training procedures and doc
Update: Additional questions
- Your data pipeline seems quite non-traditional for me. At each training step, you randomly sample 256 windows from one time series as model input. A training epoch is finished by sampling each series once. I understand that it's a univariate model, but I don't see why you leave it to probability to cover the entire training span.
- I tried an ablation by feeding the data in multivariate fashion, i.e. input a history of all variables, roll windows along time dimension, learning (N, S) -> (N, T) where N == num_series. The result was bad on traffic dataset. Could you help explain?
- The paper says that you have lr "halved three times across the training procedure". However, you mis-configured your pl_module. The default lr_schedule interval is epoch (ref. https://pytorch-lightning.readthedocs.io/en/latest/common/lightning_module.html#configure-optimizers), which means that you actually kept training with initial lr till the end.
- You chose a training step of 1000, which is conservative considering your data feeding. For example, each ts is covered at most twice using traffic dataset. Training more steps slightly improved over your reported results on traffic dataset (at least).
I hope these could help improve your model (Of course the metric presented is already impressive enough :).
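The learning-rate point above can be illustrated with a small pure-Python simulation. It does not call PyTorch Lightning. It only mimics a scheduler that halves the rate every `decay_every` ticks, where a tick happens either after every step or after every epoch. The values of `steps_per_epoch` and `decay_every` are assumptions chosen for illustration, not the repository's settings.

```python
def simulate_lr(initial_lr, total_steps, steps_per_epoch, decay_every, interval):
    """Return the learning rate used at each training step."""
    if interval not in ("step", "epoch"):
        raise ValueError("interval must be 'step' or 'epoch'")
    lr, ticks, history = initial_lr, 0, []
    for step in range(1, total_steps + 1):
        history.append(lr)
        end_of_epoch = step % steps_per_epoch == 0
        # The scheduler advances per step, or only when an epoch ends.
        if interval == "step" or end_of_epoch:
            ticks += 1
            if ticks % decay_every == 0:
                lr *= 0.5
    return history


per_step = simulate_lr(1e-3, 1000, 100, 250, "step")
per_epoch = simulate_lr(1e-3, 1000, 100, 250, "epoch")
print(len(set(per_step)), per_step[-1])    # several distinct rates
print(len(set(per_epoch)), per_epoch[-1])  # the rate never changes
```

With a step-based decay period and an epoch-based interval, the scheduler never reaches its first decay within 1000 steps. In PyTorch Lightning, the scheduler dictionary returned by configure_optimizers accepts an "interval" key that can be set to "step".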
===============================================
Thank you for this amazing work. I found these typos and doc issues:
n-hits/src/models/nhits/nhits.py
Lines 398 to 405 in 4e929ed
- while documented as a multiplier, n_time_in is actually the final Lookback period
n-hits/src/models/nhits/nhits.py
Lines 248 to 250 in 4e929ed
- n_layers in nhits_multivariate.py should be [ 3*[2] ] rather than 9 since elements are indexed across 3 stacks
- loss_hypar should be an int like 7 or 24 from its context
- There are bypassed logics for exogenous variables in nhits model. I wonder if they can be put into work now?
Is backcast interpolated?
n-hits/src/models/nhits/nhits.py
Lines 55 to 68 in 4e929ed
n-hits/src/models/nhits/nhits.py
Lines 263 to 266 in 4e929ed
n-hits/src/models/nhits/nhits.py
Lines 156 to 157 in 4e929ed
According to these code blocks, it seems that interpolation is used for synthesizing the forecast only, while the backcast is generated directly through the MLP. But Eq. 3 in Section 3.3 of your paper states that forecast and backcast are interpolated in a similar way. Is there any reason behind this discrepancy?
Thank you for your time!
I am confused about this line: n_theta = (input_size + max(h//n_freq_downsample[i], 1) )
I am confused about this line:
n_theta = (input_size + max(h//n_freq_downsample[i], 1) )
I think n_theta is larger than input_size, but doesn't freq_downsample mean that the output size should be smaller than the input size?
I mean, in order to do the interpolation (the up-sampling)?
Can you help me? Thank you!
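One reading of this formula, based on the observation in the "Is backcast interpolated?" issue on this page, is that theta holds two parts: `input_size` backcast values produced directly, plus `max(h // n_freq_downsample, 1)` forecast knots that are then upsampled to `h` points. The sketch below follows that reading. The helper names, the sizes, and the choice of linear interpolation are assumptions of this example, not the repository's implementation.

```python
def n_theta(input_size, h, n_freq_downsample):
    """Backcast coefficients plus the (downsampled) forecast knots."""
    return input_size + max(h // n_freq_downsample, 1)


def split_theta(theta, input_size):
    """First input_size entries are the backcast, the rest are forecast knots."""
    return theta[:input_size], theta[input_size:]


def linear_upsample(knots, h):
    """Stretch len(knots) values to h points by linear interpolation."""
    if len(knots) == 1 or h == 1:
        return [knots[0]] * h
    out = []
    for t in range(h):
        pos = t * (len(knots) - 1) / (h - 1)   # position on the knot axis
        left = int(pos)
        right = min(left + 1, len(knots) - 1)
        frac = pos - left
        out.append(knots[left] * (1 - frac) + knots[right] * frac)
    return out


input_size, h, downsample = 8, 4, 2                          # made-up sizes
theta = [float(i) for i in range(n_theta(input_size, h, downsample))]
backcast, knots = split_theta(theta, input_size)
forecast = linear_upsample(knots, h)
print(len(theta), len(backcast), len(knots), len(forecast))
```

Under this reading n_theta exceeds input_size because it counts the backcast and the forecast knots together. Only the forecast part shrinks with a larger downsampling factor.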