
n-hits's People

Contributors

azulgarza, cchallu, kdgutier



n-hits's Issues

Clarification regarding implementation

Hi! I have some queries about the following line of code - it seems to make your implemented model somewhat different from what you specified in your paper:

forecast = insample_y[:, -1:] # Level with Naive1

My understanding is that this line of code specifies that the final forecast also includes the first value of the lookback window, meaning you are predicting the "change in value" rather than the actual time series value. Is there any reason for doing this? Thank you for your time!
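My reading of that line, as a minimal sketch (placeholder names, not the repo's actual loop): the value taken from the lookback window acts as a naive level, and each block's output is added on top of it, so the network effectively learns the offset from that level.

import torch

def nbeats_style_forward(insample_y: torch.Tensor, blocks) -> torch.Tensor:
    # Sketch: start from a naive level, then add each block's forecast on top.
    forecast = insample_y[:, -1:]                     # Naive1 level, shape (batch, 1)
    residual = insample_y
    for block in blocks:                              # each block returns (backcast, forecast)
        backcast, block_forecast = block(residual)
        residual = residual - backcast                # residual stacking on the input side
        forecast = forecast + block_forecast          # accumulate on top of the level
    return forecast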

About your data processing + final result

In your code, it seems you normalized the original data, but when you calculated the MSE and MAE you didn't transform them back to the original scale. Are the MSE and MAE wrong in that case?
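A small runnable sketch of what I mean (the train statistics and series below are placeholders, not the repo's data): with z-score normalization, MAE scales by the train std and MSE by its square, so metrics on the normalized scale differ from metrics on the original scale.

import numpy as np

rng = np.random.default_rng(0)
train_mean, train_std = 10.0, 2.0                        # placeholder train statistics
y_true_norm = rng.normal(size=96)                        # placeholder normalized targets
y_pred_norm = y_true_norm + rng.normal(scale=0.1, size=96)

def mse_mae(y_true, y_pred):
    err = y_true - y_pred
    return float(np.mean(err ** 2)), float(np.mean(np.abs(err)))

# Metrics on the normalized scale.
print(mse_mae(y_true_norm, y_pred_norm))

# Metrics on the original scale: invert the z-score before computing them.
y_true = y_true_norm * train_std + train_mean
y_pred = y_pred_norm * train_std + train_mean
print(mse_mae(y_true, y_pred))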

different settings (nhits vs. autoformer)

Hi! Thank you for sharing your source code.

I have some questions about the settings of NHITS and Autoformer.

I think there might be some unfair comparisons in your Table 2, because you compare against Autoformer's reported results while using different settings for the NHITS model.

Q1: the length of the history window
You use 5*args.horizon for NHITS, but for Autoformer a shorter length is used (1*args.horizon). Here args.horizon = 96.

When using a history length of 5*96, your reported result for ECL-96 is 0.147 (I can reproduce this by re-running your released code). Autoformer's reported result is 0.201 (using only a 96-length window).

I tried some experiments and got the following results:

Using the same setting for NHITS (a 96-length window), the result for ECL-96 is MSE: 0.1902 / MAE: 0.2739.

It seems the length of the history window is an important hyperparameter.

By the way, using a 5*96-length window for the NBeats model, I get a much better result for ECL-96: MSE: 0.1340 / MAE: 0.2311.

Q2: the split of the train/val/test sets
you use masks (train_mask_df, valid_mask_df, test_mask_df) to indicate the parts of train/valid/test.
However, in Autoformer's setting (see https://github.com/thuml/Autoformer), the borders are:

border1s = [0, num_train - self.seq_len, len(df_raw) - num_test - self.seq_len]
border2s = [num_train, num_train + num_vali, len(df_raw)]

Here, it seems you did not use the overlapping part, i.e. [num_train - self.seq_len, num_train + num_vali].

So my question is whether the same number of test samples is used for evaluation.
If not, I think it might be unfair to directly compare against Autoformer's results in your Table 2.
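To make the question concrete, here is a rough sketch of how I would count test windows under the two schemes (the total length and split ratio below are assumptions for illustration, not values taken from either repo):

def n_windows(span_len: int, seq_len: int, pred_len: int) -> int:
    # Number of (input, target) window pairs that fit in a contiguous span.
    return max(span_len - seq_len - pred_len + 1, 0)

n_total = 26304                      # assumed series length
num_test = int(n_total * 0.2)
seq_len = pred_len = 96

# Autoformer-style border: the test span starts seq_len points early, so the
# first forecast window begins right at the start of the last num_test points.
print(n_windows(num_test + seq_len, seq_len, pred_len))

# A mask covering only the last num_test points (no overlap with validation)
# yields seq_len fewer test windows.
print(n_windows(num_test, seq_len, pred_len))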

Follow up to "change-in-level" forecast

Hi @cchallu, this is a follow-up to issue #8. I was also wondering whether it would be better to use the last value of the lookback window instead of the first value, mainly because the first value is sometimes masked?

Clarification regarding data normalization

Hello,

I was trying to run N-HiTS with my own data using the shared Colab.

I tried to normalize the original ETTm2 dataset and compared it with the data used in your N-HiTS model.

The size of df_train is 46641, and I followed the information given in Section 4.1: each set is normalized with the train data mean and standard deviation.

def normalize(df_csv, df_train):
    result = df_csv.copy()
    columns_names = list(df_csv.columns)
    for feature_name in columns_names[1:]:
        result[feature_name] = (df_csv[feature_name] - df_train[feature_name].mean()) / df_train[feature_name].std()
    return result

My function returns different results compared to yours:
date HUFL
2016-07-01 00:00:00 0.126520
2016-07-01 00:15:00 -0.023339
2016-07-01 00:30:00 -0.098268
2016-07-01 00:45:00 -0.431177
2016-07-01 01:00:00 -0.231432
Name: HUFL, dtype: float64

and yours:
unique_id | ds | y
HUFL | 2016-07-01 00:00:00 | -0.041413
HUFL | 2016-07-01 00:15:00 | -0.185467
HUFL | 2016-07-01 00:30:00 | -0.257495
HUFL | 2016-07-01 00:45:00 | -0.577510
HUFL | 2016-07-01 01:00:00 | -0.385501

Can you please tell me more about the data normalization process?
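For comparison, this is how I would normalize in the long ['unique_id', 'ds', 'y'] format the library uses; the train cutoff argument and per-series grouping here are my assumptions, not the repo's code:

import pandas as pd

def normalize_long(Y_df: pd.DataFrame, train_end: str) -> pd.DataFrame:
    # Z-score each series using only statistics computed on the training span.
    train = Y_df[Y_df['ds'] <= train_end]
    stats = train.groupby('unique_id')['y'].agg(['mean', 'std']).reset_index()
    out = Y_df.merge(stats, on='unique_id', how='left')
    out['y'] = (out['y'] - out['mean']) / out['std']
    return out.drop(columns=['mean', 'std'])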

Thanks and regards,

Sophie

Reproducing Results

Hello,

I downloaded the repository to my computer and tried to reproduce the results published in the paper for the traffic dataset with a prediction window of length 96. I ran the code with the following args:

--hyperopt_max_evals 10 --experiment_id run_1

But the results were 0.504 for the MSE and 0.311 for the MAE, which is significantly worse than I expected. Is there anything else that needs to be done before running the code and training the model in order to reproduce the results?

Thanks in advance!

I can't see the model's details in the code

I want to see the model's details in the code, but I found that PyTorch Lightning can't be debugged in PyCharm; it just runs. How can I see how the training data flows through the model? It would help me understand the model better. Thank you.
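A minimal sketch of one way to inspect the data flow without stepping through the Trainer, assuming a standard LightningModule (the model and datamodule names are placeholders, not repo objects):

import torch

def add_shape_hooks(model: torch.nn.Module):
    # Print every submodule's output shape during a forward pass.
    handles = []
    for name, module in model.named_modules():
        def hook(mod, inputs, output, name=name):
            if isinstance(output, torch.Tensor):
                print(f"{name}: {tuple(output.shape)}")
        handles.append(module.register_forward_hook(hook))
    return handles  # call handle.remove() on each when done

# Alternatively, run a single batch outside the Trainer so breakpoints work:
#   batch = next(iter(datamodule.train_dataloader()))
#   model.training_step(batch, batch_idx=0)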


Question on n_time_in

Hi,

Thank you for publishing your code, and thanks for your interesting paper. I am now trying to use your code, but I am not sure whether I need to update the hyperopt space for n_time_in.

The current setting in nhits_multivariate is 'n_time_in': hp.choice('n_time_in', [5*args.horizon]), which results in 960 inputs for a horizon length of 192. I was wondering whether this is the value used in your experiments, or should I change it to 96?

Thanks

How to make multivariate forecasting?

Hi, I read the code in n_hits_multivariate.py but got confused about the way the datasets are loaded. In tsdataset.py, the DataFrame Y_df is described as 'Target time series with columns ['unique_id', 'ds', 'y']'. Taking the ETT dataset as an example, there is one column 'date' and 7 other columns, one per variable. I currently view the 'unique_id' part as the default index of the pandas DataFrame, so what are 'ds' and 'y'? What's more, it seems that N-HiTS works in a univariate way in the following line:

forecast = insample_y[:, -1:] # Level with Naive1

That confuses me: how does N-HiTS make multivariate predictions? By individually yielding a prediction for each univariate series?
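My current understanding of the loading, written as a sketch (the file name and column names come from the ETT data; the actual loader code may differ):

import pandas as pd

# Wide ETT frame: one 'date' column plus 7 variable columns (HUFL, ..., OT).
df = pd.read_csv('ETTm2.csv')

# Melt into the long format ['unique_id', 'ds', 'y'], so each variable becomes
# its own univariate series identified by 'unique_id'.
Y_df = df.melt(id_vars=['date'], var_name='unique_id', value_name='y')
Y_df = Y_df.rename(columns={'date': 'ds'})[['unique_id', 'ds', 'y']]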

About training procedures and doc

Update: Additional questions

  • Your data pipeline seems quite non-traditional to me. At each training step, you randomly sample 256 windows from one time series as model input, and a training epoch finishes after each series has been sampled once. I understand that it's a univariate model, but I don't see why you leave it to chance to cover the entire training span.

  • I tried an ablation feeding the data in a multivariate fashion, i.e. the input is a history of all variables and windows are rolled along the time dimension, learning (N, S) -> (N, T) where N == num_series. The result was bad on the traffic dataset. Could you help explain?

  • The paper says that the learning rate is halved three times across the training procedure. However, the pl_module is mis-configured: the default lr_scheduler interval is 'epoch' (ref. https://pytorch-lightning.readthedocs.io/en/latest/common/lightning_module.html#configure-optimizers), which means that you actually kept training with the initial learning rate until the end (see the sketch after this list).

  • You chose 1000 training steps, which is conservative considering your data feeding; for example, each time series is covered at most twice on the traffic dataset. Training for more steps slightly improved over your reported results, at least on the traffic dataset.

I hope these could help improve your model (of course the metrics presented are already impressive enough :).
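Regarding the scheduler point above, here is a sketch of the fix I have in mind (my assumption of the intended behaviour, not the repo's code): stepping the scheduler per optimizer step instead of per epoch.

import torch

def configure_optimizers(self):
    optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)   # placeholder lr
    # Halve the lr three times across ~1000 training steps.
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=250, gamma=0.5)
    return {
        'optimizer': optimizer,
        # 'interval': 'step' makes Lightning call scheduler.step() every training
        # step; the default 'epoch' only steps it at the end of each epoch.
        'lr_scheduler': {'scheduler': scheduler, 'interval': 'step'},
    }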

===============================================
Thank you for this amazing work. I found these typos and doc issues:

"""
N-HiTS model.
Parameters
----------
n_time_in: int
Multiplier to get insample size.
Insample size = n_time_in * output_size

  1. While documented as a multiplier, n_time_in is actually the final lookback period.

for i in range(len(stack_types)):
    #print(f'| -- Stack {stack_types[i]} (#{i})')
    for block_id in range(n_blocks[i]):

  2. n_layers in nhits_multivariate.py should be [ 3*[2] ] rather than 9, since the elements are indexed across the 3 stacks.

  3. loss_hypar should be an int like 7 or 24, judging from its context.

  4. There is bypassed logic for exogenous variables in the NHITS model. I wonder whether it can be put to work now?

Is backcast interpolated?

def forward(self, theta: t.Tensor, insample_x_t: t.Tensor, outsample_x_t: t.Tensor) -> Tuple[t.Tensor, t.Tensor]:
    backcast = theta[:, :self.backcast_size]
    knots = theta[:, self.backcast_size:]
    if self.interpolation_mode=='nearest':
        knots = knots[:,None,:]
        forecast = F.interpolate(knots, size=self.forecast_size, mode=self.interpolation_mode)
        forecast = forecast[:,0,:]
    elif self.interpolation_mode=='linear':
        knots = knots[:,None,:]
        forecast = F.interpolate(knots, size=self.forecast_size, mode=self.interpolation_mode) #, align_corners=True)
        forecast = forecast[:,0,:]
    elif 'cubic' in self.interpolation_mode:

n_theta = (n_time_in + max(n_time_out//n_freq_downsample[i], 1))
basis = IdentityBasis(backcast_size=n_time_in,
                      forecast_size=n_time_out,
                      interpolation_mode=interpolation_mode)

output_layer = [nn.Linear(in_features=n_theta_hidden[-1], out_features=n_theta)]
layers = hidden_layers + output_layer

According to these code blocks, it seems that interpolation is used only for synthesizing the forecast, while the backcast is produced directly by the MLP (the first n_time_in entries of theta). But Eq. 3 in Section 3.3 of your paper states that the forecast and backcast are interpolated in a similar way. Is there any reason behind this discrepancy?
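For reference, this is what I would have expected from my reading of Eq. 3, i.e. a backcast that is also built from its own set of knots and interpolated; the knot split below is purely my assumption and mirrors the quoted forward method, it is not code from the repo:

# Hypothetical symmetric variant (my assumption): split theta into backcast and
# forecast knots and interpolate both, instead of taking the backcast directly.
backcast_knots = theta[:, None, :n_backcast_knots]   # n_backcast_knots is hypothetical
forecast_knots = theta[:, None, n_backcast_knots:]
backcast = F.interpolate(backcast_knots, size=self.backcast_size,
                         mode=self.interpolation_mode)[:, 0, :]
forecast = F.interpolate(forecast_knots, size=self.forecast_size,
                         mode=self.interpolation_mode)[:, 0, :]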

Thank you for your time!
