Downloaded NFLX data from 2014 - 2021. Trained the model using default parameters. And

Trained model on NFLX stock data from 2014-2021. Predictions are scaled wrongly. (time-series-autoencoder issue, closed)

julesbelveze commented on May 30, 2024

Trained model on NFLX stock data from 2014-2021. Predictions are scaled wrongly.

from time-series-autoencoder.

Comments (10)

Rajmehta123 commented on May 30, 2024

Sure. I am attaching the link to the dataset.

https://drive.google.com/file/d/1gyEvwkyUKzWDUHOxCQcZXivoOv0sOmtz/view?usp=sharing

Below are the configs.

parser = argparse.ArgumentParser()
parser.add_argument("--batch-size", default=config["batch_size"], type=int, help="batch size")
parser.add_argument("--output-size", default=config["output_size"], type=int,
                    help="size of the output: default value to 1 for forecasting")
parser.add_argument("--label-col", default=config["label_col"], type=str, help="name of the target column")
parser.add_argument("--input-att", default=config["input_att"], type=lambda x: (str(x).lower() == "true"),
                    help="whether or not activate the input attention mechanism")
parser.add_argument("--temporal-att", default=config["temporal_att"], type=lambda x: (str(x).lower() == "true"),
                    help="whether or not activate the temporal attention mechanism")
parser.add_argument("--seq-len", default=config["seq_len"], type=int, help="window length to use for forecasting")
parser.add_argument("--hidden-size-encoder", default=config["hidden_size_encoder"], type=int,
                    help="size of the encoder's hidden states")
parser.add_argument("--hidden-size-decoder", default=config["hidden_size_decoder"], type=int,
                    help="size of the decoder's hidden states")
parser.add_argument("--reg-factor1", default=config["reg_factor1"], type=float,
                    help="contribution factor of the L1 regularization if using a sparse autoencoder")
parser.add_argument("--reg-factor2", default=config["reg_factor2"], type=float,
                    help="contribution factor of the L2 regularization if using a sparse autoencoder")
parser.add_argument("--reg1", default=config["reg1"], type=lambda x: (str(x).lower() == "true"),
                    help="activate/deactivate L1 regularization")
parser.add_argument("--reg2", default=config["reg2"], type=lambda x: (str(x).lower() == "true"),
                    help="activate/deactivate L2 regularization")
parser.add_argument("--denoising", default=config["denoising"], type=lambda x: (str(x).lower() == "true"),
                    help="whether or not to use a denoising autoencoder")
parser.add_argument("--do-train", default=True, type=lambda x: (str(x).lower() == "true"),
                    help="whether or not to train the model")
parser.add_argument("--do-eval", default=True, type=lambda x: (str(x).lower() == "true"),
                    help="whether or not to evaluate the model")
parser.add_argument("--data-path", default='nflx.csv', help="path to data file")
parser.add_argument("--output-dir", default=config["output_dir"], help="name of folder to output files")
parser.add_argument("--ckpt", default=None, help="checkpoint path for evaluation")


df = pd.read_csv(config["data_path"])
df = df.set_index('Date_Time')

if not os.path.exists(config["output_dir"]):
    os.makedirs(config["output_dir"])

ts = TimeSeriesDataset(
    data=df,
    categorical_cols=config["categorical_cols"],
    target_col=config["label_col"],
    seq_length=config["seq_len"],
    prediction_window=config["prediction_window"]
)
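
A note on the boolean flags above: they are parsed with a small lambda rather than `type=bool`, presumably because `bool("false")` is `True` in Python. Below is a minimal, self-contained sketch of that pattern. The flag names are copied from the snippet, but the default values and the `str_to_bool` / `build_parser` helpers are placeholders for illustration, not the repository's code.

```python
import argparse


def str_to_bool(value):
    """Return True only for the string "true" (case-insensitive)."""
    return str(value).lower() == "true"


def build_parser():
    parser = argparse.ArgumentParser()
    # Defaults here are placeholders for illustration only.
    parser.add_argument("--reg1", default=True, type=str_to_bool,
                        help="activate/deactivate L1 regularization")
    parser.add_argument("--denoising", default=False, type=str_to_bool,
                        help="whether or not to use a denoising autoencoder")
    parser.add_argument("--seq-len", default=10, type=int,
                        help="window length to use for forecasting")
    return parser


# Simulate: python main.py --reg1 False --seq-len 20
args = build_parser().parse_args(["--reg1", "False", "--seq-len", "20"])
print(args.reg1, args.denoising, args.seq_len)
```

With this parser, any value other than "true" (for example "yes" or "1") is read as `False`, so the flags need to be spelled out as `true` or `false` on the command line.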

For the config file:

device=torch.device('cuda' if torch.cuda.is_available() else 'cpu'),

categorical_cols=["day"],  # name of columns containing categorical variables
label_col=["Close"],  # name of target column
index_col="Date",

output_size=1,  # for forecasting

num_epochs=100,
batch_size=16,
lr=1e-5,
reg1=True,
reg2=False,
reg_factor1=1e-4,
reg_factor2=1e-4,
seq_len=10,  # previous timestamps to use
prediction_window=1,  # number of timestamps to forecast
hidden_size_encoder=128,
hidden_size_decoder=128,
input_att=True,
temporal_att=True,
denoising=False,
directions=1,

max_grad_norm=0.1,
gradient_accumulation_steps=1,
logging_steps=100,
lrs_step_size=5000,

output_dir="output",
save_steps=5000,
eval_during_training=True
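
To make `seq_len=10` and `prediction_window=1` concrete, here is a rough sketch of how a series can be cut into (input window, target) pairs. `make_windows` is a hypothetical helper written for illustration and the prices are made up; it is not the `TimeSeriesDataset` implementation, which may differ in its details.

```python
def make_windows(values, seq_len, prediction_window):
    """Split a sequence into (past window, future targets) pairs."""
    samples = []
    # Last index at which a full window plus its targets still fits.
    last_start = len(values) - seq_len - prediction_window
    for start in range(last_start + 1):
        past = values[start:start + seq_len]
        future = values[start + seq_len:start + seq_len + prediction_window]
        samples.append((past, future))
    return samples


closes = [float(i) for i in range(100, 115)]  # 15 made-up closing prices
samples = make_windows(closes, seq_len=10, prediction_window=1)
print(len(samples))   # number of (window, target) pairs
print(samples[0])     # first window and the value that follows it
```

Each sample uses the 10 previous timestamps to predict the single next one, so a series of N rows yields N - 10 samples under these settings.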

Rajmehta123 commented on May 30, 2024

With the correct hyperparam, I could get a reasonable-performing model. All good now.

JulesBelveze commented on May 30, 2024

Hey @Rajmehta123 thanks for pointing that out. There's indeed something weird going on with the scaling.
I'm on holidays atm but will check that out as soon as I get back 🙂

PS: if you find out what is actually wrong, please let me know. And if you have time to open a PR that fixes it, I would really appreciate it.

Rajmehta123 commented on May 30, 2024

The dataset loader uses a ColumnTransformer in a pipeline, and there is no way to inverse_transform it back. I also removed the normalization just to see how the predictions look, but in vain: even with no normalization, the predictions are in the range of $5-$20.

JulesBelveze commented on May 30, 2024

Sounds like a bug yeah. I'll check that out.
Do you have a link where I can get the dataset you're using?

JulesBelveze commented on May 30, 2024

Also @Rajmehta123, could you please share the configuration you're using?

JulesBelveze commented on May 30, 2024

Hey @Rajmehta123 there's a way to invert the scaler by directly accessing it:

self.preprocessor.named_transformers_["scaler"].inverse_transform(data)

I'm working on a "clean" way to convert the normalized values to the actual ones 😃
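
For readers who want to try the workaround in isolation, here is a minimal sketch. It assumes the preprocessor is a scikit-learn `ColumnTransformer` whose numeric step is registered under the name "scaler", as the line above suggests. The use of `StandardScaler`, the "onehot" step, and the data are assumptions made for illustration; the repository may be set up differently.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Made-up data: column 0 plays the role of "Close", column 1 of "day".
X = np.array([
    [480.0, 0],
    [490.0, 1],
    [500.0, 2],
    [510.0, 3],
    [520.0, 4],
])

# Assumption: the numeric step is registered under the name "scaler".
preprocessor = ColumnTransformer([
    ("scaler", StandardScaler(), [0]),
    ("onehot", OneHotEncoder(), [1]),
])
preprocessor.fit(X)

# The fitted sub-transformers are reachable through named_transformers_,
# so the scaler can be inverted on its own.
scaler = preprocessor.named_transformers_["scaler"]

scaled_predictions = np.array([[-1.0], [0.0], [1.5]])  # model output in scaled units
prices = scaler.inverse_transform(scaled_predictions)
print(prices.ravel())
```

If the scaler was fitted on several numeric columns, the array passed to `inverse_transform` needs the same number of columns, with the predictions placed in the target column's position.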

JulesBelveze commented on May 30, 2024

@Rajmehta123 I've made the changes and managed to get it working properly on my local machine with your configuration 😃
If you could check that it works properly for you before I merge the PR that would be awesome: #19

JulesBelveze commented on May 30, 2024

@Rajmehta123 feel free to reopen it if it still doesn't work for you.

Rajmehta123 commented on May 30, 2024

Hey Jules. I just tried the PR, but I am still not able to produce the predictions. Can you share the config you used on that NFLX dataset?

********* EVAL REPORT ********
MSE = 194079.140625
residuals = -430.42496
loss = tensor(8.7637)
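
One way to read this report is to split the MSE into squared bias plus residual variance. This assumes `residuals` is the mean of the prediction errors over the evaluation set, which the report does not state explicitly. The figures can then be checked by hand:

```python
import math

mse = 194079.140625         # from the eval report
mean_residual = -430.42496  # from the eval report, assumed to be a mean

rmse = math.sqrt(mse)
bias_sq = mean_residual ** 2
variance = mse - bias_sq    # MSE = bias^2 + variance of the residuals
bias_share = bias_sq / mse

print(f"RMSE            = {rmse:.1f}")
print(f"residual std    = {math.sqrt(variance):.1f}")
print(f"share from bias = {bias_share:.1%}")
```

Under that assumption, roughly 95% of the squared error comes from a near-constant offset of about 430, which looks more like predictions and targets living on different scales than like a noisy but correctly scaled forecast.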
