Comments (14)
Hey @santoshpal3004, it seems this error is raised when a series has 3 or fewer samples, since the theta model can't be trained on that little data. You can specify a fallback model in the constructor so that it is trained instead in cases like this.
from statsforecast.
Hey @jmoralez, the dataframe I am using here has about 30,000 rows. I also tried HoltWinters as the fallback model, but I still get a similar-looking error, this time from "ets.py". Here is the code snippet; it would be really helpful if you could point out any correction that needs to be made.
try:
    models = [HoltWinters(season_length=12, error_type='A'),
              SeasonalNaive(season_length=12),
              HistoricAverage(),
              DOT(season_length=12, decomposition_type='additive'),
              AutoTheta(season_length=12),
              AutoARIMA(season_length=12),
              AutoETS(season_length=12)
              ]
    # instantiate the model
    model = StatsForecast(models=models,
                          freq='M',
                          n_jobs=-1,
                          fallback_model=HoltWinters(season_length=12, error_type='A')
                          )
    # train model, like in sklearn
    model.fit(df=X_train_agg.head(1000))
# this circumvents the error we get with autoarima
# try again without autoarima in the list of models
except ZeroDivisionError:
    models = [HoltWinters(season_length=12, error_type='A'),
              SeasonalNaive(season_length=12),
              HistoricAverage(),
              DOT(season_length=12, decomposition_type='additive'),
              AutoTheta(season_length=12),
              AutoETS(season_length=12)
              ]
    model = StatsForecast(models=models,
                          freq='M',
                          n_jobs=-1,
                          # fallback_model=HoltWinters(season_length=12, error_type='A')
                          )
    model.fit(X_train_agg)
The size in the error refers to the size of a single series (unique_id). So if you run, for example, df['unique_id'].value_counts(), you should see some series with 3 or fewer values, which are the problematic ones. In that case the only viable fallback is the Naive model; you could try that one.
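A minimal sketch of that check, assuming a long-format frame with a `unique_id` column. The toy ids and values below are made up for illustration and are not from the reporter's data:

```python
import pandas as pd

# Toy long-format frame; ids and values are hypothetical.
df = pd.DataFrame({
    "unique_id": ["a"] * 10 + ["b"] * 3 + ["c"] * 2,
    "y": range(15),
})

# Number of observations per series (not the total row count).
sizes = df["unique_id"].value_counts()

# Series with 3 or fewer samples are the ones expected to fail.
too_short = sizes[sizes <= 3].sort_index()
print(too_short)
```

Only the per-series counts matter here; the total number of rows in `df` says nothing about whether an individual series is long enough.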
So just to clarify, "unique_id" is the index and not a column. X_train_agg.index.nunique() yields 511.
You should set it as a column; we're deprecating passing it as the index. Also, the problem isn't how many unique series you have but how long each one is, so running value_counts on the unique_ids is what will tell you their sizes.
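A small sketch of the difference, assuming the index is named `unique_id` (the frame below is made up for illustration):

```python
import pandas as pd

# Toy frame with unique_id as the index; data is hypothetical.
indexed = pd.DataFrame(
    {
        "ds": pd.to_datetime(["2023-01-31", "2023-02-28", "2023-01-31"]),
        "y": [1.0, 2.0, 3.0],
    },
    index=pd.Index(["a", "a", "b"], name="unique_id"),
)

# Move unique_id from the index into a regular column.
long_df = indexed.reset_index()

# nunique() counts how many series there are ...
n_series = indexed.index.nunique()
# ... while value_counts() gives the length of each series.
sizes = long_df["unique_id"].value_counts()
print(n_series)
print(sizes)
```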
OK, point noted @jmoralez, but the issue still persists even after keeping only the series with value_counts greater than 3. PFA the code snippet:
value_counts = new_df['unique_id'].value_counts()
valid_indices = value_counts[value_counts >= 5].index
filtered_df = new_df[new_df['unique_id'].isin(valid_indices)]
filtered_df['unique_id'].value_counts()
**OUTPUT:**
HK 74
HK/Pharma/HP1_S01 74
HK/Pharma/HP1_J05 74
HK/Pharma/HP1_J02 74
HK/Pharma/HP1_J01 74
..
HK/CHC/TD1_C07 6
HK/CHC/TD1_A10 6
HK/CHC/HP1_A10 5
HK/CHC/DP1_A10 5
HK/Pharma/HP1_P02 5
Name: unique_id, Length: 482, dtype: int64
**ERROR AFTER USING THIS df:**
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/opt/conda/lib/python3.10/site-packages/statsforecast/core.py", line 73, in fit
fm[i, i_model] = new_model.fit(y=y, X=X)
File "/opt/conda/lib/python3.10/site-packages/statsforecast/models.py", line 553, in fit
self.model_ = ets_f(
File "/opt/conda/lib/python3.10/site-packages/statsforecast/ets.py", line 1235, in ets_f
raise NotImplementedError("tiny datasets")
NotImplementedError: tiny datasets
"""
The above exception was the direct cause of the following exception:
NotImplementedError Traceback (most recent call last)
Cell In[50], line 57
50 model = StatsForecast(models = models,
51 freq='M',
52 n_jobs=-1
53 #fallback_model= HoltWinters(season_length=12, error_type='A')
54 )
56 #train model, like in sklearn
---> 57 model.fit(df=filtered_df.head(1000))
59 #this circumvents the error we get with autoarima
60 #try again without autoarima in the list of models
61 except ZeroDivisionError:
File /opt/conda/lib/python3.10/site-packages/statsforecast/core.py:581, in _StatsForecast.fit(self, df, sort_df)
579 self.fitted_ = self.ga.fit(models=self.models)
580 else:
--> 581 self.fitted_ = self._fit_parallel()
582 return self
File /opt/conda/lib/python3.10/site-packages/statsforecast/core.py:940, in _StatsForecast._fit_parallel(self)
938 future = executor.apply_async(ga.fit, (self.models,))
939 futures.append(future)
--> 940 fm = np.vstack([f.get() for f in futures])
941 return fm
File /opt/conda/lib/python3.10/site-packages/statsforecast/core.py:940, in <listcomp>(.0)
938 future = executor.apply_async(ga.fit, (self.models,))
939 futures.append(future)
--> 940 fm = np.vstack([f.get() for f in futures])
941 return fm
File /opt/conda/lib/python3.10/multiprocessing/pool.py:774, in ApplyResult.get(self, timeout)
772 return self._value
773 else:
--> 774 raise self._value
NotImplementedError: tiny datasets
The ETS model requires more than 3 samples. The easiest fix is to provide a fallback model like Naive or HistoricAverage.
As indicated in the previous response, I have a substantial amount of data at my disposal. However, it continues to yield the same error. Furthermore, the inclusion of a fallback model does not serve its intended purpose if I have certainty that the other models will unquestionably encounter failures. PFA the code snippet:
`value_counts = new_df['unique_id'].value_counts()
valid_indices = value_counts[value_counts >= 5].index
filtered_df = new_df[new_df['unique_id'].isin(valid_indices)]
filtered_df['unique_id'].value_counts()
OUTPUT:
HK 74
HK/Pharma/HP1_S01 74
HK/Pharma/HP1_J05 74
HK/Pharma/HP1_J02 74
HK/Pharma/HP1_J01 74
..
HK/CHC/TD1_C07 6
HK/CHC/TD1_A10 6
HK/CHC/HP1_A10 5
HK/CHC/DP1_A10 5
HK/Pharma/HP1_P02 5
Name: unique_id, Length: 482, dtype: int64`
statsforecast trains the models per series, so what matters is not how much data you have in total but how many samples each series has. You have some series with 5 samples, for which an ETS model can't be trained, so it will fail on those. If you specify a fallback model, those series will be forecasted with the fallback whenever a more complex model (ETS in this case) fails.
What is the minimum number of samples each series should have for it to be trained using ETS based models?
Here's the relevant part of the code:
statsforecast/statsforecast/ets.py
Lines 1226 to 1236 in be9db75
The defaults are model='ZZZ' and damped=None, so if you keep those you need at least 7 samples. Keep in mind that even though the model may train, it probably won't be very good with that few samples, since they don't even cover one seasonal period (12).
OK, but I recently tried keeping at least 18 samples per series and still was not able to fit any models. The error I get looks something like this:
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/opt/conda/lib/python3.10/site-packages/statsforecast/core.py", line 73, in fit
fm[i, i_model] = new_model.fit(y=y, X=X)
File "/opt/conda/lib/python3.10/site-packages/statsforecast/models.py", line 553, in fit
self.model_ = ets_f(
File "/opt/conda/lib/python3.10/site-packages/statsforecast/ets.py", line 1300, in ets_f
raise Exception("no model able to be fitted")
Exception: no model able to be fitted
Can you try keeping at least two seasonal periods (24)?
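One way to apply that threshold, sketched with pandas only. The variable names and toy data are assumptions for illustration, and `season_length = 12` mirrors the value used earlier in the thread:

```python
import pandas as pd

season_length = 12
min_samples = 2 * season_length  # two full seasonal periods

# Toy long-format frame; ids and values are hypothetical.
df = pd.DataFrame({
    "unique_id": ["long"] * 30 + ["short"] * 18,
    "y": range(48),
})

# Attach each row's series length, then split on the threshold.
counts = df["unique_id"].value_counts()
row_sizes = df["unique_id"].map(counts)

long_enough = df[row_sizes >= min_samples]
too_short = df[row_sizes < min_samples]

print(long_enough["unique_id"].unique())
print(too_short["unique_id"].unique())
```

The shorter series are kept in `too_short` rather than dropped, so they can still be forecasted with a simpler model if needed.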
This issue has been automatically closed because it has been awaiting a response for too long. When you have time to work with the maintainers to resolve this issue, please post a new comment and it will be re-opened. If the issue has been locked for editing by the time you return to it, please open a new issue and reference this one.