Comments (2)
Thanks @ageron, makes sense.
from handson-ml.
Hi @singh-krishan,
Thanks for your question. Make sure you fit estimators only on training data. This means you should call fit(), fit_transform(), or fit_predict() only on training data, never on other data (such as the validation set, the test set, or new data). In your code, you should therefore replace full_pipeline.fit_transform(some_data) with full_pipeline.transform(some_data). However, before you do that, you should first fit the pipeline on the training set.
So the code should look like:
housing_prepared = full_pipeline.fit_transform(housing)
some_data_prepared = full_pipeline.transform(some_data)
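The same rule holds for any transformer, not only full_pipeline. Here is a minimal sketch of it, assuming scikit-learn's StandardScaler and made-up numbers (X_train and X_new are hypothetical and not taken from the notebook). The statistics learned by fit_transform() on the training data are reused by transform() on new data, whereas refitting on the new data learns different statistics.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data standing in for the training set and for new data.
X_train = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
X_new = np.array([[10.0], [20.0]])

# Correct: fit on the training data, then only transform the new data.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learns mean and std from X_train
X_new_scaled = scaler.transform(X_new)          # reuses the training mean and std

# Mistake: refitting on the new data learns its own mean and std instead.
wrong_scaler = StandardScaler()
X_new_wrong = wrong_scaler.fit_transform(X_new)

print(scaler.mean_)        # mean learned from X_train
print(wrong_scaler.mean_)  # mean learned from X_new
print(X_new_scaled.ravel())
print(X_new_wrong.ravel())
```

The two scaled versions of X_new differ because each scaler was fit on different data; only the first one is consistent with what a model trained on X_train_scaled expects.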
In the full training set, there are 5 distinct values in the ocean_proximity column. That's why, after full_pipeline is fit on the training set, it outputs a one-hot vector of size 5 for each ocean_proximity value. But if some_data is small enough, it is likely to contain fewer categories, so refitting the pipeline on it yields shorter one-hot vectors, which is what you observed. If you only call transform(some_data) and not fit_transform(some_data), it will still output one-hot vectors of size 5.
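The effect on the one-hot width can be reproduced in isolation. The sketch below assumes scikit-learn's OneHotEncoder used on its own rather than inside full_pipeline; the arrays train_cat and some_cat are hypothetical stand-ins for the ocean_proximity column, and the category labels are illustrative.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-ins: five distinct categories in the training column,
# and a small sample that happens to contain only two of them.
train_cat = np.array([["<1H OCEAN"], ["INLAND"], ["ISLAND"],
                      ["NEAR BAY"], ["NEAR OCEAN"], ["INLAND"]])
some_cat = np.array([["INLAND"], ["NEAR BAY"], ["INLAND"]])

# Fit once on the training column, then only transform the small sample.
encoder = OneHotEncoder()
train_1hot = encoder.fit_transform(train_cat).toarray()
some_1hot = encoder.transform(some_cat).toarray()

# Refitting on the small sample learns only the categories it contains.
refit_1hot = OneHotEncoder().fit_transform(some_cat).toarray()

print(train_1hot.shape)  # (6, 5)
print(some_1hot.shape)   # (3, 5)
print(refit_1hot.shape)  # (3, 2)
```

With transform(), the sample keeps the five columns learned from the training data; with fit_transform(), it collapses to the two categories present in the sample.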
Hope this helps.