Comments (11)
Also tried the libsvm format using dump_svmlight_file:
Arguments: train
[2018-01-10:23:52:25:INFO] Running standalone xgboost training.
[2018-01-10:23:52:25:INFO] File size need to be processed in the node: 0.57mb. Available memory size in the node: 8618.42mb
[2018-01-10:23:52:25:ERROR] Customer Error: Blankspace and colon not found in the file. ContentType by defaullt is in libsvm. Please ensure the file is in libsvm format.
Traceback (most recent call last):
File "/opt/amazon/lib/python2.7/site-packages/sage_xgboost/train.py", line 34, in main
standalone_train(resource_config, train_config, data_config)
File "/opt/amazon/lib/python2.7/site-packages/sage_xgboost/train_methods.py", line 16, in standalone_train
train_job(resource_config, train_config, data_config)
File "/opt/amazon/lib/python2.7/site-packages/sage_xgboost/train_helper.py", line 386, in train_job
validate_file_format(train_path, file_type)
File "/opt/amazon/lib/python2.7/site-packages/sage_xgboost/train_helper.py", line 254, in validate_file_format
validate_libsvm_format(os.path.join(files_path, data_file))
File "/opt/amazon/lib/python2.7/site-packages/sage_xgboost/train_helper.py", line 273, in validate_libsvm_format
Please ensure the file is in libsvm format.")
CustomerError: Blankspace and colon not found in the file. ContentType by defaullt is in libsvm. Please ensure the file is in libsvm format.
ValueError Traceback (most recent call last)
in ()
16 num_round=100)
17
---> 18 xgb.fit({'train': s3_input_train, 'validation': s3_input_validation})
/home/ec2-user/anaconda3/envs/python2/lib/python2.7/site-packages/sagemaker/estimator.pyc in fit(self, inputs, wait, logs, job_name)
152 self.latest_training_job = _TrainingJob.start_new(self, inputs)
153 if wait:
--> 154 self.latest_training_job.wait(logs=logs)
155 else:
156 raise NotImplemented('Asynchronous fit not available')
/home/ec2-user/anaconda3/envs/python2/lib/python2.7/site-packages/sagemaker/estimator.pyc in wait(self, logs)
321 def wait(self, logs=True):
322 if logs:
--> 323 self.sagemaker_session.logs_for_job(self.job_name, wait=True)
324 else:
325 self.sagemaker_session.wait_for_job(self.job_name)
/home/ec2-user/anaconda3/envs/python2/lib/python2.7/site-packages/sagemaker/session.pyc in logs_for_job(self, job_name, wait, poll)
656
657 if wait:
--> 658 self._check_job_status(job_name, description)
659 if dot:
660 print()
/home/ec2-user/anaconda3/envs/python2/lib/python2.7/site-packages/sagemaker/session.pyc in _check_job_status(self, job, desc)
399 if status != 'Completed':
400 reason = desc.get('FailureReason', '(No reason provided)')
--> 401 raise ValueError('Error training {}: {} Reason: {}'.format(job, status, reason))
402
403 def wait_for_endpoint(self, endpoint, poll=5):
ValueError: Error training xgboost-2018-01-10-23-46-28-723: Failed Reason: ClientError: Blankspace and colon not found in the file. ContentType by defaullt is in libsvm. Please ensure the file is in libsvm format.
from amazon-sagemaker-examples.
Thanks @johnl8888 , and sorry you're running into troubles. Would you be able to provide the top few records of both files? This will help us troubleshoot the issue.
from amazon-sagemaker-examples.
Thanks, @johnl8888 . Unfortunately, it doesn't look like the email attachment came through in my email or in the GitHub comments. However, CSVs passed to XGBoost need to be in a specific format:
- No header row
- Outcome variable in the first column, features in the rest of the columns (there's no ability to drop them during the training process)
- All columns need to be numeric
In the example notebook we actually read in a CSV that doesn't conform to these standards, transform it, and then re-output a version that does with .to_csv(). We then send the transformed version to S3 for the training job.
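The transformation described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name `to_xgboost_csv`, the column names, and the sample data are made up, and one-hot encoding is just one way of making non-numeric features numeric.

```python
import pandas as pd


def to_xgboost_csv(df, label_col):
    """Return CSV text with the label first, all-numeric features, no header."""
    # Encode non-numeric feature columns as 0/1 indicator columns.
    features = pd.get_dummies(df.drop(columns=[label_col]), dtype=float)
    # Put the outcome variable in the first column, features after it.
    ordered = pd.concat([df[label_col], features], axis=1)
    # header=False and index=False keep the header row and row index out of the file.
    return ordered.to_csv(header=False, index=False)


# Hypothetical sample data, for illustration only.
df = pd.DataFrame({"age": [30, 41], "job": ["admin", "tech"], "y": [0, 1]})
csv_text = to_xgboost_csv(df, "y")
```

Passing a file path as the first argument of .to_csv() would write the file directly instead of returning a string.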
If you still have trouble with running a training job after this, feel free to just dump the first 10 lines of the CSV you're passing to the algorithm into the comments section here, and that should give us the next direction to go for troubleshooting.
Thanks!
from amazon-sagemaker-examples.
Thanks @johnl8888 . It sounds like at least you're up and running with the LibSVM format. Just to confirm, the CSV file should not have any blankspaces or colons in it (as your error suggests is happening). It should only contain numeric values with a single delimiter (typically comma).
One other thing that might be happening: can you make sure the CSV is stored in its own S3 prefix? If both the LibSVM file and the CSV file are sitting in s3://my-bucket/xgboost-test/train/ and you pass that as your training location to SageMaker, then both files will be loaded onto the training instance, and the algorithm may get confused, loading the LibSVM file while treating it as a CSV.
from amazon-sagemaker-examples.
@johnl8888 , I'll close this issue for now as I'm hoping you were able to get up and running based on our last exchange. Feel free to re-open if needed. Thanks again for your interest in SageMaker Example Notebooks!
from amazon-sagemaker-examples.
Hello,
I am having this problem and I don't know how to solve it. As far as I can tell, the CSV already matches the required format and the data is in its own bucket. The error I get is the following:
Error for Training job xgboost-2019-03-13-16-21-25-000: Failed Reason: ClientError: Blankspace and colon not found in firstline '0.0,0.0,99.0,314.07,1.0,0.0,0.0,0.0,0.48027846,0.0...' of file 'train.csv'
We can see that the label is in the first column and the rest are the features, with no headers and everything numeric, so I am wondering what I am doing wrong.
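Since the log earlier in this thread states that the content type defaults to libsvm, one thing worth checking is whether the training channel is explicitly declared as CSV. The sketch below assumes the job is created through the low-level CreateTrainingJob API and only builds the InputDataConfig entries as plain dictionaries. The helper name `csv_channel` and the S3 paths are placeholders. In the SageMaker Python SDK, the corresponding setting is the content_type argument on the input object.

```python
def csv_channel(name, s3_uri):
    """Build one InputDataConfig entry that declares its data as CSV."""
    return {
        "ChannelName": name,
        # Declare the format explicitly; per the log above, libsvm is assumed otherwise.
        "ContentType": "text/csv",
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }
        },
    }


# Placeholder prefixes: each channel points at a prefix holding only CSV files.
input_data_config = [
    csv_channel("train", "s3://my-bucket/xgboost-test/train-csv/"),
    csv_channel("validation", "s3://my-bucket/xgboost-test/validation-csv/"),
]
```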
from amazon-sagemaker-examples.
I am getting a similar error and followed the exact same steps.
- No header row
- Outcome variable in the first column, features in the rest of the columns (there's no ability to drop them during the training process)
- All columns need to be numeric
Here is a snapshot of the data:
0.0,-1.0,-1.0,0.43,-0.6397578,-0.0030769934,-0.3481717,-0.6736527,0.52619594,-0.57142854,-0.12195122,-0.13138686,0.15079366,0.2798353,-0.07718044,0.3561645,-0.5319149,0.17164181,0.29268286,-1.0284938,-0.39880952,0.30730823,-0.09433937,0.1566265,-0.17105263,-0.4765625,-0.25,0.36363637,0.30769232,-0.25,-0.6870229,0.37499985
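The three rules listed above can be checked locally before uploading. The following is a minimal sketch using only the standard library; the function name `check_xgboost_csv` is made up for illustration. It reports non-numeric cells (a header row shows up as non-numeric cells in row 1) and rows whose column count differs from the first row. A file that passes this check can still fail if the training job itself is misconfigured.

```python
import csv
import io


def check_xgboost_csv(text, max_rows=10):
    """Return a list of problems found in the first max_rows rows of CSV text."""
    problems = []
    width = None
    for row_number, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if row_number > max_rows:
            break
        if width is None:
            width = len(row)  # the first row fixes the expected column count
        elif len(row) != width:
            problems.append(
                f"row {row_number}: expected {width} columns, found {len(row)}"
            )
        for col_number, cell in enumerate(row, start=1):
            try:
                float(cell)
            except ValueError:
                problems.append(
                    f"row {row_number}, column {col_number}: {cell!r} is not numeric"
                )
    return problems


# A well-formed sample (label first, numeric only) and one with a header row.
clean = check_xgboost_csv("0.0,-1.0,0.43\n1.0,0.5,0.2\n")
with_header = check_xgboost_csv("y,age\n0,30\n")
```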
Should I switch to using LibSVM?
from amazon-sagemaker-examples.
I too switched to the LibSVM format and it worked.
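For anyone unsure what the LibSVM layout looks like, each line is a label followed by space-separated index:value pairs, which is why the error message looks for a blank space and a colon. The sketch below converts one CSV row in the label-first layout discussed above. It is a standard-library illustration only, with the hypothetical name `csv_row_to_libsvm`; the first comment in this thread used dump_svmlight_file instead. Whether feature indices should start at 0 or 1 is an assumption exposed as a parameter.

```python
def csv_row_to_libsvm(row, zero_based=False):
    """Convert 'label,f1,f2,...' into 'label idx:value idx:value ...'."""
    label, *features = row.strip().split(",")
    # Assumption: indices start at 1 unless zero_based is set.
    offset = 0 if zero_based else 1
    pairs = [f"{i + offset}:{value}" for i, value in enumerate(features)]
    return " ".join([label] + pairs)


line = csv_row_to_libsvm("0.0,-1.0,0.0,0.43")
# line == "0.0 1:-1.0 2:0.0 3:0.43"
```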
from amazon-sagemaker-examples.