Comments (10)
Hi, I originally validated the code with the version of PyTorch posted (1.1.0). There has previously been a case of broken predictions due to changes in the image normalization filters in the PyTorch library. I would first try to run it with the original (a bit ancient) versions of the libraries (see requirements.txt). Generally, the main thing I would check is the data loaders: are the pixel values scaled properly to a normalized range? Next, I would try to visualize the predictions (plot them in a 2D scatter plot) and correlate them with the ground truth. Is there a bias or scale error? If the correlation is there, it must be some data scaling issue. If there is no correlation, perhaps the checkpoint is not loaded properly. Hope it helps.
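The correlation check described above can be sketched as follows. This is a minimal stand-in, not code from the repo; the function name `diagnose_gaze` and the `(N, 2)` array layout are my own assumptions:

```python
import numpy as np

def diagnose_gaze(pred, truth):
    """pred, truth: (N, 2) arrays of gaze points in cm.

    High per-axis correlation with a nonzero offset suggests a bias/scale
    (data normalization) problem; near-zero correlation suggests the
    checkpoint weights were never loaded.
    """
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    corr = [float(np.corrcoef(pred[:, i], truth[:, i])[0, 1]) for i in (0, 1)]
    bias = (pred - truth).mean(axis=0)                   # systematic offset per axis
    dist = np.linalg.norm(pred - truth, axis=1).mean()   # mean Euclidean error
    return corr, bias, dist
```

A scatter plot of `pred` against `truth` (one subplot per axis) makes the same diagnosis visually: points on the diagonal mean the model works, a shifted or tilted line means a bias/scale error, a cloud means no correlation.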
from gazecapture.
I changed the versions as requirements.txt shows and got the same results. I also plotted the predicted gaze points against the ground truth, where the 'x' marks are the truth and the '*' marks are the predictions.
I am sure the checkpoint loads properly; it is just torch.load('checkpoint.pth.tar'), there is no other method, is there? I think it might be a data normalization issue. I know I should use the SubtractMean function in ITrackerData.py, but I don't know how to visualize it to check whether it works correctly.
The means that we subtract (saved in the provided .mat files) are in the 0-255 range. Therefore, depending on the image loader, one needs to either load the images as 0-255, subtract the means, and then divide by 255, OR load the images as 0-1 and then subtract (means / 255). If the ranges get mixed, it will go badly. The same will happen if the input to the network is not normalized. In practice, the inputs should end up in a slightly unusual range, roughly [-0.5, 0.5]. In any case, you can always retrain the network (you can start from our checkpoint) and see if the error goes down.
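The two equivalent normalization paths can be sketched like this (a NumPy stand-in for illustration, not the repo's actual SubtractMean transform):

```python
import numpy as np

def normalize_from_255(img_255, mean_255):
    """Image loaded in 0-255; the mean (from the .mat files) is also 0-255."""
    return (img_255.astype(np.float32) - mean_255) / 255.0

def normalize_from_01(img_01, mean_255):
    """Image already loaded in 0-1; scale the mean down before subtracting."""
    return img_01.astype(np.float32) - mean_255.astype(np.float32) / 255.0
```

Both paths produce identical values, roughly in [-0.5, 0.5] when the mean is near mid-gray; mixing them (e.g. subtracting a 0-255 mean from a 0-1 image) is exactly the range confusion described above.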
I retrained from the checkpoint model, and the mean validation distance loss is still high, at 11 cm, while the training distance loss is down to 0.4 cm. It seems there is no benefit from this training; it even overfits. Is there another way to refine it? Thanks a lot for helping!
By the way, I used only about 210,000 records for training and 30,000 records for testing. Could that cause the bad result?
OK, that is a useful observation. So you can decrease the prediction error for the training samples, yet the prediction error for the test samples does not go down. What I would do next is to debug, step by step, the differences between the training and test passes. Does the input data (shapes, value ranges, ...) look the same? How come there is such a difference in the outcome? Can you feed the training data to your test script and get a low prediction error? In theory, the network could of course overfit and memorize labels for the training data without any generalization capability; however, most likely there is some simple technical explanation for such weird behavior.
I tried to split off some training data for testing, but it doesn't work. I have no idea how to observe the train and test input value ranges. Should I do something in the train loader loop?
Would transfer learning help with this issue? If so, which layers should I freeze first? All of the CNN layers?
I think the issue is that there is some difference between the training and test code. Try training and testing on exactly the same data. Do the train and test errors become the same?
I tried your method and got a weird result: the train and test MSE losses are 0.84 and 0.7, and the distance losses are 1.6 and 0.951 cm, so the test error is less than the train error! What's worse, when I feed the same data as validation, the MSE loss goes up to 30 and the distance loss up to 7 cm. I now think it is a checkpoint-saving problem. How do I debug that part? Thanks a lot!!
It is hard to give concrete advice in such a case. Try to rewrite the code, or write a minimum working example that reproduces the behavior; it may be just a tiny network that takes single floats as input. You can also easily inspect the weights during training, before saving and after loading. Look into the PyTorch documentation to see how to access the weights of any layer. You can then keep printing, e.g., the mean and std, and see if they stay the same before save and after load.
I examined the code carefully, and I found that when the code starts validating (--sink), there is no command to load the checkpoint. So the parameters used in validation were the defaults, not the ones from the checkpoint. I added doLoad = args.sink in main, which solved the whole problem. The solution is so simple that I feel really awful.
Thanks a lot for your help!