gazecapture's People

Contributors

adikhosla, dependabot[bot], enabledisplay, erkil1452, jaybeavers, kjkjava

gazecapture's Issues

What is train_y and val_y in dataset?

I expected both train_y and val_y to be the 2D coordinates of the eye gaze.

But the result is quite strange: most of the coordinates are very small or negative, which would mean the gaze almost always points toward the top-left. The images don't look like that, so most of the values appear to be wrong.

Please look at the circle and the center of the circle: the center is the predicted point of eye gaze.

Somebody told me that the data was normalized, but how? How do I find the true gaze point?
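For what it's worth, the labels appear to be the XCam/YCam values from dotInfo.json, i.e. the dot position in centimeters relative to the front camera, which would explain why many values are small or negative (most of the screen sits below and to the side of the camera). A minimal sketch of mapping such camera-relative values back to screen pixels is below; the camera offset and pixels-per-centimeter numbers are hypothetical placeholders and would have to come from the real device parameters.

def cam_cm_to_screen_px(x_cam_cm, y_cam_cm,
                        cam_x_cm=3.0,     # camera position in screen coords, cm from top-left (placeholder)
                        cam_y_cm=-0.8,    # negative: camera sits above the screen's top edge (placeholder)
                        px_per_cm=52.0):  # screen pixel density (placeholder)
    # Screen x grows right and y grows down; the dataset's YCam grows up,
    # hence the sign flip on y.
    x_px = (cam_x_cm + x_cam_cm) * px_per_cm
    y_px = (cam_y_cm - y_cam_cm) * px_per_cm
    return x_px, y_px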

About train and test augmentation mentioned in the paper

Hello, could you please share the parameters of the train/test augmentation mentioned in the iTracker paper?
The only description in the paper is "shifting the eyes and the face, changing the face grid appropriately." Could you please tell us the values/ranges? We just can't reproduce your accuracy with test augmentations...
Thanks!
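In case it is useful for discussion, here is a minimal sketch of one plausible shift augmentation; the jitter range is an assumption, not the value used in the paper, and the face grid would need to be recomputed from the shifted face box.

import random

def jitter_bbox(x, y, w, h, max_shift=0.05):
    # Shift the crop box by up to max_shift * its size in each direction
    # before cropping; 5% is an illustrative value only.
    dx = random.uniform(-max_shift, max_shift) * w
    dy = random.uniform(-max_shift, max_shift) * h
    return x + dx, y + dy, w, h

# face_bbox  = jitter_bbox(*face_bbox)    # then crop the face from the frame
# left_bbox  = jitter_bbox(*left_bbox)    # then crop the left eye
# right_bbox = jitter_bbox(*right_bbox)   # then crop the right eye
# ...and rebuild the 25x25 face grid from the shifted face_bbox.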

How to use the caffe model and the relation about pytorch code and the trained model?

Hi, my nice friend! I have to disturb you again.
Now I want to use itracker_iter_92000.caffemodel directly for inference, but I have run into some problems and want to be clearer about a few things.

  1. Looking at the code, the training and testing code is PyTorch; was the caffemodel trained with PyTorch or with Caffe code?
  2. If I want to run inference on an image, should I do it the same way the PyTorch code does?
    (1) the image is read in RGB format, then divided by 255 into the 0–1 range
    (2) I think the mean image is also in RGB order, and it is likewise divided by 255 into the 0–1 range
    (3) each input image (left eye, right eye, face) separately subtracts its mean image, so some values may drop below zero; call these left_eye_sub_mean_img, right_eye_sub_mean_img, face_sub_mean_img
    (4) then left_eye_sub_mean_img, right_eye_sub_mean_img, face_sub_mean_img, and face_mask are used as the input images to infer the two output values

Is what I describe correct, or am I missing something?
Please check for me! Thank you very much!
Best Regards
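A minimal sketch of the preprocessing described in the steps above, following the PyTorch ITrackerData.py pipeline (read as RGB, scale to 0–1, subtract a per-stream mean image). Whether the Caffe model expects exactly the same preprocessing is the open question here, so treat this as an assumption; mean_left_eye, mean_right_eye, and mean_face stand for the dataset mean images already scaled to 0–1.

import numpy as np
from PIL import Image

def preprocess(img_path, mean_img, size=224):
    # Read as RGB, resize, scale to [0, 1], then subtract this stream's
    # mean image; some values may drop below zero, as noted above.
    img = Image.open(img_path).convert('RGB').resize((size, size))
    img = np.asarray(img, dtype=np.float32) / 255.0
    return img - mean_img

# left_eye  = preprocess('left_eye.jpg',  mean_left_eye)
# right_eye = preprocess('right_eye.jpg', mean_right_eye)
# face      = preprocess('face.jpg',      mean_face)
# network inputs: left_eye, right_eye, face, plus the 25x25 face_mask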

about SubtractMean

When I test the model with a new picture, SubtractMean is applied after the picture is converted to a Tensor.
Does MeanImg refer to the mean of the training set or the mean of this picture?

Data split

Hi,

You mentioned that the data is split by patient. Where can I get the patient IDs used in the train/validation/test sets?

Model for real-time inference

Hi,
First of all thank you for making your dataset and code available to the public!
We would like to replicate your model for real-time inference (Section 4.2 in the paper). Is the precise network layout / pre-trained model available somewhere?

Thanks in advance,
Tobias

Regarding Face Grid

Hey,
May I please know what the FrameW and FrameH arguments are: are they the original frame width and height (480×640) or the resized value (224×224)?

Also, why are the faceGrid width and height values the same in the JSON, e.g. 13×13?

Thanks,
Madan

Matlab Compatibility Code in prepareDataset.py

In prepareDataset.py, I encounter the following code section -

faceBbox = bboxFromJson(appleFace) + [-1,-1,1,1] # for compatibility with matlab code

I understand that after reading the values from appleFace.json as int, the X & Y pixel coordinates are treated as 1-indexed values (which is Matlab compatible). So, for converting it to 0-indexed in Python, we should add [-1,-1,0,0] to [X,Y,W,H]. But in the code, [-1,-1,1,1] is added (which will increase the width & height of face crops by 1 pixel).
Can you please clarify why 1 is added to W & H? I know that an extra pixel wouldn't matter much, but I'd like to understand it.
Also, it seems to me that [-1,-1,0,0] should be added for leftEyeBbox & rightEyeBbox instead of [0,-1,0,0].

Thanks.
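For reference, a tiny illustration of the two conversions being compared (the numbers are just for illustration):

# Matlab-style 1-indexed bbox [X, Y, W, H]
bbox_matlab = [38, 230, 343, 343]

# Plain 1-indexed -> 0-indexed conversion suggested in this issue:
bbox_zero = [bbox_matlab[0] - 1, bbox_matlab[1] - 1, bbox_matlab[2], bbox_matlab[3]]
# -> [37, 229, 343, 343]

# What prepareDataset.py does with [-1, -1, 1, 1]:
bbox_repo = [bbox_matlab[0] - 1, bbox_matlab[1] - 1, bbox_matlab[2] + 1, bbox_matlab[3] + 1]
# -> [37, 229, 344, 344]  (the crop becomes one pixel wider and taller)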

I can not download the data

I have registered an account on the website, but I cannot sign in, so I cannot get the GazeCapture dataset.

Cannot download the data

I registered on the website and verified my institutional email, yet I cannot log in to download the data. Please advise if I am missing something.

Inconsistency between described and actual model?

I am currently going through the PyTorch eye-model code and stumbled across an inconsistency that I suspect was left out of the article on purpose, which just needs confirming: the described model does not mention max pooling, yet it is used between layers in the code.

The described model is as follows:

The output is the distance, in centimeters, from the camera. CONV represents convolutional layers (with filter size/number of kernels: CONV-E1, CONV-F1: 11 × 11/96, CONV-E2, CONV-F2: 5 × 5/256, CONV-E3, CONV-F3: 3 × 3/384, CONV-E4, CONV-F4: 1 × 1/64)

but a max-pool layer can be seen between the layers in their code:

class ItrackerImageModel(nn.Module):
    # Used for both eyes (with shared weights) and the face (with unique weights)
    def __init__(self):
        super(ItrackerImageModel, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4, padding=0),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.CrossMapLRN2d(size=5, alpha=0.0001, beta=0.75, k=1.0),
            nn.Conv2d(96, 256, kernel_size=5, stride=1, padding=2, groups=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.CrossMapLRN2d(size=5, alpha=0.0001, beta=0.75, k=1.0),
            nn.Conv2d(256, 384, kernel_size=3, stride=1, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 64, kernel_size=1, stride=1, padding=0),
            nn.ReLU(inplace=True),
        )


Compatibility with latest iPhones

Hi and thank you for the awesome work you have done here.

One question regarding inference on new-generation iPhones. I want to run the model on an iPhone XR. The main differences between this model and the previous generation (used for training) are the position of the camera (the camera is now almost part of the screen) and the size of the screen (6.06 inches diagonally for the XR vs. 4.6 inches diagonally for the 6). As a result, the output of the model is always underestimated.

Is calibration the only answer to this issue, or can we apply some kind of transformation to the output based on the iPhone's dimensions? How would you tackle this problem?

Many thanks
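In case it helps the discussion, here is a minimal sketch of a lightweight per-device calibration: show a handful of dots at known positions, record the model's camera-relative predictions, and fit a simple affine correction. This is a simplified stand-in, not the SVR-based calibration described in the paper.

import numpy as np

def fit_affine_calibration(preds_cm, targets_cm):
    # preds_cm, targets_cm: (N, 2) arrays of model outputs and true dot
    # positions (both in cm relative to the camera) from a short calibration run.
    n = preds_cm.shape[0]
    A = np.hstack([preds_cm, np.ones((n, 1))])           # rows are [x, y, 1]
    W, *_ = np.linalg.lstsq(A, targets_cm, rcond=None)   # least-squares fit, W is 3x2
    return W

def apply_calibration(W, pred_cm):
    return np.hstack([pred_cm, 1.0]) @ W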

How to run inference with your model on my own video or image?

hi,
firstly, thanks for your great work!
But when I use your PyTorch model I have some questions:
(1) which pre-trained model is usable?
(2) if I want to run inference on my own image or video, how can I do it?

thanks very much!

data processing

Hi, I would like to know whether you applied any preprocessing to the face and eye images. Thank you for your answer!

Unable to reproduce the result

I am using the given pre-trained Caffe model but am getting a Euclidean loss much higher than mentioned in the paper. Please look at my code and tell me where I am making a mistake.
I am loading the Caffe model and doing a forward pass to get the output.
CaffeModel.zip

I/O Error(Errno 5) when running prepareDataset.py

  • Solution: see my last comment below. It was caused by a fault in my hardware.

  • I am facing an I/O error "OSError: [Errno 5] Input/output error" when running
    prepareDataset.py

  • I found on the internet that I can redirect the output using >/dev/null 2>&1 after
    the command, but it doesn't create all the subdirectories

python prepareDataset.py --dataset_path [A = where extracted] --output_path [B = where to save new data]
  • Is this command running fine at your system? I am using Ubuntu 16.04 LTS

  • The error backtrace is below:
    Traceback (most recent call last):
    File "prepareDataset.py", line 273, in
    main()
    File "prepareDataset.py", line 125, in main
    img = np.array(img.convert('RGB'))
    File "/home/user/anaconda3/envs/myenv/lib/python3.6/site-packages/PIL/Image.py", line 934, in convert
    self.load()
    File "/home/user/anaconda3/envs/myenv/lib/python3.6/site-packages/PIL/ImageFile.py", line 234, in load
    s = read(self.decodermaxblock)
    File "/home/user/anaconda3/envs/myenv/lib/python3.6/site-packages/PIL/JpegImagePlugin.py", line 398, in load_read
    s = self.fp.read(read_bytes)

training stuck problem

Hi, thanks for sharing code and dataset.
I have a problem in training.
I just run prepareDataset.py and then run main.py.
When running main.py, it gets stuck at line 161: enumerate(train_loader).
At that point the program spawns dozens of Python processes and uses up my RAM, then hangs without showing any warning or error message, so I am fairly sure it is not a plain memory error.
My OS is Windows and I use the Anaconda PowerShell; the CUDA version is 10.1.
How can I fix this problem?
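Not an answer from the authors, but a common cause of this symptom on Windows is the DataLoader's worker processes: without a __main__ guard each worker re-imports the script, and too many workers can exhaust RAM. A minimal sketch of the usual workaround (the worker count is just an example; dataTrain and batch_size stand for the objects already built in main.py):

import torch.utils.data

def run():
    train_loader = torch.utils.data.DataLoader(
        dataTrain, batch_size=batch_size,
        num_workers=2,            # try a small number of workers on Windows
        shuffle=True, pin_memory=True)
    for i, batch in enumerate(train_loader):
        ...                       # training step

if __name__ == '__main__':       # required for multiprocessing on Windows
    run()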

Relative change in the results does not follow relative change in eye movement

Hello! Thanks for your hard work!

Recently I've been trying to use the Caffe model, but since I want to run it fully on CPU and don't want to install Caffe as a first option, I opted to load the Caffe model through OpenCV's DNN module instead.

I can't see how the Caffe model is used in the repo, so I tried to implement the same pipeline as the PyTorch one. Unfortunately, I have met with strange results.

I stared straight at the camera, but the prediction is way off (6 cm horizontal and 2 cm vertical). I tried looking left and right, but seemingly this has no effect: the values do not vary with the general direction of my eyes (i.e. the relative changes), so I want to ask if my pipeline is correct? (A rough code sketch of the pipeline follows at the end of this issue.)

  1. Get the face
  2. Get the eyes
  3. Crop the face and eyes and create the grid (the grid is all 1's where the face lies within the grid)
  4. Resize the face and eyes (so they are warped)
  5. Divide the face and eye images by 255
  6. Load the means and divide them by 255
  7. Subtract the means from their respective images
  8. Resize them and feed them into the model
  9. Get the results

Also, I used some assumptions that I observed and confirmed from the PyTorch code:

  1. RGB image
  2. The right eye is the eye detected on the left side of the image, not the right side

As background, I'm running this on my laptop; I've seen this repo used successfully outside of mobile devices, so what I'm asking is: what is the expected face distance from the camera?

Thanks!
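For reference, here is the rough sketch mentioned above: loading the Caffe model with OpenCV's DNN module and feeding the four inputs. The file names and input blob names (image_left, image_right, image_face, facegrid) are assumptions; check the deploy prototxt for the actual names. left_eye, right_eye, face, face_grid and the mean images are assumed to come from your own detection/cropping code.

import cv2
import numpy as np

net = cv2.dnn.readNetFromCaffe('itracker_deploy.prototxt',
                               'itracker_iter_92000.caffemodel')

def to_blob(img_bgr, mean_img):
    # img_bgr: 224x224 crop from the webcam frame; mean_img: 224x224x3 in [0, 1]
    img = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    img -= mean_img
    return cv2.dnn.blobFromImage(img, size=(224, 224))    # 1x3x224x224

# Blob names and the face-grid shape below are assumptions; inspect the prototxt.
net.setInput(to_blob(left_eye, mean_left), 'image_left')
net.setInput(to_blob(right_eye, mean_right), 'image_right')
net.setInput(to_blob(face, mean_face), 'image_face')
net.setInput(face_grid.reshape(1, 625, 1, 1).astype(np.float32), 'facegrid')
gaze_xy_cm = net.forward()   # two values, in cm relative to the camera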

Is there any visualization code or APP?

This work is great and interesting!
I have run the PyTorch code on Linux, but I didn't find any visualization. Is there any code or app for Windows/Linux/iOS/Android so that I can get an intuitive feel for the results?
Thanks!

Dataset is access denied

I am not able to open the URL http://gazecapture.csail.mit.edu/download.php to download the dataset.

Anyone had luck with pytorch inference?

Has anyone achieved accurate predictions using their own data with the pytorch implementation? I am getting inaccurate predictions after just changing the data loading paths in ITrackerData.py and using the checkpoint.

Face Grid Arguments

Hey,
I wanted to know what you are passing as arguments to get the face grid values. Is FrameW/H the original image size, or what values are you passing? Is GridW/H a fixed grid size of 25×25? Are labelFaceX/Y/W/H the face detection values? Please let me know; I am stuck and perplexed by this. Thank you.

nan while evaluating

I am getting NaN while evaluating it on the MPIIGaze dataset. I am using PyTorch for the implementation.

Performance problems in later pytorch versions

Hi, is there a known reason why the Pytorch version chosen is 0.4.1?

It seems that later versions of pytorch take ~20x longer in the computation of gradients (back-propagation). I wonder if this is a known issue and the main reason why this version of torch was chosen. I encountered this behavior because I need a later version of pytorch to get some extra features.

Is it likely to run with the latest PyTorch in an NGC cloud PyTorch Docker container?

Steps to reproduce the issue:

mkdir database && cd database
wget -O gazecapture.tar "https://gazecapture.csail.mit.edu/dataset.php?

tar -xvf gazecapture.tar
cat *.tar.gz | tar zxvf - -i

  • start the container
    docker run --gpus all -it --rm -v /home/user/GazeCapture:/mount nvcr.io/nvidia/pytorch:20.02-py3
    enter the folder and execute the following steps
    cd /mount
    python prepareDataset.py --dataset_path base/ --output_path output
    python main.py --data_path output --reset

Eye Bounding Box

Hi,
As far as I know the iOS face/eye detection service does not provide bounding boxes for eyes, only eye and eyebrow control points (landmarks). What was the procedure/algorithm that you used to generate bounding boxes for eyes from the provided information?
Thanks, Botond

Size of FaceGrid's Content

The input image is a rectangle and the input face is square, and the face grid is calculated from these two images. However, when I view the faceGrid data, I find that the height and width are the same, for example 14 × 14. How is a face grid with square-shaped content generated?

checkpoint file doesn't work

I tried to load the GazeCapture dataset and use the checkpoint file for testing, which is said to reach an L2 error of 2.46 cm. But the checkpoint file does not work at all and reaches an L2 error of about 25 cm. I don't know where my problem is; can anybody help? Thanks!!!

Inference on webcam

Hello. Thank you for sharing your code.

I'm currently trying to run your PyTorch code on a webcam. As I understand it, I first need to detect the face and both eyes in the frame and then run the model on that data, and I can put anything as y-data since I only want to evaluate, not train. But one question remains: how do I get faceGrid? What does this array contain, and is it possible to compute it somehow?
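For what it's worth, the face grid appears to be a 25×25 binary mask marking where the face bounding box falls within the full camera frame. A minimal sketch of computing it from a face bounding box is below; the exact rounding used in the official Matlab code may differ slightly.

import numpy as np

def make_face_grid(face_x, face_y, face_w, face_h, frame_w, frame_h, grid=25):
    # Scale the face bbox from frame coordinates into grid cells and set
    # those cells to 1; everything else stays 0.
    g = np.zeros((grid, grid), dtype=np.float32)
    x0 = int(round(face_x * grid / frame_w))
    y0 = int(round(face_y * grid / frame_h))
    x1 = int(round((face_x + face_w) * grid / frame_w))
    y1 = int(round((face_y + face_h) * grid / frame_h))
    g[max(y0, 0):min(y1, grid), max(x0, 0):min(x1, grid)] = 1.0
    return g.flatten()   # the model consumes it as a 625-dimensional vector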

Division by 255 in SubtractMean

Hi...
Sorry to bother you with a small issue: ITrackerData.py was revised on 26 Jan 2019, introducing a division by 255 in the SubtractMean class. Can you please explain why this change was made? The code worked fine earlier without the division by 255; what changed in the meantime that made it necessary?

Thanks

Laptop Gaze Inference

Can I use the models for predicting the gaze point on a laptop screen?

I am working on a project that tracks gaze to move the mouse pointer around the screen, and I want to know whether I can use your models to predict gaze from a laptop's built-in webcam.

Thank you.

Calculating faceGrid.json from appleFace.json

Three issues have already been raised regarding faceGrid, but none of them resolves my question, which is the following:
How is [xLo, yLo, w, h] in faceGrid.json calculated from the face bounding box [X,Y,W,H] given in appleFace.json?
E.g. - For recording 00002 & frame 00000.jpg -
[frameW, frameH] = [480, 640]
scaleX = 25/480 = 0.052
scaleY = 25/640 = 0.039
face bounding box [X,Y,W,H] = [38.15, 230.04, 343.68, 343.67]

Now, according to the following code snippet -

% Use one-based image coordinates.
xLo = round(labelFaceX(i) * scaleX) + 1;
yLo = round(labelFaceY(i) * scaleY) + 1;
w = round(labelFaceW(i) * scaleX);
h = round(labelFaceH(i) * scaleY);
if parameterized
labelFaceGrid(i, :) = [xLo yLo w h];
else

xLo = round(38.15 x 0.052) + 1 = 3
yLo = round(230.04 x 0.039) + 1 = 10
w = round(343.68 x 0.052) = 18
h = round(343.67 x 0.039) = 13
i.e. [xLo, yLo, w, h] = [3, 10, 18, 13]
but in faceGrid.json, the corresponding value given is [6, 10, 13, 13].

Why is there this significant difference from the faceGrid.json values? Are these values calculated using the formulae above, or some other formulae? I'm also beginning to suspect that faceGrid.json might have been generated independently and not derived from appleFace.json. Please clarify.

Thanks

pytorch pre-trained model

Hi,
I have some questions. Is checkpoint.pth.tar the pre-trained model used in the published work?
You state that a Caffe pre-trained model is provided; is it possible for you to also provide a PyTorch version? Is there a way to convert the Caffe model to a PyTorch model? (We can't find a reliable tool.)
Thanks >v<

Strategy to Crop Face and Eyes

Hello, thank you for your public source code and dataset. I want to use the model on an Android phone to control an application by eye movement, so I need to know how the face and eye crops were produced in the dataset, so that I can feed the model new samples cropped the same way on Android.

Unable to Login

I am unable to login and download the dataset. After creating an account, I tried logging in and access was denied.

how to get labels

Sorry to bother you; could you please describe in more detail how to obtain the labels? Thank you so much!

Run time Error in Reading Kernel Image

After getting the metadata I ran main.py, but initially I get the warning "Found GPU0 Quadro K1100M which is of cuda capability 3.0. PyTorch no longer supports this GPU because it is too old. The minimum cuda capability that we support is 3.5." and then the runtime error "CUDA Cannot read kernel image". I am not able to make sense of this problem. I am running through Anaconda on Windows 10, with CUDA 9.0 and PyTorch 0.4.1. Thank you.
