Comments (11)
Thank you for the quick answer.
Besides the convergence question, I still have a doubt about the 0.5 value.
In the paper I understood that the hint was indicating:
- 1 -> known original value
- 0 -> known imputed value
- 0.5 -> unknown
And the discriminator has to decide whether each 0.5 is an original or an imputed value.
But in this implementation, the hint shows:
- 1 -> known original value
- 0 -> unknown
So the hint only helps with the known original values and gives no hint about the missing values?
from gain.
In practice, providing 90% of the mask vector as the hint gives the best performance. (The hint is only given for the known features.)
In theory (in the paper), providing one feature as a hint converges to the optimal solution under the MCAR setting.
from gain.
Yes.
In this code, the hint is only provided for the known variables.
Therefore, the discriminator has to determine whether each 0 is an original or an imputed value.
We don't provide the imputed variables as the hint; therefore, we don't need to introduce 0.5 here.
Thanks.
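To make the difference between the two hint conventions concrete, here is a minimal NumPy sketch. The function names, the 0.8 observation rate, and the Bernoulli sampling of the selector matrix `b` are assumptions for illustration, not a copy of the repository code. The code-style hint is `b * mask` (values 0 or 1); the paper-style hint additionally marks the withheld entries with 0.5.

```python
import numpy as np

def sample_binary(shape, prob_one, rng):
    """Return a 0/1 float matrix whose entries are 1 with probability prob_one."""
    return (rng.uniform(size=shape) < prob_one).astype(float)

def hint_code_style(mask, hint_rate, rng):
    # b picks which mask entries are revealed to the discriminator.
    # Revealed observed entries become 1; everything else is 0.
    b = sample_binary(mask.shape, hint_rate, rng)
    return b * mask

def hint_paper_style(mask, hint_rate, rng):
    # Same selector b, but withheld entries are marked 0.5 ("unknown"),
    # so a 0 in the hint always means "known to be imputed".
    b = sample_binary(mask.shape, hint_rate, rng)
    return b * mask + 0.5 * (1.0 - b)

rng = np.random.default_rng(0)
mask = sample_binary((4, 5), 0.8, rng)   # 1 = observed, 0 = missing
h_code = hint_code_style(mask, 0.9, rng)
h_paper = hint_paper_style(mask, 0.9, rng)
```

In the code-style hint a 0 is ambiguous (missing, or observed but withheld), which is why no separate 0.5 marker is needed there.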
from gain.
I have tested two types of hint: the one from the original paper vs. the one in this code,
and I found that the performance of the two models was almost the same.
Although the variance of the MSE test loss with the original paper's hint (using 0.5) was a bit higher, the difference didn't seem meaningful.
from gain.
Usually, under the missing completely at random (MCAR) setting, the hint does not have a big impact on the results.
from gain.
1. Is the imputed matrix equal to Hat_New_X?
2. When I print Hat_New_X, I find that some 0 positions are not imputed. Are they 0 in the original data?
I look forward to your reply.
from gain.
- Yes. G_sample is the output of the generator, and Hat_New_X is the matrix in which only the missing values are replaced by G_sample.
- Yes. Some of them have 0 as their original value.
Thanks!
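The relationship described above can be sketched in a few lines of NumPy. The small arrays are made-up illustration values, and the convention that missing entries are stored as 0 is an assumption; only the names `G_sample` and `Hat_New_X` come from the thread.

```python
import numpy as np

# Original data; note the genuine zeros at (0, 0) and (1, 1).
x = np.array([[0.0, 0.7, 0.2],
              [0.5, 0.0, 0.9]])
# Mask: 1 = observed, 0 = missing.
mask = np.array([[1.0, 0.0, 1.0],
                 [1.0, 1.0, 0.0]])
# Stand-in for the generator output (made-up numbers).
g_sample = np.array([[0.3, 0.6, 0.10],
                     [0.4, 0.8, 0.95]])

# Data as seen by the model: missing entries zeroed out.
x_missing = mask * x

# Keep observed entries, take the generator output only where data is missing.
hat_new_x = mask * x_missing + (1.0 - mask) * g_sample
```

Here the zeros at positions (0, 0) and (1, 1) survive in `hat_new_x` because they are observed values, not missing ones.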
from gain.
Thank you for the quick answer.
- Does the letter data in your code contain missing values that have been filled with 0?
- I can't compare the imputed data with the original dataset because there is no raw dataset.
I recently wrote a paper that cites yours and uses it to impute data, but the results are not ideal.
from gain.
- No. The letter data is complete data.
- I introduce the missingness in lines 51-59 and 210.
- Please check those lines.
- The original raw data is always available, so you can compare against it.
- Please see lines 233 and 186.
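As a rough sketch of this workflow (not the referenced lines themselves), one can remove entries from a complete dataset completely at random and later score an imputation against the retained original. The function names, the 20% missing rate, and the constant-fill stand-in for the imputer are all assumptions for illustration.

```python
import numpy as np

def introduce_mcar(data, miss_rate, rng):
    """Drop entries completely at random; return (mask, data with missing set to 0)."""
    # mask is 1 (observed) with probability 1 - miss_rate, else 0 (missing).
    mask = (rng.uniform(size=data.shape) > miss_rate).astype(float)
    return mask, mask * data

def rmse_on_missing(original, imputed, mask):
    """RMSE computed only over the entries that were removed (mask == 0)."""
    missing = 1.0 - mask
    n_missing = missing.sum()
    if n_missing == 0:
        return 0.0
    sq_err = (missing * (original - imputed)) ** 2
    return float(np.sqrt(sq_err.sum() / n_missing))

rng = np.random.default_rng(0)
data = rng.uniform(size=(100, 5))            # complete data, kept for scoring
mask, data_missing = introduce_mcar(data, 0.2, rng)

# Placeholder imputer: fill every missing entry with 0.5.
imputed = mask * data_missing + (1.0 - mask) * 0.5
score = rmse_on_missing(data, imputed, mask)
```

Because the complete `data` array is never overwritten, any imputer can be swapped in for the placeholder and scored the same way.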
from gain.
In the paper, Figure 1 shows three matrices being fed to the generator: the data matrix, the random matrix, and the mask matrix. But I do not see the random matrix being fed to the generator in the code. What is the random matrix?
from gain.
You can see how we use the random matrix at this link (https://github.com/jsyoon0823/GAIN/blob/master/gain.py#L168-L169)
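For readers who cannot open the link, the following is a hedged sketch of one common way a random matrix enters such a generator: noise fills the missing positions of the data matrix, and the result is concatenated with the mask. It is an assumption-based illustration, not a quote of the linked lines; the function name `generator_input` and the noise range `[0, 0.01)` are hypothetical.

```python
import numpy as np

def generator_input(x, mask, rng, noise_high=0.01):
    """Build the generator input from data, mask, and random noise."""
    # Random matrix z: small uniform noise with the same shape as the data.
    z = rng.uniform(0.0, noise_high, size=x.shape)
    # Observed entries keep their values; missing entries are filled with noise.
    x_tilde = mask * x + (1.0 - mask) * z
    # The generator sees the noise-filled data alongside the mask.
    return np.concatenate([x_tilde, mask], axis=1)

rng = np.random.default_rng(0)
x = np.array([[0.2, 0.9],
              [0.5, 0.1],
              [0.7, 0.3]])
mask = np.array([[1.0, 0.0],
                 [0.0, 1.0],
                 [1.0, 1.0]])
g_in = generator_input(x, mask, rng)
```

Under this reading the random matrix is not a separate network input; it is folded into the data matrix at the missing positions.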
from gain.