dualplus / ltnet

Implementation of LTNet from "Facial expression recognition with inconsistent datasets", ECCV 2018

C++ 59.39% Cuda 8.44% Python 31.14% Shell 1.03%

ltnet's Issues

Confused about the confusion matrices and transition matrices in the paper

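I don't quite understand the difference between these two rows. Do "latent truth" and "estimated truth" mean the same thing here? If not, where does the estimated truth come from? Is the top row the confusion matrix between the labels predicted for each coder and the latent truth labels, and the bottom row the confusion matrix between the datasets' original labels and the true labels? I'm not sure I follow, and would appreciate your guidance.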

Some parts of the paper I could not follow

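I have done some work on facial expressions before, and inconsistent annotation standards across datasets have always been a serious problem in this area, so I was glad to find your paper. Some parts are hard for me to follow, though, and I know very little about Caffe, so I am asking here in the hope of clearing up my confusion.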

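1. On page 7, after discussing the problems with EM, the paper introduces the structure of LTNet with this sentence: "rather than minimizing the discrepancy between the estimated truths and the observed labels directly, LTNet predicts each coder's annotation and minimizes the discrepancy between the predicted and observed annotations."
Does this mean that, unlike other models, LTNet does not directly compute a loss between its prediction and the label, but instead outputs, for a given image (assume just one), a prediction of each annotator's annotation, and then minimizes the loss over those predictions?
In other words, my understanding is this: suppose an image has three annotators, namely model A's prediction, model B's prediction, and the image's own label. When I feed this image to LTNet I get three output predictions, one for each of those three labels. I compute a loss for each and sum them to get the total loss.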

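2. If my understanding in point 1 is correct, let us now look at the LTNet structure.

I have two questions here. First, is the latent truth layer at the end of the basic network what we would normally call the output of the last convolutional layer? The figure looks that way, but given the meaning of T that follows, I think it should be the output of a fully connected layer of length L, so that a (batch size × L) matrix can be multiplied by an (L × L) matrix. Second, can the probability transition layers be understood as a transition matrix which, as in the definition of T, gives the distribution over all classes that true class i may be annotated as? If so, is this matrix learnable? And if it is learnable, does it need to be row-normalized after every training step, until training ends, so that each row keeps summing to 1?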
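The second question above can be made concrete with a small sketch. It is not taken from the repository: it assumes the latent truth layer outputs a length-L probability vector per image, and it uses a row-wise softmax over unconstrained parameters as one common way to keep a learnable transition matrix row-stochastic without renormalizing after every step. Whether LTNet does this is not stated in the issue. The names `transition_matrix` and `predict_coder` are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax of a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def transition_matrix(raw):
    """Map unconstrained L x L parameters to a row-stochastic matrix T.

    Assumption: row i of T is the distribution over the labels a coder
    assigns when the latent truth is class i.
    """
    return [softmax(r) for r in raw]

def predict_coder(latent, T):
    """Multiply a (batch x L) matrix of latent-truth probabilities by T (L x L)."""
    L = len(T)
    return [[sum(p[i] * T[i][j] for i in range(L)) for j in range(L)]
            for p in latent]

L = 3
# Hypothetical raw parameters, initialised close to the identity.
raw = [[4.0 if i == j else 0.0 for j in range(L)] for i in range(L)]
T = transition_matrix(raw)

# Two hypothetical images; each row is a latent-truth distribution.
latent = [softmax([2.0, 0.5, -1.0]), softmax([0.0, 0.0, 3.0])]
coder_probs = predict_coder(latent, T)
```

Because every row of `T` and every row of `latent` sums to 1, every row of `coder_probs` is also a valid distribution, so no separate normalization step is needed in this formulation.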

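3. If my understanding so far is mostly right, I still have a question about a figure on page 3.

The pipeline as I understand it (see point 1) is: first train two models, then use their predictions as labels, so that each image has three labels. After the latent truth layer there are three trainable transition matrices, one per annotator, which produce the predicted label for each annotator. The losses are computed separately, summed, and backpropagated to optimize the model.
What puzzles me is the unlabeled data. Some images end up with three sets of output labels and others with only two. For an image with only two, the most natural approach is to compute the loss over just those two. Is that how it is handled?

I ask because the unlabeled data is missing one dimension, which feels like dropout at the data level: the network itself does not know which samples will be missing part of the loss. I am curious what effect this unusual change has, because I would probably have discarded the unlabeled data altogether. Have you run experiments comparing results with and without the unlabeled data? I think this is a valuable question: if using unlabeled images in some way is shown to help, that would be very good news for problems with large amounts of unlabeled images.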
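The "only sum the losses that exist" idea raised in this question can be sketched as follows. This is an illustration of the question's own proposal, not a confirmed description of how the repository handles unlabeled images. `masked_annotation_loss` is a hypothetical helper, and a missing annotation is represented by `None`.

```python
import math

def masked_annotation_loss(coder_probs, labels):
    """Sum cross-entropy over the coders that actually annotated the image.

    coder_probs[c] is the predicted label distribution for coder c.
    labels[c] is that coder's observed label, or None if it is missing.
    Returns (total_loss, number_of_coders_used).
    """
    total = 0.0
    used = 0
    for probs, label in zip(coder_probs, labels):
        if label is None:
            continue  # no observed annotation: contributes no loss or gradient
        total += -math.log(max(probs[label], 1e-12))
        used += 1
    return total, used

# Hypothetical predictions for three coders on one image.
coder_probs = [[0.7, 0.2, 0.1],
               [0.1, 0.8, 0.1],
               [0.3, 0.3, 0.4]]

# Image with all three annotations, and one missing the human label.
full_loss, n_full = masked_annotation_loss(coder_probs, [0, 1, 2])
partial_loss, n_partial = masked_annotation_loss(coder_probs, [0, 1, None])
```

In Caffe, loss layers such as SoftmaxWithLoss accept an `ignore_label` setting that has a similar masking effect. Whether the training prototxt in this repository relies on it is not stated in the issue.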

cifar_test_list.txt is missing?

Could you provide the cifar_test_list.txt used in res20_cifar_test_org.prototxt, and the imglist.txt used in eval_cifar.py? Thank you.

FER pretrain

Hello, you mentioned in the readme for the folder of FERStuff that...

"The training prototxt that I used in the facial expression recognition. Note that the main body of the net is pretrained by a conventional FER task on the unition of RAF and AffectNet."

...but no caffemodel file for FER was included in the repo (and I don't think this was mentioned in the paper either). Am I correct in my understanding that to achieve the mentioned accuracies in the paper, you
1.) pretrained the ResNet80 architecture first with the combined datasets RAF and AffectNet
2.) trained LTNet (with ResNet80 as base) with the multiple annotated data?

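Which blob should be used for inference with the model?

For inference, is LTNet alone sufficient, with model_aff and model_raf no longer needed?
If so, should I take the softmax of fc_emotion_7 as the predicted probability of each label and derive the prediction from that?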
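If the guess in this question is right, the post-processing would amount to a softmax followed by an argmax. The sketch below shows only that step. The `logits` values are made up and merely stand in for the output of the `fc_emotion_7` blob, and the length of 7 is an assumption based on the blob name. The issue does not confirm that this blob is the correct one to read.

```python
import math

def softmax(logits):
    """Numerically stable softmax of a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def predict(logits):
    """Return (predicted class index, per-class probabilities)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs

# Hypothetical raw scores for one image, one entry per class.
logits = [0.1, 2.3, -1.0, 0.4, 0.0, 1.1, -0.5]
label, probs = predict(logits)
```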

Runtime error

C:\Software\Anaconda3\envs\Caffe\lib\site-packages\skimage\transform\_warps.py:84: UserWarning: The default mode, 'constant', will be changed to 'reflect' in skimage 0.15.
  warn("The default mode, 'constant', will be changed to 'reflect' in "
Traceback (most recent call last):
  File "eval_cifar10.py", line 93, in <module>
    display_validation_test_result(model_weight, device_no = device_no)
  File "eval_cifar10.py", line 87, in display_validation_test_result
    device_no = device_no)
  File "eval_cifar10.py", line 47, in eval_validation
    for line in freader:
ValueError: I/O operation on closed file.
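The traceback does not show the body of `eval_cifar10.py`, so the exact cause is unknown. One common way to get this error is to iterate over a file object after the `with` block that opened it has already exited. The sketch below reproduces that pattern and a fix on a temporary file. The function names and the list-file contents are hypothetical.

```python
import os
import tempfile

def read_lines_buggy(path):
    """Iterates after the with-block has closed the file: raises ValueError."""
    with open(path) as freader:
        pass  # the file is closed as soon as this block ends
    return [line.strip() for line in freader]

def read_lines_fixed(path):
    """Keeps the iteration inside the with-block, while the file is open."""
    with open(path) as freader:
        return [line.strip() for line in freader]

# Write a small hypothetical image list to a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("img_0.png 3\nimg_1.png 5\n")
    list_path = tmp.name

try:
    read_lines_buggy(list_path)
    buggy_error = None
except ValueError as exc:
    buggy_error = str(exc)  # message reports I/O on a closed file

lines = read_lines_fixed(list_path)
os.remove(list_path)
```

If the script has this shape, moving the `for line in freader:` loop inside the block that opens the file should resolve the error.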

deploy prototxt for FER classification

Good day. The deploy prototxt for the cifar classification has been shared but not the one for FER classification. Would it be possible for that to be uploaded as well? Many thanks.
