Comments (28)
Hi @nitheeshas,
do you want a C++, MATLAB, or Python example?
from clandmark.
Thanks for replying. I would like to see an example in C++.
Ok, I will skip the face detector part, since for now I only have code using a commercial detector. However, it is possible to use a combination of OpenCV Haar cascades for frontal and profile faces.
Let's assume that we have the bbox of the face in the image (the format of the bbox is described e.g. here: #20 ). The function which jointly detects the discretized yaw angle and the landmarks looks like this:
void jointmv_detector(Flandmark **flandmarkPool, int *bbox, int *viewID)
{
    const int PHIS = 5;             // number of discretized yaw angles (views)
    fl_double_t scores[PHIS];
    fl_double_t maximum = -INFINITY;

    for (int phi = 0; phi < PHIS; ++phi)
    {
        Flandmark *flandmark = flandmarkPool[phi];

        // run the view-specific detector on the shared feature pool
        flandmark->detect_optimizedFromPool(bbox);

        // keep track of the view with the maximal score
        scores[phi] = flandmark->getScore();
        if (scores[phi] > maximum)
        {
            maximum = scores[phi];
            *viewID = phi;
        }
    }
}
The viewID then serves as an index into flandmarkPool, so we can later extract the landmarks and the view label.
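Stripped of the clandmark API, the view selection inside jointmv_detector is just an argmax over the per-view scores. A minimal, library-free sketch (argmaxView is a made-up name; the scores array stands in for the per-view getScore() responses):

```cpp
// Return the index of the best-scoring view, mirroring how *viewID
// is selected in jointmv_detector above. The caller fills `scores`
// with one detector response per discretized yaw angle.
int argmaxView(const double *scores, int nViews)
{
    int best = 0;
    for (int phi = 1; phi < nViews; ++phi)
        if (scores[phi] > scores[best])
            best = phi;
    return best;
}
```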
Now, how do we initialize the flandmarkPool? Let's assume we have the following .txt file:
./models/PART_fixed_JOINTMV_-PROFILE.xml
./models/PART_fixed_JOINTMV_-HALF-PROFILE.xml
./models/PART_fixed_JOINTMV_FRONTAL.xml
./models/PART_fixed_JOINTMV_HALF-PROFILE.xml
./models/PART_fixed_JOINTMV_PROFILE.xml
Then we can use the following function to parse it:
std::vector<std::string> readModelList(const char *file)
{
    std::vector<std::string> out;
    std::ifstream infile(file);
    std::string line;
    while (std::getline(infile, line))
    {
        out.push_back(line);
    }
    return out;
}
So, in the main function we can use it as follows:
// read models from a text file
std::vector<std::string> models = readModelList(argv[3]);

// pool of Flandmark instances, one per view
// (note: a variable-length array, which is a GCC extension)
Flandmark *flandmarkPool[models.size()];
for (unsigned int i = 0; i < models.size(); ++i)
{
    flandmarkPool[i] = Flandmark::getInstanceOf(models[i].c_str());
    if (!flandmarkPool[i])
    {
        cerr << "Couldn't create instance of flandmark with model " << models[i] << endl;
        return -1;
    }
}
const int *bw_size = flandmarkPool[0]->getBaseWindowSize();

// helper structure sharing precomputed features among Flandmark instances
CFeaturePool *featuresPool = new CFeaturePool(bw_size[0], bw_size[1]);
featuresPool->addFeaturesToPool(
    new CSparseLBPFeatures(featuresPool->getWidth(),
                           featuresPool->getHeight(),
                           featuresPool->getPyramidLevels(),
                           featuresPool->getCumulativeWidths())
);

for (unsigned int i = 0; i < models.size(); ++i)
{
    flandmarkPool[i]->setNFfeaturesPool(featuresPool);
}
This initializes flandmarkPool (the view-dependent instances of Flandmark with the corresponding models loaded) and featuresPool (the helper structure which shares precomputed features among the Flandmark instances).
Prior to calling the jointmv_detector function, do not forget to call
featuresPool->updateNFmipmap(featuresPool->getWidth(), featuresPool->getHeight(), flandmarkPool[0]->getNF(frm_gray, &bbox[0])->data());
where cimg_library::CImg *frm_gray is supposed to be filled with the grayscale input image. This initiates the feature computation in featuresPool, a necessary step for jointmv_detector to work properly.
Thanks a lot for the detailed explanation!
I had doubts regarding the face detector for multi-view too, since OpenCV's profile face detector gave only average results. I saw on the website that you were using the Eyedea face detector; their face detection seems to be almost perfect.
Anyway, I'll try this out right away. Thanks again!
Yeah, the Eyedea face detector performs really well. It implements this paper, if you would like to re-implement it.
I guess another option is to re-train the OpenCV profile detector.
Wow, WaldBoost? It's actually already implemented by someone; it's available in opencv-contrib. I'll try to train it and check how well it performs.
Hi @uricamic,
I was able to build the multi-view landmark extraction using Dlib's face detector, with the jointly learned landmark pool. But the extracted landmarks are not very accurate. Is this a known problem?
Hi @nitheeshas,
the models currently available are learned on a very limited training set. We are currently training them on a bigger database.
It is also possible that, since the search spaces are shrunk (to make the detector as fast as possible), Dlib's face detector output needs to be corrected to match the expectations of the face detector used in training.
Hard to tell without seeing some examples, though.
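If one wants to experiment with correcting Dlib's output, a simple approach is to rescale and shift the detected box before passing it to clandmark. This is a hypothetical sketch with a made-up helper name and tuning factors, assuming an axis-aligned box stored as [min_x, min_y, max_x, max_y] (see #20 for the actual bbox format):

```cpp
// Hypothetical bbox correction: rescale the detector output by `scale`
// and shift it vertically by `shiftY` (a fraction of the box height),
// so it better matches the framing of the detector used in training.
// Both factors are assumptions to be tuned, not values from clandmark.
void correctBBox(int bbox[4], double scale, double shiftY)
{
    // bbox assumed as [min_x, min_y, max_x, max_y]
    double w  = bbox[2] - bbox[0];
    double h  = bbox[3] - bbox[1];
    double cx = bbox[0] + w / 2.0;
    double cy = bbox[1] + h / 2.0 + shiftY * h;

    bbox[0] = (int)(cx - scale * w / 2.0);
    bbox[1] = (int)(cy - scale * h / 2.0);
    bbox[2] = (int)(cx + scale * w / 2.0);
    bbox[3] = (int)(cy + scale * h / 2.0);
}
```

For example, correctBBox(bbox, 1.2, 0.1) enlarges the box by 20% and moves it down by 10% of its height.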
I've uploaded an example demo video of the outputs I got. Please check.
https://www.youtube.com/watch?v=25dbq7KSLsI
Sorry for the poor quality!
It seems that the face detection really suffers from a huge variance in scale and position. On the other hand, when it is as one would expect, the result looks quite nice, I would say.
One quick suggestion which should improve the accuracy a lot is to stabilize the face detector output by e.g. Kalman filtering.
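For the suggested stabilization, a per-coordinate 1-D Kalman filter with a random-walk state model is usually enough. A minimal sketch, with a made-up class name; q (process noise) and r (measurement noise) are tuning assumptions:

```cpp
// Minimal 1-D Kalman filter (random-walk model) for smoothing one
// bbox coordinate across frames; use one instance per coordinate.
struct Kalman1D
{
    double x = 0.0;            // state estimate
    double p = 1.0;            // estimate variance
    double q, r;               // process / measurement noise
    bool initialized = false;

    Kalman1D(double q_, double r_) : q(q_), r(r_) {}

    double update(double z)
    {
        if (!initialized) { x = z; initialized = true; return x; }
        p += q;                    // predict: variance grows
        double k = p / (p + r);    // Kalman gain
        x += k * (z - x);          // correct towards the measurement
        p *= (1.0 - k);
        return x;
    }
};
```

Feed each frame's detected coordinate through update() and use the returned smoothed value instead of the raw detection.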
The new models should also improve the quality a lot, however, they are not yet fully learned.
Yes, I just started modifying the code for the Kalman filter. I will update you on how it works :)
@uricamic I was not able to add the Kalman filter, since I got caught up with some other work.
But the thing is, while testing the previous output, even when I was standing perfectly still and the face detector output was also pretty much constant, the detected landmarks kept jumping a lot.
Maybe the best solution for this problem is to build fully learned models, as you said. Are you still working on creating better models?
@nitheeshas, I think in such a case the problem is noise in the webcam input. The new models should help a bit, depending on how severe the noise is.
The new version should be learned within a few days. The biggest benefit should be better yaw estimation precision and, I hope, to some extent also better landmark localization accuracy. However, the accuracy is limited by the relatively small normalized frame. The idea is to use this detector as an initial phase and then, for precise landmark detection or tracking, to use a better model (either with an increased size of the normalized frame, or using regression to remove the systematic error introduced by transforming landmarks from the normalized frame back to the image).
In that case, it will be better to wait for the newly learned model and, if it is still shaky, add a Kalman filter and check again.
Hope you'll update soon.
Hi @uricamic, can you share the dataset you are using to train the multi-view landmark detector?
Hi @nitheeshas,
we are still working on that. Maybe some smaller portion of the examples could be published soon. Sorry for the delay.
@uricamic : A small question:
You suggest adding a call to updateNFmipmap prior to detect_optimized. But the latter function already includes a call to updateNFmipmap, and the static_input example you supply does not have that independent call to updateNFmipmap, yet it seems to give good results nonetheless.
Also, since the CSparseLBPFeatures class inside the CFeaturePool is protected, this cannot be done on an image-by-image basis, but only during the CFeaturePool initialization. If this is indeed a critical stage, then you should add an init_CSparseLBPFeatures function to the CFeaturePool class.
Hi @mousomer,
I think there is some misunderstanding. The updateNFmipmap method of CFeaturePool is needed if you want to call the detection on multiple images: you simply exchange the image on which the detection is performed, without costly re-initialization of the objects. Btw, the static_input example is also using it (see here).
The features are computed automatically inside the CFeaturePool class when you call this updateNFmipmap method (see here); the user is not supposed to interfere with the feature computation in any way.
Maybe the names of some methods are a bit confusing, I am sorry if that is the case. However, all the important functionality is there and working. Some methods exist just because of the MATLAB interface, especially for the purpose of model learning, where speed is very important.
Well, you pointed to the detect_optimized function, which I suppose is the main API for extracting the features. So, am I correct in understanding that I don't need an extra call to updateNFmipmap before I call detect_optimized?
Hi @mousomer,
yes, for detect_optimized you really do not need that extra call to updateNFmipmap.
However, check the post where I was suggesting this call. It was for jointmv_detector, which internally calls detect_optimizedFromPool. There you have to call updateNFmipmap prior to calling the detector, because in that case there is no other way to update the image and let the features be computed. The reason is simply that there are multiple detectors to run, and the landmarks of the detector with the maximal response are returned; the features are computed just once per face image and used by all detectors.
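The compute-once / score-many design described here can be sketched without any clandmark specifics as follows (SharedFeatures and scoreView are made-up stand-ins for CFeaturePool and the view-specific detectors; the real features are an LBP pyramid, not a raw copy of the image):

```cpp
#include <vector>

// Stand-in for CFeaturePool: the (costly) features are computed once
// per image, and every view-specific scorer reads the same buffer.
struct SharedFeatures
{
    std::vector<double> feats;

    void update(const std::vector<double> &image)
    {
        feats = image; // placeholder for the real feature computation
    }
};

// Toy linear scorer standing in for one view-specific detector.
double scoreView(const SharedFeatures &f, double weight)
{
    double s = 0.0;
    for (double v : f.feats)
        s += weight * v;
    return s;
}
```

Because all scorers share one SharedFeatures instance, exchanging the image is a single update() call, mirroring updateNFmipmap.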
I see. Thanks!
Oh, and if I haven't mentioned it before - this package is really awesome.
@mousomer
No problem, it is always good to ask questions ;-)
Thanks!
I re-ran my sample set with detect_optimizedFromPool instead of detect_optimized, and got exactly the same results. The score is always biased towards the negative half-profile.
I've run a few thousand examples. These are the statistics I'm seeing:

| | Frontal | NegProf | NegHProf | PosHProf | PosProf |
|---|---|---|---|---|---|
| Mean score | 1.076 | 1.363 | 2.995 | 1.818 | -1.523 |
| StdDev score | 0.064 | 0.046 | 0.065 | 0.062 | 0.049 |
Hi @mousomer,
thank you for reporting this. The values you show seem a bit suspicious; I would expect the highest score for the frontal views, since those have the highest number of landmarks.
Maybe a bias term is missing in the code sample. I will check it soon and come back with an answer.
@uricamic I've tested a few of the images. It seems that when translating the scores to z-scores (subtracting the mean, dividing by the standard deviation), the best z-score does yield the best model match. I need to verify this on a large batch of images.
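The normalization described here, using per-view mean/stddev statistics estimated on a sample set, can be sketched as follows (bestViewByZScore is a made-up helper; the statistics are passed in as arguments, not built-in values):

```cpp
// Pick the view whose z-score (raw score normalized by that view's
// mean and standard deviation) is maximal.
int bestViewByZScore(const double *scores, const double *mean,
                     const double *stddev, int nViews)
{
    int best = 0;
    double bestZ = (scores[0] - mean[0]) / stddev[0];
    for (int i = 1; i < nViews; ++i)
    {
        double z = (scores[i] - mean[i]) / stddev[i];
        if (z > bestZ) { bestZ = z; best = i; }
    }
    return best;
}
```

With the statistics from the table above, a score 0.2 above the frontal mean gives a frontal z-score of about 3.1, overriding the raw bias towards the negative half-profile.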
@uricamic
I did testing with NIST face set 18, which has right and left face profiles:
https://catalog.data.gov/dataset/nist-mugshot-identification-database-mid-nist-special-database-18
The scoring is still bad. Even when translating into z-scores or tail scores, the results are not good.
So, basically, I need some external reference software to decide on the right model (frontal, R/L profile, or half-profile).
Hi @mousomer,
no z-score translation should be needed. I will try to check on the database you mention and share the code with you. I hope I can manage it within a week, though I cannot guarantee that.