Comments (5)
Hi,
We updated the training results for ViTPose-B, ViTPose-L, and ViTPose-H without images from the CrowdPose dataset. Because it removes the duplicate images and also reduces the number of training images, dropping the CrowdPose data decreased the AP on the MS COCO validation set by about 0.4. Since there are also unique images in the CrowdPose training set, it is difficult to determine for now how much of that drop is due to the duplicate images. On the OCHuman dataset, the ViTPose variants experienced an AP drop of 0.2 to 0.6. This suggests that occlusion scenes in the training data can help ViTPose generalize better, although the performance is still SOTA without this training data. Besides, the performance is almost the same on both the MPII and AIC datasets, regardless of whether CrowdPose is used for training. We suspect that heavy occlusion scenarios are not common in these two datasets and that ViTPose can generalize well in these cases.
Thanks again for your interest. We are trying to figure out the actual impact of the duplicate images. It's a rather challenging problem, so if you have better ideas for solving it, please feel free to contact us.
Best,
from vitpose.
It is indeed a challenging problem to reduce the overlap between two datasets. Thus we will remove the CrowdPose-related joint training results for now. If the authors of CrowdPose supply the source information of the images, we can easily investigate the source of the performance gains brought by CrowdPose. Alternatively, we may consider other techniques, such as local descriptors, to match the images in the two datasets. However, this still needs human effort to guarantee match quality. You're very welcome to share any better ideas for solving this issue. Please feel free to contact us!
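As a rough illustration of automatic matching followed by human review, here is a minimal sketch of a difference hash (dHash). Note that this is a simpler perceptual-hash technique, not the local-descriptor matching mentioned above, and it is not something used in this thread. It assumes each image has already been decoded, converted to grayscale, and resized to a small fixed grid (for example 9x8) by an image library of your choice. The function names and the `max_distance` threshold are hypothetical.

```python
def dhash(gray):
    """Difference hash of a 2D grayscale grid (a list of equal-length rows).

    Each bit records whether a pixel is brighter than its right neighbour,
    so a grid with H rows and W columns yields H * (W - 1) bits.
    """
    bits = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def is_near_duplicate(gray_a, gray_b, max_distance=4):
    """Flag two equally sized grids as candidate duplicates.

    The threshold is an assumption; candidates should still be checked by eye.
    """
    return hamming(dhash(gray_a), dhash(gray_b)) <= max_distance
```

Because the hash only compares neighbouring pixels, it tolerates uniform brightness shifts and re-encoding noise, but it will not survive crops or flips, which is where descriptor matching and manual checking would still be needed.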
Thanks for your comments. As demonstrated in Table 6, the performance gains brought by CrowdPose are relatively small compared with those brought by AIC. We suspect that the AIC dataset is more important in the multiple-dataset setting and that the overlap between COCO and CrowdPose is rather small. The results without the CrowdPose dataset are already SOTA, so the conclusion is not affected. Besides, the annotations for the COCO and CrowdPose datasets are different, and the images from the CrowdPose dataset are not processed with the MS COCO head. We will remove these replicated images and re-examine the sources of the performance gains. What's more, according to Table 8, the single-task results of the ViTPose variants with MS COCO are already SOTA. The multiple-dataset results are only used to demonstrate the flexibility of the proposed ViTPose. The biggest model, i.e., ViTPose-G, is trained with MS COCO and AIC only and reaches 81.0 on the MS COCO test set.
Thanks again. We will exclude the influence of CrowdPose and retrain the models as soon as possible. Please stay tuned.
Thanks for the prompt reply! And sorry for being vague in my previous comment; I believe the results in the paper, except those with CrowdPose, are solid and we have also verified some of them ourselves. Looking forward to the updated version of the results.
Regarding your comment "We will remove these replicated images and re-examine the sources of the performance gains.": we only checked the overlap with MD5 hashes, and this doesn't work on AIC/MPII (there are no duplicated MD5 values, even for training set images, so it seems all image files were altered). Given the situation on COCO, it's likely that there are also AIC/MPII test images in the CrowdPose train set, but since all image files are renamed and some are altered at the bit level, it's hard to identify all duplicates. We have contacted the authors of CrowdPose about this issue, but they haven't replied yet. So I personally think CrowdPose is not suited for joint training with COCO/AIC/MPII at this point.
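For anyone who wants to repeat this kind of check, here is a minimal sketch of an MD5-based overlap test between two image folders. It is not the exact script we used; the function names, the directory layout, and the list of file suffixes are assumptions. As noted above, it only finds byte-identical files, so renamed files are caught but re-encoded or otherwise altered files are not.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_by_md5(root, suffixes=(".jpg", ".jpeg", ".png")):
    """Map MD5 digest -> list of image paths found anywhere under root."""
    index = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            index.setdefault(md5_of_file(path), []).append(path)
    return index


def find_exact_duplicates(train_root, eval_root):
    """Return (train_path, eval_path) pairs whose file contents are identical."""
    train_index = index_by_md5(train_root)
    eval_index = index_by_md5(eval_root)
    pairs = []
    # Only digests present in both folders can be exact duplicates.
    for digest in sorted(train_index.keys() & eval_index.keys()):
        for train_path in train_index[digest]:
            for eval_path in eval_index[digest]:
                pairs.append((train_path, eval_path))
    return pairs
```

Calling `find_exact_duplicates` on a training image folder and an evaluation image folder returns the matching path pairs; an empty list means no byte-identical files, not that the two sets are free of overlap.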
It seems there are no further questions. I will close this issue temporarily. If you have any more questions, please feel free to re-open it.