
aic2021-t5-clv's Introduction

AI City 2021: Connecting Language and Vision for Natural Language-Based Vehicle Retrieval

πŸ† The 1st Place Submission to AICity Challenge 2021 Natural Language-Based Vehicle Retrieval Track (Alibaba-UTS submission)

framework

We have two codebases. For the final submission, we ensemble the features extracted by both codebases.

Part One is here: https://github.com/ShuaiBai623/AIC2021-T5-CLV

Part Two is here: https://github.com/layumi/NLP-AICity2021

Prepare

  • Preprocess the dataset to prepare frames, motion maps, and NLP augmentation

scripts/extract_vdo_frms.py extracts frames from the videos.

scripts/get_motion_maps.py generates the motion maps.

scripts/deal_nlpaug.py performs the NLP augmentation.

  • Download the pretrained models of Part One to checkpoints. The checkpoints can be found here. The best score of a single model on TestA is 0.1927 from motion_effb3_NOCLS_nlpaug_320.pth.

The directory structures in data and checkpoints are as follows:

.
├── checkpoints
│   ├── motion_effb2_1CLS_nlpaug_288.pth
│   ├── motion_effb3_NOCLS_nlpaug_320.pth
│   ├── motion_SE_3CLS_nonlpaug_288.pth
│   ├── motion_SE_NOCLS_nlpaug_288.pth
│   └── motion_SE_NOCLS_nonlpaug_288.pth
└── data
    ├── AIC21_Track5_NL_Retrieval
    │   ├── train
    │   └── validation
    ├── motion_map
    ├── test-queries.json
    ├── test-queries_nlpaug.json    ## NLP augmentation (Refer to scripts/deal_nlpaug.py)
    ├── test-tracks.json
    ├── train.json
    ├── train_nlpaug.json
    ├── train-tracks.json
    ├── train-tracks_nlpaug.json    ## NLP augmentation (Refer to scripts/deal_nlpaug.py)
    ├── val.json
    └── val_nlpaug.json             ## NLP augmentation (Refer to scripts/deal_nlpaug.py)

Part One

  • Modify the data paths in config.py

Train

The configuration files are in configs.

CUDA_VISIBLE_DEVICES=0,1,2,3 python -u main.py --name your_experiment_name --config your_config_file |tee log

Test

Set RESTORE_FROM in your configuration file to the checkpoint you want to evaluate.

python -u test.py --config your_config_file

Extract the visual and text embeddings. The extracted embeddings can be found here.

python -u test.py --config configs/motion_effb2_1CLS_nlpaug_288.yaml
python -u test.py --config configs/motion_SE_NOCLS_nlpaug_288.yaml
python -u test.py --config configs/motion_effb2_1CLS_nlpaug_288.yaml
python -u test.py --config configs/motion_SE_3CLS_nonlpaug_288.yaml
python -u test.py --config configs/motion_SE_NOCLS_nonlpaug_288.yaml

Part Two

Part Two lives in a separate repository: https://github.com/layumi/NLP-AICity2021

Submission

During inference, we average the frame features of the target in each track to obtain the track feature, and likewise average the embeddings of the text descriptions to obtain the query feature. Tracks are then ranked by cosine distance to the query feature to produce the final result.
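
The averaging and ranking step can be sketched as follows. This is an illustration under stated assumptions, not the code in scripts/get_submit.py: the function names and the toy two-dimensional embeddings are hypothetical, and plain Python lists are used only to keep the sketch dependency-free. Ranking by descending cosine similarity is equivalent to ranking by ascending cosine distance.

```python
import math


def mean_vector(vectors):
    """Average a non-empty list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_tracks(query_text_embs, track_frame_embs):
    """Return track ids ordered from best to worst match for one query.

    query_text_embs: embeddings of the text descriptions of one query.
    track_frame_embs: dict mapping a track id to its per-frame embeddings.
    """
    query = mean_vector(query_text_embs)
    scores = {
        track_id: cosine_similarity(query, mean_vector(frames))
        for track_id, frames in track_frame_embs.items()
    }
    # Highest similarity first, i.e. smallest cosine distance first.
    return sorted(scores, key=scores.get, reverse=True)


# Toy data, invented for illustration only.
query_text_embs = [[1.0, 0.0], [0.8, 0.2]]
track_frame_embs = {
    "track_a": [[0.9, 0.1], [1.0, 0.0]],
    "track_b": [[0.0, 1.0], [0.1, 0.9]],
}
ranking = rank_tracks(query_text_embs, track_frame_embs)
print(ranking)
```

With real embeddings the same logic would normally be vectorized over all queries and tracks at once.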

  • Reproduce the best submission. All extracted embeddings are in the output folder:
python scripts/get_submit.py


Contributors: layumi, shuaibai623


aic2021-t5-clv's Issues

Error on running scripts/get_motion_maps.py

The error:

0it [00:00, ?it/s]
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/yaoy/anaconda3/envs/AIC2021-T5-CLV/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "scripts/get_motion_maps.py", line 32, in get_bk_map
    avg_img = np.mean(np.stack(imgs),0)
  File "<array_function internals>", line 6, in stack
  File "/home/yaoy/anaconda3/envs/AIC2021-T5-CLV/lib/python3.7/site-packages/numpy/core/shape_base.py", line 423, in stack
    raise ValueError('need at least one array to stack')
ValueError: need at least one array to stack
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "scripts/get_motion_maps.py", line 66, in <module>
    for imgs in tqdm(pool.imap_unordered(get_bk_map, files)):
  File "/home/yaoy/anaconda3/envs/AIC2021-T5-CLV/lib/python3.7/site-packages/tqdm/std.py", line 1180, in __iter__
    for obj in iterable:
  File "/home/yaoy/anaconda3/envs/AIC2021-T5-CLV/lib/python3.7/multiprocessing/pool.py", line 748, in next
    raise value
ValueError: need at least one array to stack

In my understanding, this happens when some paths are incorrect. But I cannot figure out what is wrong. Any help is appreciated.

Thanks
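
For reference, np.stack raises this ValueError when it is given an empty list, so the traceback suggests that imgs was empty for at least one item, meaning no frames were loaded for it. One way to narrow this down is to check which frame directories contain no images. The sketch below is only a diagnostic aid under assumptions: the helper name `find_empty_frame_dirs` is hypothetical, and it assumes the frames are stored as .jpg files in per-video directories, which may not match how scripts/get_motion_maps.py actually locates them.

```python
import glob
import os


def find_empty_frame_dirs(frame_dirs, pattern="*.jpg"):
    """Return the directories in which no file matches pattern (hypothetical helper)."""
    empty = []
    for directory in frame_dirs:
        # glob returns an empty list both for an empty directory
        # and for a directory that does not exist.
        if not glob.glob(os.path.join(directory, pattern)):
            empty.append(directory)
    return empty
```

Passing the directories the script is expected to read from shows which ones yield no frames; a non-empty result usually points to a wrong data root or to frames that were never extracted with scripts/extract_vdo_frms.py.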
