
aic2022-ver's Introduction

AIC2022-Video-Event-Retrieval

This repo contains the code and data for our project, which was accepted at CVPRW 2022. The project is a new approach to the natural language-based vehicle retrieval task (paper).

For reproducibility, we also provide a Colab notebook that contains the code for reproducing the results.

Development environment

Before using this repo, please set up the environment as described below.

Pre-installation

Install conda according to the instructions on its homepage. Before installing the repo, you also need a CUDA driver with version >= 10.2.

$ conda env create -f environment.yml
$ conda activate hcmus
$ pip install -r requirements.txt
$ pip install -e .
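
Once the environment is active, a quick sanity check (a minimal sketch, assuming PyTorch is installed by the environment files above, which the models in this repo require) confirms that the GPU is visible:

# Minimal sanity check (assumes the environment installs PyTorch).
import torch

print(torch.__version__)          # installed PyTorch version
print(torch.cuda.is_available())  # should print True with a CUDA >= 10.2 driver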

Prepare data

Create a symbolic link in the project's data directory that points to where the data is actually stored.

$ cd /Users/your_short_username/path/to/where/you/want/to/put/the/symlink
$ ln -s /Volumes/HDD_name/path/to/where/you/are/storing/the/moved/files    symbolic_link_name_you_want_to_use

Ensure your data folder structure is the same as our data_sample before running the code.

$ ./tools/extract_vdo2frms_AIC.sh ./data/AIC22_Track2_NL_Retrieval/ ./data/meta/extracted_frames/
$ cp ./data/AIC22_Track2_NL_Retrieval/*.json ./data/meta/
$ ./tools/preproc_motion.sh ./data/meta
$ ./tools/preproc_srl.sh ./data/meta

For details, please take a look at the extract data notebook.

For testing purposes, you can run the commands above with ./data_sample/meta as the data directory.

A detailed document on the preprocessing step can be found in the srl part and the basic part (adapted from the hcmus team's and Alibaba team's source code).

Inference

We provide a simple inference script, where artifacts/ is the directory in which you store the trained classification model.

$ ./tools/infer.sh ./data/meta/

For details, please take a look at the Predictor class in src/predictor.py or the inference notebook.
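
The snippet below is only an illustrative sketch of how the predictor might be used; the constructor argument and method name are assumptions, not the actual API, so check src/predictor.py for the real interface.

# Hypothetical usage; argument and method names are placeholders, not the real API.
from src.predictor import Predictor

predictor = Predictor(checkpoint_dir='artifacts/')    # assumed constructor argument
results = predictor.predict(data_dir='./data/meta/')  # assumed method name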

Training

Updating

Deployment (not working yet)

For deployment/training purposes, Docker is a ready-to-use solution.

To build docker image:

$ cd <this-repo>
$ DOCKER_BUILDKIT=1 docker build -t aic22:latest .

To start docker container:

$ docker run --rm --name aic-t2 --gpus device=0 --shm-size 16G -it -v $(pwd)/:/home/workspace/src/ aic22:latest /bin/bash

Here, device is the GPU device number and shm-size is the shared memory size (it should be larger than the size of the model).

To attach to the container:

$ docker attach aic-t2

Contribution guide

If you want to contribute to this repo, please follow the steps below:

  1. Fork your own version from this repository
  2. Check out a new branch, e.g. fix-loss, add-feat.
  3. Make changes/Add features/Fix bugs
  4. Add test cases in the tests folder and run them to make sure they all pass (see below)
  5. Describe the feature/bugfix in the PR description (or create a new document)
  6. Push the commit(s) to your own repository
  7. Create a pull request on this repository

To run the test suite:

$ pip install pytest
$ python -m pytest tests/

Expected result:

============================== test session starts ===============================
platform darwin -- Python 3.7.12, pytest-7.1.1, pluggy-1.0.0
rootdir: /Users/nhtlong/workspace/aic/aic2022
collected 10 items

tests/test_args.py ...                                                     [ 30%]
tests/test_utils.py .                                                      [ 40%]
tests/uts/test_dataset.py .                                                [ 50%]
tests/uts/test_eval.py .                                                   [ 60%]
tests/uts/test_extractor.py ...                                            [ 90%]
tests/uts/test_model.py .                                                  [100%]

aic2022-ver's Issues

Add AIC metrics

  • Add the competition metric (MRR); a minimal sketch is given below
  • Create an abstract metric class for retrieval purposes (as reusable as possible)
  • Add new automated tests
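
A minimal sketch of the competition metric, assuming ranked gallery ids per query and a single relevant track per query (the function name and input format are assumptions):

import numpy as np

def mean_reciprocal_rank(ranked_ids, target_ids):
    """ranked_ids: {query_id: [track_id, ...]} sorted by similarity (best first).
    target_ids: {query_id: track_id} mapping each query to its correct track."""
    reciprocal_ranks = []
    for query_id, ranking in ranked_ids.items():
        target = target_ids[query_id]
        # rank is 1-based; queries whose target is missing contribute 0
        rr = 1.0 / (ranking.index(target) + 1) if target in ranking else 0.0
        reciprocal_ranks.append(rr)
    return float(np.mean(reciprocal_ranks))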

Fix metric

The implemented metric only computes the result on a single batch and then averages across batches.

We need to call metric.calculate in the on_eval_end step to perform the search over the entire dataset. Currently, the similarity search only works within one batch, so the top-5 score is quite high (compared to the random baseline of 5/8). Please fix this by appending the embeddings to a list in the on_val_step model hook; a sketch follows below.
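
A sketch of the intended flow, written as a fragment of a LightningModule with plain PyTorch Lightning hook names as an assumption (the project's own on_val_step / on_eval_end hooks may be named differently, and the model is assumed to return query and gallery embeddings from forward):

# Fragment of a LightningModule; hook names and forward() output are assumptions.
def on_validation_epoch_start(self):
    self._query_embeds, self._gallery_embeds = [], []

def validation_step(self, batch, batch_idx):
    query_embeds, gallery_embeds = self.forward(batch)          # assumed output
    self._query_embeds.append(query_embeds.detach().cpu())      # append per batch
    self._gallery_embeds.append(gallery_embeds.detach().cpu())

def on_validation_epoch_end(self):
    queries = torch.cat(self._query_embeds, dim=0)
    gallery = torch.cat(self._gallery_embeds, dim=0)
    # similarity search over the whole validation set, not a single batch
    self.metric.calculate(queries, gallery)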

Sort uuids for label consistency when resuming training

  • The uuids might change when resuming training, which can hurt the performance of the model (it affects the instance loss)

  • The part of the code that produces this error (a possible fix is sketched after the snippet):

class CityFlowNLDataset(Dataset):
    def __init__(self, ...):
        ...
        self.list_of_uuids = list(tracks.keys())  # the order can be different when resuming
        self.list_of_tracks = list(tracks.values())
        ...
        self.all_indexs = list(range(len(self.list_of_uuids)))

    def __getitem__(self, index):
        tmp_index = self.all_indexs[index]  # then this is used as the target for the instance loss
        ...
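
One possible fix (a sketch, not the repository's actual patch): sort the uuids so the index-to-uuid mapping, and therefore the instance-loss targets, stay identical across runs:

# Deterministic ordering, keeping tracks aligned with their uuids.
self.list_of_uuids = sorted(tracks.keys())
self.list_of_tracks = [tracks[uuid] for uuid in self.list_of_uuids]
self.all_indexs = list(range(len(self.list_of_uuids)))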

Add relation graph extractor

  • This module extracts a relation graph between tracks; the output determines, for each track, which tracks it follows and which tracks it is_followed by (a sketch of a possible output format is shown below).
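
A sketch of what the extractor's output could look like; the track ids and the exact structure are hypothetical, only the follows / is_followed relations come from the description above:

# Hypothetical output format for the relation graph extractor.
relation_graph = {
    "track_0001": {"follows": ["track_0042"], "is_followed": []},
    "track_0042": {"follows": [], "is_followed": ["track_0001"]},
}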

Add stop turn detector

  • This module is for post-processing; it helps filter the retrieval predictions (see the sketch after this list)
  • It should be run on the test tracks to determine which tracks contain turn or stop actions
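
A minimal post-processing sketch, assuming the detector produces a set of action flags per track and that the required actions have been parsed from the query (both inputs, and the function name, are hypothetical):

def filter_by_action(ranked_track_ids, track_actions, required_actions):
    """Keep tracks whose detected actions cover everything the query asks for.

    ranked_track_ids: list of track ids sorted by similarity (best first).
    track_actions: {track_id: {"stop", "turn", ...}} produced by the detector.
    required_actions: actions mentioned in the query, e.g. {"turn"}.
    """
    kept = [t for t in ranked_track_ids
            if required_actions <= track_actions.get(t, set())]
    # fall back to the original ranking if filtering removes every candidate
    return kept or ranked_track_ids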

Add inference script

Suggestions:

  • Add two datasets/dataloaders: one for query text features and one for visual features (motion + crop).

  • Add a new class for inference (nn.Module or pl.LightningModule) that can load a checkpoint and has two functions: one for encoding texts and one for encoding visual features.

  • Finally, use Faiss to compute the similarity and save the results to a file for submission and visualization.

  • A small example snippet:

import json
import os.path as osp

import numpy as np
import torch

# `move_to` and `FaissRetrieval` are assumed to be project utilities
# (a device-transfer helper and a Faiss-based retrieval wrapper, respectively).

@torch.no_grad()
def inference(self):
    self.model.eval()

    # Extract lang feats
    lang_results = {}
    for idx, batch in enumerate(self.lang_dataloader):
        batch = move_to(batch, self.device)
        lang_feats = self.model.encode_nlang_feats(batch)
        lang_feats = move_to(lang_feats, torch.device('cpu')).detach().numpy()
        ids = batch['ids']
        for lang_id, lang_feat in zip(ids, lang_feats):
            lang_results[lang_id] = lang_feat.tolist()
        
    with open(osp.join(self.savedir, 'text_embeds.json'), 'w') as f:
        json.dump(lang_results, f)

    # Extract visual feats
    visual_results = {}
    for idx, batch in enumerate(self.visual_dataloader):
        batch = move_to(batch, self.device)
        visual_feats = self.model.encode_visual_feats(batch, inference=True)
        visual_feats = move_to(visual_feats, torch.device('cpu'))
        visual_feats = visual_feats.detach().numpy()
        ids = batch['ids']
        for visual_id, visual_feat in zip(ids, visual_feats):
            visual_results[visual_id] = visual_feat.tolist()
        
    with open(osp.join(self.savedir, 'visual_embeds.json'), 'w') as f:
        json.dump(visual_results, f)

    # Faiss retrieval
    retriever = FaissRetrieval(dimension=self.dimension)
    query_embeddings = np.stack(lang_results.values(), axis=0).astype(np.float32)
    gallery_embeddings = np.stack(visual_results.values(), axis=0).astype(np.float32)
    query_ids = list(lang_results.keys())
    gallery_ids = list(visual_results.keys())

    retriever.similarity_search(
        query_embeddings,
        gallery_embeddings,
        query_ids,
        gallery_ids,
        top_k=self.top_k,
        save_results=osp.join(self.savedir, 'retrieval_results.json'))

Update logging (vis)

  • Add a logging callback (supporting visualization of query - tracklet/image pairs); a sketch is given after this list
  • Add usage instructions for the logging callback
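
A sketch of such a callback in PyTorch Lightning style; the callback name, the retrieval_results attribute read from the module, and the use of a WandbLogger-style log_image call are all assumptions:

import pytorch_lightning as pl

class RetrievalVisualizationCallback(pl.Callback):
    """Log the top-k retrieved tracklet/crop images for a few validation queries."""

    def __init__(self, num_queries=4, top_k=5):
        self.num_queries = num_queries
        self.top_k = top_k

    def on_validation_epoch_end(self, trainer, pl_module):
        # `retrieval_results` is assumed to be filled during evaluation:
        # {query_text: [path_to_retrieved_image, ...]}
        results = getattr(pl_module, "retrieval_results", {})
        for query, image_paths in list(results.items())[: self.num_queries]:
            # assumes trainer.logger is a WandbLogger, whose log_image accepts paths
            trainer.logger.log_image(key=query, images=image_paths[: self.top_k])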
