
ca-tcc's Introduction

Self-supervised Contrastive Representation Learning for Semi-supervised Time-Series Classification (CA-TCC) [Paper] [Cite]

This work is an extension of TS-TCC, so if you need details about the unsupervised pretraining and/or the datasets and their preprocessing, please check it first.

Training modes:

CA-TCC has two new training modes on top of TS-TCC:

  • "gen_pseudo_labels": which generates pseudo labels from fine-tuned TS-TCC model. This mode assumes that you ran "ft_1per" mode first.
  • "SupCon": which performs supervised contrasting on pseudo-labeled data.

Note that "SupCon" is case-sensitive.

To fine-tune or linearly evaluate the "SupCon"-pretrained model, include "SupCon" in the training mode. For example, "ft_1per" fine-tunes the TS-TCC pretrained model with 1% of the labeled data, while "ft_SupCon_1per" fine-tunes the CA-TCC pretrained model with 1% of the labeled data. The same applies to "tl" and "train_linear".

To generate the 1% split, you just need to split the data 1%/99% and keep the 1% portion. You can also find a script that does a similar job here; note that it creates 5-fold splits, so set it to a single fold if that is all you need. A sketch of such a split is shown below.
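For illustration, here is a minimal sketch of generating the 1% split in Python. It assumes the TS-TCC-style train.pt dictionary with "samples" and "labels" keys; the dataset path and the output file name are hypothetical, so verify both against this repository's data loader.

import torch
from sklearn.model_selection import train_test_split

# Assumed TS-TCC-style format: train.pt is a dict of tensors
# (verify the keys against this repository's data loader).
data = torch.load("data/HAR/train.pt")
X, y = data["samples"].numpy(), data["labels"].numpy()

# Stratified 1%/99% split; keep the 1% as the labeled subset.
_, X_1per, _, y_1per = train_test_split(
    X, y, test_size=0.01, stratify=y, random_state=42
)

# Hypothetical output file name; match whatever the "ft_1per" mode expects.
torch.save({"samples": torch.from_numpy(X_1per),
            "labels": torch.from_numpy(y_1per)},
           "data/HAR/train_1per.pt")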

Baselines:

The code for the self- and semi-supervised learning baselines used in the paper is HERE.

The code for the self-supervised learning baselines used in the paper can be found in my other work.

Training procedure

To run everything smoothly, we include the ca_tcc_pipeline.sh file; you can simply run it.
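If you prefer to drive the stages yourself, the pipeline boils down to invoking main.py once per training mode, in order. Below is a minimal sketch under stated assumptions: the flag names --training_mode and --selected_dataset are carried over from TS-TCC and the stage order mirrors the description above, so check main.py and ca_tcc_pipeline.sh for the exact arguments.

import subprocess

# Assumed stage order (see ca_tcc_pipeline.sh); flag names are assumptions.
modes = [
    "self_supervised",    # TS-TCC pretraining on unlabeled data
    "ft_1per",            # fine-tune on 1% labeled data
    "gen_pseudo_labels",  # label the unlabeled data with the fine-tuned model
    "SupCon",             # supervised contrastive training on pseudo labels
    "ft_SupCon_1per",     # fine-tune the CA-TCC (SupCon) pretrained model
]
for mode in modes:
    subprocess.run(
        ["python", "main.py", "--training_mode", mode,
         "--selected_dataset", "HAR"],
        check=True,
    )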

Citation

If you found this work useful, please consider citing it.

@inproceedings{tstcc,
  title     = {Time-Series Representation Learning via Temporal and Contextual Contrasting},
  author    = {Eldele, Emadeldeen and Ragab, Mohamed and Chen, Zhenghua and Wu, Min and Kwoh, Chee Keong and Li, Xiaoli and Guan, Cuntai},
  booktitle = {Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, {IJCAI-21}},
  pages     = {2352--2359},
  year      = {2021},
}

@article{catcc,
  title   = {Self-Supervised Contrastive Representation Learning for Semi-Supervised Time-Series Classification},
  author  = {Eldele, Emadeldeen and Ragab, Mohamed and Chen, Zhenghua and Wu, Min and Kwoh, Chee-Keong and Li, Xiaoli and Guan, Cuntai},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year    = {2023},
  volume  = {45},
  number  = {12},
  pages   = {15604--15618},
  doi     = {10.1109/TPAMI.2023.3308189}
}

Contact

For any issues or questions regarding the paper or reproducing the results, please contact me at: emad0002{at}e.ntu.edu.sg


ca-tcc's Issues

Problem with self_supervised mode training

When I run main.py in the self_supervised training mode, the following error occurs:

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (3,) + inhomogeneous part.

import numpy as np
import torch

def permutation(x, max_segments=5, seg_mode="random"):
    orig_steps = np.arange(x.shape[2])

    num_segs = np.random.randint(1, max_segments, size=(x.shape[0]))

    ret = np.zeros_like(x)
    for i, pat in enumerate(x):
        if num_segs[i] > 1:
            if seg_mode == "random":
                split_points = np.random.choice(x.shape[2] - 2, num_segs[i] - 1, replace=False)
                split_points.sort()
                splits = np.split(orig_steps, split_points)
            else:
                splits = np.array_split(orig_steps, num_segs[i])
            warp = np.concatenate(np.random.permutation(splits)).ravel()  # <-- the error comes from this line
            ret[i] = pat[0, warp]
        else:
            ret[i] = pat
    return torch.from_numpy(ret)

How can I solve it? Thanks.
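A likely cause, as an editorial note rather than a confirmed answer: NumPy 1.24 and later refuse to build an object array implicitly from ragged sequences, and when seg_mode == "random" the segments in splits have unequal lengths, so np.random.permutation(splits) fails with exactly this ValueError. A minimal sketch of a workaround is to permute the segment list by index instead:

# Shuffle the segments without constructing a ragged ndarray
# (drop-in replacement for the failing np.random.permutation(splits) line).
order = np.random.permutation(len(splits))
warp = np.concatenate([splits[j] for j in order]).ravel()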

Regarding the dataset

Dear author, hello. If you have time, I have three questions:

  1. Which training modes do I need to run, and in what order, to run CA-TCC? Should I follow the order in your .sh file: "self_supervised", "train_linear_1per", "ft_1per", "gen_pseudo_labels", "SupCon", "train_linear_SupCon_1per"? And which run's result is the final result of the model?
  2. I want to run the CA-TCC model on my own dataset. It is also time-series data, with about 5000 samples in total; half of them are labeled and the other half are unlabeled. In the initial self-supervised stage, should I use all 5000 samples, then fine-tune with the 2500 labeled samples, assign pseudo labels to the remaining 2500 unlabeled samples, and use those 2500 pseudo-labeled samples for the final training (or should I use all 5000 samples for the final training)? Or should I use only my 2500 unlabeled samples in the initial self-supervised stage?
  3. Normally, unlabeled data and labeled data would be two different sets of data. Is that how you treated them in the CA-TCC paper?

Format of the dataset

Thank you very much for providing the code, but without details of the data storage format I am unable to reproduce the training. Could you please describe how the data files are stored, or upload an example file? Contact email: [email protected]
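For readers hitting the same question, an editorial note: TS-TCC-style repositories generally expect each dataset folder to contain train.pt, val.pt, and test.pt, each a dictionary of tensors. The keys and shapes below are assumptions inferred from TS-TCC, so verify them against this repository's data loader before relying on them.

import torch

# Assumed format: "samples" of shape (N, channels, seq_len), "labels" of shape (N,).
samples = torch.randn(100, 1, 178)    # e.g. 100 univariate series of length 178
labels = torch.randint(0, 5, (100,))  # e.g. 5 classes

torch.save({"samples": samples, "labels": labels}, "train.pt")

data = torch.load("train.pt")
print(data["samples"].shape, data["labels"].shape)  # torch.Size([100, 1, 178]) torch.Size([100])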
