emadeldeen24 / ts-tcc
[IJCAI-21] "Time-Series Representation Learning via Temporal and Contextual Contrasting"
License: MIT License
Hi!
In the paper (Results section), you mention that performance is evaluated using two metrics: accuracy and the macro-averaged F1-score (MF1). But when I run main.py, I only get accuracy results.
'''
total_acc.append(labels.eq(predictions.detach().argmax(dim=1)).float().mean())
'''
Can you let me know how MF1 is calculated as well?
Thanks in advance.
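For reference, a minimal sketch of how a macro-averaged F1-score can be computed from labels and predictions. This is not the authors' evaluation code; the function name `macro_f1` and the toy data are made up for illustration. scikit-learn's `f1_score(labels, predictions, average="macro")` computes the same quantity.

```python
def macro_f1(labels, predictions, num_classes):
    """Unweighted mean of the per-class F1 scores."""
    f1_scores = []
    for c in range(num_classes):
        pairs = list(zip(labels, predictions))
        tp = sum(1 for y, p in pairs if y == c and p == c)
        fp = sum(1 for y, p in pairs if y != c and p == c)
        fn = sum(1 for y, p in pairs if y == c and p != c)
        denom = 2 * tp + fp + fn
        # Assumed convention: a class with no true and no predicted samples scores 0.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / num_classes


# Toy example with three classes (hypothetical data).
labels = [0, 0, 1, 1, 2, 2]
predictions = [0, 1, 1, 1, 2, 0]
score = macro_f1(labels, predictions, num_classes=3)
```

With tensors, `predictions.detach().argmax(dim=1).tolist()` and `labels.tolist()` would give the two input lists.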
Why is the whole dataset augmented only once, with that same augmented dataset reused throughout self-supervised training?
Shouldn't we augment the dataset at each epoch?
Thank you for your work. If I want to use a custom dataset, how should I configure the network parameters? My dataset has shape (1189, 1, 10000), i.e. (number of samples, number of channels, sequence length). Looking forward to your reply.
dat_dict = dict()
dat_dict["samples"] = torch.from_numpy(train_data) #7352 In my opinion, it should be X_train instead of train_data
dat_dict["labels"] = torch.from_numpy(y_train) #5881
torch.save(dat_dict, os.path.join(output_dir, "train.pt"))
Please check it. Thanks
Hi, great job! I have a question about implementation. Why are two optimizers used instead of one since all the settings of the two optimizers are exactly the same?
Hi,
I find that there is no "supervised" branch in main.py. I guess that in this training mode the model is randomly initialized and trained with the supervised loss only. Is that right?
Thank you
Thank you for your interesting work! I have a question regarding your implementation of the loss. This is the code:
for i in np.arange(0, self.timestep):
    total = torch.mm(encode_samples[i], torch.transpose(pred[i], 0, 1))
    nce += torch.sum(torch.diag(self.lsoftmax(total)))
nce /= -1. * batch * self.timestep
At first sight, it was hard to understand how the code matches the equation in the paper. To me, it seemed that you had only implemented the numerator of the equation. However, after some thought, it seems that the total matrix contains elements from both the numerator and the denominator. Then, by applying a log-softmax function, you bound this matrix to some limit. By adding only the diagonal terms (= numerator) and using this as a negative loss, you are essentially making the diagonal terms smaller while making the off-diagonal terms (= denominator) bigger. This is how I understood it. Could you please let me know if this statement is correct?
thank you!
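A small NumPy check of what the diagonal of the log-softmax contains. It is only a sketch and assumes the log-softmax normalizes along each row of total; the array names mirror the snippet above but the data are random. Each diagonal entry equals the matching score minus the log of the sum of exponentials over the whole row, so the numerator and the denominator are both present.

```python
import numpy as np


def log_softmax(scores, axis):
    # Numerically stable log-softmax.
    shifted = scores - scores.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


rng = np.random.default_rng(0)
batch, dim = 4, 8
encode_samples_i = rng.normal(size=(batch, dim))  # stands in for encode_samples[i]
pred_i = rng.normal(size=(batch, dim))            # stands in for pred[i]

total = encode_samples_i @ pred_i.T               # (batch, batch) score matrix

# Diagonal of the log-softmax: log-probability assigned to the matching pair.
diag_log_prob = np.diag(log_softmax(total, axis=1))

# Same quantity written out: numerator score minus log of the row-wise denominator.
manual = np.array(
    [total[j, j] - np.log(np.exp(total[j]).sum()) for j in range(batch)]
)

nce_step = -diag_log_prob.sum() / batch           # contribution of one time step
```

Since every log-probability is negative, `nce_step` is positive, and minimizing it pushes each diagonal score up relative to the other entries in its row.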
Hello Emadeldeen,
Recently I noticed your new paper, "Self-supervised Contrastive Representation Learning for Semi-supervised Time-Series Classification", on arXiv.
Could you also share the TS-TCC code and the other methods (e.g., SimCLR, CPC)?
I found that in 'SupConLoss::forward' of 'loss.py', logits is computed by subtracting the maximum of each row of anchor_dot_contrast from that row. If the values in a row of anchor_dot_contrast are very uneven, with the maximum much larger than the other values, logits ends up consisting of 0 and large negative numbers (such as -208.3). Because the code uses torch.float32 precision, torch.exp(logits) then contains a single 1 in that row and every other entry is zero (or approximately zero). Even worse, after multiplying by logits_mask, all entries of exp_logits in that row can be 0. As a result, log_prob = logits - torch.log(exp_logits.sum(1, keepdim=True)) becomes infinite and the loss is computed as NaN. This is an unacceptable consequence.
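The underflow can be reproduced in NumPy with float32. The sketch below uses made-up similarity values chosen so that the gap is about 208.3, and it assumes the row maximum is the self-similarity that logits_mask later removes. The second half shows one possible workaround, which is not the repository's code: exclude the self-contrast entries before taking the maximum, so the log-sum-exp runs over the remaining entries only.

```python
import numpy as np

# Illustrative values: the diagonal (self-similarity) dominates each row.
anchor_dot_contrast = np.array(
    [[300.0, 91.7, 80.0],
     [91.7, 300.0, 85.0],
     [80.0, 85.0, 300.0]],
    dtype=np.float32,
)
logits_mask = 1.0 - np.eye(3, dtype=np.float32)  # zero on the diagonal

# Pattern described above: subtract the row maximum, then mask.
logits = anchor_dot_contrast - anchor_dot_contrast.max(axis=1, keepdims=True)
exp_logits = np.exp(logits) * logits_mask        # off-diagonal entries underflow to 0
with np.errstate(divide="ignore"):
    log_prob_naive = logits - np.log(exp_logits.sum(axis=1, keepdims=True))

# Hypothetical workaround: mask first, then do a stable log-sum-exp.
masked = np.where(logits_mask > 0, anchor_dot_contrast, -np.inf)
row_max = masked.max(axis=1, keepdims=True)
log_denominator = row_max + np.log(
    np.exp(masked - row_max).sum(axis=1, keepdims=True)
)
# Diagonal entries are not meaningful here; the positive-pair mask drops them.
log_prob_stable = anchor_dot_contrast - log_denominator
```

Adding a small constant inside the logarithm is another commonly used guard, at the cost of a slightly biased denominator.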
We used the same dataset and code as shared and did not change any configs, but in the linear evaluation experiment our results are not close to those reported in the paper. Did you use any other evaluation procedure to obtain the reported results, such as k-fold cross-validation or a different number of pre-training epochs?
In the strong augmentation, is pat[0,warp] correct?
I think it should be pat[:,warp].
Could you check the code below?
for i, pat in enumerate(x):
    if num_segs[i] > 1:
        if seg_mode == "random":
            split_points = np.random.choice(x.shape[2] - 2, num_segs[i] - 1, replace=False)
            split_points.sort()
            splits = np.split(orig_steps, split_points)
        else:
            splits = np.array_split(orig_steps, num_segs[i])
        warp = np.concatenate(np.random.permutation(splits)).ravel()
        ret[i] = pat[0,warp]
    else:
        ret[i] = pat
return torch.from_numpy(ret)
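The difference between the two indexing forms can be seen on a tiny array. This is a sketch with made-up data, assuming pat has shape (channels, length) as it does when iterating over an array of shape (samples, channels, length).

```python
import numpy as np

pat = np.arange(12).reshape(3, 4)   # 3 channels, 4 time steps
warp = np.array([2, 3, 0, 1])       # one permuted ordering of the time steps

# pat[0, warp] has shape (4,): only channel 0, which is then broadcast
# into every channel of the destination.
ret_first_channel = np.zeros_like(pat)
ret_first_channel[:] = pat[0, warp]

# pat[:, warp] has shape (3, 4): every channel is reordered in the same way.
ret_all_channels = np.zeros_like(pat)
ret_all_channels[:] = pat[:, warp]
```

For single-channel data the two forms give the same result, so the difference only shows up with multi-channel inputs, where `pat[0, warp]` overwrites all channels with a permuted copy of the first one.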
Could you please add a license to your repo?
While recently implementing TS-TCC I ran into the same issue as one raised here by vamsi231297 a few years ago that self-supervised TS-TCC only augments once at the beginning of training.
To my knowledge all other methods randomly augment each data sample at each epoch, and not just once at the start. See for example Mocov2 (https://github.com/facebookresearch/moco/blob/main/main_moco.py#L351). This is the usual way to prevent overfitting & increase generalization.
I implemented augmenting at each epoch rather than just once at the start, and found that it increased the performance of TS-TCC on the HAR dataset. The linear probe after the self-supervised phase went from 0.9160 accuracy to 0.9294. 0.9294 is higher than the values and error bars reported in Table 2 of the paper.
Saying that, I am also finding that training for more epochs with the rest of the parameters unchanged results in better performance than shown in Table 2 of the paper. E.g. 1000 epochs of supervised training gets accuracy=0.9408. Changing the learning rate also can get better accuracies.
So I would not take my test at face value as an improvement of TS-TCC over supervised as it seems likely that my results and those of Table 2 are highly dependent on hyperparameters and seeds.
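A minimal sketch of the per-access augmentation idea, written with NumPy only. The class name and the two augmentation functions are hypothetical placeholders, not the repository's code. In a PyTorch `Dataset` the same logic would sit in `__getitem__`, so that each epoch sees freshly drawn views.

```python
import numpy as np


class OnTheFlyAugmentedDataset:
    """Draws new augmentations on every access instead of caching them once."""

    def __init__(self, samples, weak_aug, strong_aug):
        self.samples = samples        # shape: (num_samples, channels, length)
        self.weak_aug = weak_aug      # callables mapping a batch array to a batch array
        self.strong_aug = strong_aug

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        x = self.samples[index][np.newaxis]   # add a batch axis for the callables
        return self.weak_aug(x)[0], self.strong_aug(x)[0]


rng = np.random.default_rng(0)


def hypothetical_weak(x):
    # Placeholder: random per-time-step scaling shared across channels.
    return x * rng.normal(1.0, 0.1, size=(x.shape[0], 1, x.shape[2]))


def hypothetical_strong(x):
    # Placeholder: additive Gaussian noise.
    return x + rng.normal(0.0, 0.5, size=x.shape)


data = rng.normal(size=(4, 3, 16))
dataset = OnTheFlyAugmentedDataset(data, hypothetical_weak, hypothetical_strong)
first_view, _ = dataset[0]
second_view, _ = dataset[0]   # same sample, different random draw
```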
Mat_0(k,a,:)=vib_0(i:i+wind_size);
What is the value of wind_size?
I guess wind_size == sample_len?
However, after that, I see this error:
Error using reshape
Number of elements must not change. Use [] as one of the size inputs to automatically calculate the appropriate size for that dimension.
I am currently trying to get the repository running. As soon as I try to run the self-supervised part, I run into the following error.
I downloaded the datasets from the dataverse... I understand the error message and can follow it, but I am wondering why it does not work for me out of the box. Is there anything I am missing here?
> python main.py --training_mode self_supervised --selected_dataset HAR
=============================================
Dataset: HAR
Method: TS-TCC
Mode: self_supervised
=============================================
Traceback (most recent call last):
  File "/home/TS-TCC/main.py", line 85, in <module>
    train_dl, valid_dl, test_dl = data_generator(data_path, configs, training_mode)
  File "/home/TS-TCC/dataloader/dataloader.py", line 51, in data_generator
    train_dataset = Load_Dataset(train_dataset, configs, training_mode)
  File "/home/TS-TCC/dataloader/dataloader.py", line 33, in __init__
    self.aug1, self.aug2 = DataTransform(self.x_data, config)
  File "/home/TS-TCC/dataloader/augmentations.py", line 8, in DataTransform
    strong_aug = jitter(permutation(sample, max_segments=config.augmentation.max_seg), config.augmentation.jitter_ratio)
  File "/home/TS-TCC/dataloader/augmentations.py", line 42, in permutation
    warp = np.concatenate(np.random.permutation(splits)).ravel()
  File "numpy/random/mtrand.pyx", line 4720, in numpy.random.mtrand.RandomState.permutation
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (3,) + inhomogeneous part.
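The error most likely comes from the NumPy version. np.random.permutation converts its argument to an array, and recent NumPy releases refuse to build an array from segments of unequal length instead of silently creating an object array. One possible workaround, sketched below with made-up split points, is to shuffle the segment indices and concatenate the segments in that order.

```python
import numpy as np

rng = np.random.default_rng(0)
orig_steps = np.arange(10)
splits = np.split(orig_steps, [3, 7])   # three segments of lengths 3, 4, 3

# Shuffle the segment order via indices rather than passing the ragged list itself.
order = rng.permutation(len(splits))
warp = np.concatenate([splits[j] for j in order])
```

In the line from the traceback, the equivalent change would be to replace `np.random.permutation(splits)` with a list built from `np.random.permutation(len(splits))`.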
def scaling(x, sigma=1.1):
    # https://arxiv.org/pdf/1706.00527.pdf
    factor = np.random.normal(loc=2., scale=sigma, size=(x.shape[0], x.shape[2]))
    ai = []
    for i in range(x.shape[1]):
        xi = x[:, i, :]
        ai.append(np.multiply(xi, factor[:, :])[:, np.newaxis, :])
    return np.concatenate((ai), axis=1)
In the paper, weak augmentation is described as a jitter-and-scale strategy: "we add random variations to the signal and scale up its magnitude." But why is there no jitter in the weak augmentation in the code?
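For comparison, a jitter-and-scale composition matching the paper's description could look like the sketch below. It is purely illustrative and is not the repository's weak augmentation. The function `hypothetical_jitter_and_scale` and its sigma defaults are assumptions; the scaling step follows the snippet above (one factor per sample and time step, shared across channels).

```python
import numpy as np


def jitter(x, sigma, rng):
    # Additive Gaussian noise on every element.
    return x + rng.normal(loc=0.0, scale=sigma, size=x.shape)


def scaling(x, sigma, rng):
    # One factor per sample and time step, broadcast over the channel axis.
    factor = rng.normal(loc=2.0, scale=sigma, size=(x.shape[0], 1, x.shape[2]))
    return x * factor


def hypothetical_jitter_and_scale(x, jitter_sigma=0.01, scale_sigma=0.1, seed=None):
    # Assumed defaults; choose values that suit the dataset.
    rng = np.random.default_rng(seed)
    return scaling(jitter(x, jitter_sigma, rng), scale_sigma, rng)


x = np.ones((2, 3, 5))                     # (samples, channels, length)
weak = hypothetical_jitter_and_scale(x, seed=0)
```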
Hi @emadeldeen24, I found this project interesting, but I still need to understand completely how I can obtain labels on the fully unsupervised dataset. Once I've completed the training in self-supervised mode, how can I get the predictions for each time series provided in the input?
I've seen that you use Fine-tuning or the linear classifier, but they require some data labels in input. How can I use a completely unsupervised version?
I have tried using TS-TCC to build a classifier for an EEG dataset.
However, although the training and validation accuracies are greater than 90%, the test accuracy is only 44%.
Could you give me some advice or suggestions?
Thanks!
Hi, great job! Can this model be used for time series forecasting? How well does it perform?
When I run main.py in self_supervised training mode, the following error occurs.
Input tensor shape: torch.Size([128, 8, 100]). Additional info: {'h': 3}.
Shape mismatch, can't divide axis of length 100 in chunks of 3
The error comes from lines 57-58 of the source code, shown below:
43 class Attention(nn.Module):
44     def __init__(self, dim, heads=8, dropout=0.):
45         super().__init__()
46         self.heads = heads
47         self.scale = dim ** -0.5
48
49         self.to_qkv = nn.Linear(dim, dim * 3, bias=False)
50         self.to_out = nn.Sequential(
51             nn.Linear(dim, dim),
52             nn.Dropout(dropout)
53         )
54
55     def forward(self, x, mask=None):
56         b, n, _, h = *x.shape, self.heads
57         qkv = self.to_qkv(x).chunk(3, dim=-1)
58         q, k, v = map(lambda t: rearrange(t, 'b n (h d) -> b h n d', h=h), qkv)
How can I solve it? Thanks
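The pattern 'b n (h d) -> b h n d' splits the last axis into h heads of size d, so that axis must be divisible by the number of heads. Here the axis has length 100 and h is 3, which cannot be split evenly. The helper below is a hypothetical illustration of the constraint, not part of the repository. Choosing a number of heads that divides the hidden dimension (or changing the hidden dimension) avoids the error.

```python
def split_heads_shape(batch, tokens, dim, heads):
    """Shape produced by rearranging 'b n (h d) -> b h n d'."""
    if dim % heads != 0:
        raise ValueError(f"dim={dim} is not divisible by heads={heads}")
    return (batch, heads, tokens, dim // heads)


# 100 is divisible by 4, so this split works; heads=3 would raise ValueError.
ok_shape = split_heads_shape(128, 8, 100, heads=4)
```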
Hello. I have read your paper and code, and I am confused about the fine-tuning procedure for the self-supervised model. My understanding is as follows. First, you pre-train the model using the train.pt data. Then, for fine-tuning, you copy the pre-trained parameters into the model. This is where I am confused: why do you run supervised training on train.pt again, and why are the parameters of the pre-trained model not frozen? In your abstract, you write that you propose an unsupervised method; where does the unsupervised part show up? As I understand it, during fine-tuning you should freeze the pre-trained model parameters and use a smaller amount of labeled data to fine-tune the classifier on top of the pre-trained model.
Looking forward to your answer, thank you.
Hi, thanks for releasing your code! I am trying to reproduce the results from the paper, as well as extend to other datasets, but run into some problems due to insufficient information to recreate the datasets.
Thank you!
During the self-supervised training stage, the training loss does not decrease much and hovers around 10. But in the fine-tuning stage we get good accuracy. Is there any significance to the self-supervised training loss, and should we even look at its values?
Hi Emadeldeen. Thanks for sharing your work! Just one quick question. I found there are no val.pt files for HAR and Epilepsy datasets. Is there any particular reason for this?
Hi there,
I was taking a closer look at the implementation of the contextual contrasting loss, and I am having trouble understanding how the positive samples are treated differently from the negative ones, and specifically how this corresponds to Eq. 5 of your paper. Could it be that labels should have some elements equal to one rather than all of them being zeros?
Lines 61 to 62 in ebfdbab
Thanks!
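All-zero labels are expected if the loss uses the common NT-Xent layout, in which each anchor's positive similarity is placed in column 0 of the logits and the negatives follow. That layout is an assumption here, since the referenced lines are not shown. Under it, the labels are class indices for a cross-entropy, so "0" means "the positive is in column 0" and is not a mask of ones and zeros. A NumPy sketch with random similarities:

```python
import numpy as np


def cross_entropy(logits, labels):
    # Mean negative log-probability of the labelled column.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(labels)), labels].mean()


rng = np.random.default_rng(0)
positives = rng.normal(size=(6, 1))   # similarity of each anchor with its positive
negatives = rng.normal(size=(6, 4))   # similarities with the negatives

logits = np.concatenate([positives, negatives], axis=1)  # positive sits in column 0
labels = np.zeros(6, dtype=int)                          # class index 0 for every row

loss = cross_entropy(logits, labels)

# Same value written as a contrastive ratio: positive over positive plus negatives.
manual = -np.mean(positives[:, 0] - np.log(np.exp(logits).sum(axis=1)))
```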
Hi, thank you for the great paper! Is it possible to implement TS-TCC in the images? Would it be meaningful?
Dear authors,
Recently I have been studying your paper and code. However, even with your processed dataset, I cannot reproduce the results reported in your paper. Taking the HAR dataset as an example, I ran into the following issues.
I hope you can share your experiment settings and training logs to clear up my doubts.
The losses (loss, temp_cont_loss1, temp_cont_loss2) do not decrease during training on our dataset. How can we solve this? Thank you