Comments (6)

wq2012 commented on May 7, 2024

backward is the step that computes the gradients, so it is expected to be the most expensive step.

The cost of backward does not depend on which dataset is used, but on the size of the input.

You can try breaking the dataset into subsets and calling the fit function multiple times, as suggested in README.md.
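
As a rough illustration of the suggestion above, the sketch below computes contiguous index ranges for splitting one long training sequence into subsets. The helper name subset_bounds is hypothetical and not part of uis-rnn, and the commented usage assumes that model, train_sequence, train_cluster_id and training_args already exist, as in the uis-rnn demo. Cutting at utterance boundaries instead of at arbitrary indices may be preferable; this sketch does not attempt that.

```python
def subset_bounds(num_steps, num_subsets):
    """Return (start, end) index pairs that cover range(num_steps) in order."""
    if num_subsets < 1 or num_subsets > num_steps:
        raise ValueError("num_subsets must be between 1 and num_steps")
    base, extra = divmod(num_steps, num_subsets)
    bounds = []
    start = 0
    for i in range(num_subsets):
        # The first `extra` subsets get one additional time step.
        end = start + base + (1 if i < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


# Hypothetical usage (assumption: `model`, `train_sequence` of shape
# (time steps, feature dim), `train_cluster_id` and `training_args` exist):
#
# for start, end in subset_bounds(len(train_sequence), 4):
#     model.fit(train_sequence[start:end],
#               train_cluster_id[start:end],
#               training_args)

print(subset_bounds(10, 3))  # [(0, 4), (4, 7), (7, 10)]
```

Each call to fit then sees a shorter input, which is what drives the cost of backward.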

from uis-rnn.

hcfeng201 commented on May 7, 2024

Sir,
I split the dataset into 2 subsets of 41000 elements each, which is smaller than the train_sequence you provided (47350). Each iteration still takes 8 seconds, whereas with the data you provided it takes less than a second.

wq2012 commented on May 7, 2024

Is 41000 the number of time steps, or the feature dimension? If the former, what is your feature dimension?

Also, did you normalize the features before you feed them into uisrnn? If not, what's the range of your features?
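
Both questions can be answered with a quick check on the array itself. The sketch below is only an illustration: describe_features and is_l2_normalized are hypothetical helper names, not part of uis-rnn, and the tolerance is an arbitrary assumption. It is written without dependencies so it accepts nested lists as well as a 2-D numpy array; on a large array, vectorized numpy operations would be much faster.

```python
import math


def describe_features(sequence):
    """Summarize a (time steps x feature dim) sequence given as nested rows."""
    rows = [list(map(float, row)) for row in sequence]
    if not rows:
        raise ValueError("sequence is empty")
    dims = {len(row) for row in rows}
    if len(dims) != 1:
        raise ValueError("rows have different lengths")
    norms = [math.sqrt(sum(x * x for x in row)) for row in rows]
    return {
        "num_time_steps": len(rows),
        "feature_dim": dims.pop(),
        "min_value": min(min(row) for row in rows),
        "max_value": max(max(row) for row in rows),
        "min_l2_norm": min(norms),
        "max_l2_norm": max(norms),
    }


def is_l2_normalized(sequence, tol=1e-3):
    """Return True if every row has an L2 norm within `tol` of 1."""
    stats = describe_features(sequence)
    return (abs(stats["min_l2_norm"] - 1.0) <= tol
            and abs(stats["max_l2_norm"] - 1.0) <= tol)


# Tiny made-up example: three time steps, two features each.
example = [[0.6, 0.8], [1.0, 0.0], [0.0, -1.0]]
print(describe_features(example))
print(is_l2_normalized(example))
```

For a sequence of shape (41000, 256), num_time_steps would be 41000 and feature_dim would be 256.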

hcfeng201 commented on May 7, 2024

41000 is the first dimension of a 2-D numpy array of shape (41000, 256), laid out like your data (47350, 256), so it is the number of time steps and the feature dimension is 256. By normalization, do you mean "The embedding vector (d-vector) is defined as the L2 normalization of the network output"? I extracted the d-vectors with "PyTorch_Speaker_Verification", so I think the normalization is already done.

wq2012 commented on May 7, 2024

You will need to discuss this with the author of PyTorch_Speaker_Verification.

We are not responsible for the correctness or any issue of third-party libraries.

Aurora11111 commented on May 7, 2024

@hcfeng201
You should change the embedding creation demo:

# VAD_chunk, concat_segs, get_STFTs, align_embeddings and embedder_net come
# from PyTorch_Speaker_Verification; folder, label, train_sequence and
# train_cluster_id are defined earlier in the script.
import os

import numpy as np
import torch

for file in os.listdir(folder):
    if file[-4:] == '.wav':
        # subprocess.call(['ffmpeg', '-i', 'file', file[-4:]+'.wav'])
        print(folder + '/' + file)
        times, segs = VAD_chunk(2, folder + '/' + file)
        print("times" * 10, times)
        print("segs" * 10)

        if segs == []:
            print('No voice activity detected')
            continue
        concat_seg = concat_segs(times, segs)
        STFT_frames = get_STFTs(concat_seg)
        STFT_frames = np.stack(STFT_frames, axis=2)
        STFT_frames = torch.tensor(np.transpose(STFT_frames, axes=(2, 1, 0)))
        embeddings = embedder_net(STFT_frames)
        # print(embeddings)
        aligned_embeddings = align_embeddings(embeddings.detach().numpy())
        train_sequence.append(aligned_embeddings)
        # One cluster ID per aligned embedding; every .wav file gets its own label.
        for embedding in aligned_embeddings:
            train_cluster_id.append(str(label))
        label += 1

# Concatenate once, after all files have been processed.
test_sequence = np.concatenate(train_sequence, axis=0)
test_cluster_id = np.asarray(train_cluster_id)

np.save('test_sequence', test_sequence)
np.save('test_cluster_id', test_cluster_id)
print("%" * 100)
print(test_sequence.shape, type(test_sequence))

Then change the uis-rnn test demo:

# model_args, training_args, inference_args and SAVED_MODEL_NAME are defined
# as in the uis-rnn demo script. The .npy files saved by the embedding script
# must be placed in ./data/.
import numpy as np
import uisrnn

test_sequence = np.load('./data/test_sequence.npy')
test_cluster_id = np.load('./data/test_cluster_id.npy')

model = uisrnn.UISRNN(model_args)
model.load(SAVED_MODEL_NAME)

# Testing: the whole file is treated as one test sequence, so the
# per-sequence loop of the demo stays commented out.
predicted_labels = []
test_record = []
print("%" * 100)
print(test_sequence.shape, type(test_sequence))
print(test_cluster_id, type(test_cluster_id))
# for (test_sequence, test_cluster_id) in zip(test_sequences, test_cluster_ids):
predicted_label = model.predict(test_sequence, inference_args)
predicted_labels.append(predicted_label)
accuracy = uisrnn.compute_sequence_match_accuracy(
    list(test_cluster_id), predicted_label)
test_record.append((accuracy, len(test_cluster_id)))
print('Ground truth labels:')
print(test_cluster_id)
print('Predicted labels:')
print(predicted_label)
print('-' * 80)
output_string = uisrnn.output_result(model_args, training_args, test_record)
print('Finished diarization experiment')
print(output_string)
