
cotk's Issues

[Feature] Report system

Write a script that pushes results to the dashboard.

Command:
```
cotk-report [--result result.json] [--only-upload] [--entry main] [other parameters]
```
result: the file containing the test results.
only-upload: push the results without running the model.
entry: the entry point of the model.

In only-upload mode, the result should be comparable.
In full mode, the result is reproducible.

Provide a list of APIs for the dashboard.
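
A minimal sketch of how the command line above could be parsed, assuming Python's argparse. The function names `build_parser` and `parse` are hypothetical, and the defaults (`result.json`, `main`) are only taken from the usage line, not from an actual implementation:

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags named in the usage line."""
    parser = argparse.ArgumentParser(prog="cotk-report")
    # Defaults are assumptions copied from the usage line above.
    parser.add_argument("--result", default="result.json",
                        help="file containing the test results")
    parser.add_argument("--only-upload", action="store_true",
                        help="push the results without running the model")
    parser.add_argument("--entry", default="main",
                        help="entry point of the model")
    return parser


def parse(argv):
    # parse_known_args keeps unrecognised arguments ("other parameters")
    # so that they can be forwarded to the model.
    args, model_args = build_parser().parse_known_args(argv)
    return args, model_args
```

Here `parse(["--only-upload", "--batch_size", "32"])` would leave `--batch_size 32` in `model_args` for the model to consume.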

[Model] SeqGAN

Refer to SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient

[BUG] fix hred test

Describe the bug
The HRED test is wrong.

Why is the number of turns in the generated sentences greater than the number of turns in the reference?

[Enhancement] Adapt test for metric using allvocabs

Description:

The dataloader now has new attributes: valid vocabs and invalid vocabs.
Valid vocabs are the vocabulary used by models.
All vocabs (== valid vocabs + invalid vocabs) are the vocabulary used by metrics.
If a word is not in all vocabs, it is an unknown word, which is ignored by metrics.

The metric unit tests must be adapted for the new metrics.

Requirements:

  • Pull invalid_vocab branch
  • FakeDataloader should have new attributes like all_vocab_size, ...
  • Bleu & Recorder metrics have to use all vocabs
  • Perplexity uses a smoothing algorithm (see the code in PerlplexityMetric for reference):
    • If the model predicts a word in valid vocabs, perplexity is calculated as before
    • If the model predicts UNK, the probability is divided evenly among the invalid vocabs
    • If the reference is UNK, the word is ignored.
    So, you have to write tests for the new PerplexityMetric and MultiturnPerplexityMetric.
    Try to cover the 3 conditions above.
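
The three perplexity conditions above can be sketched as follows. This is only an illustration under stated assumptions, not the code in the metric class: the function name `smoothed_perplexity`, the id layout (ids below `valid_size` are valid vocabs, ids from `valid_size` up to `all_size` are invalid vocabs) and the default `unk_id` are all hypothetical.

```python
import math


def smoothed_perplexity(reference, log_probs, valid_size, all_size, unk_id=1):
    """Perplexity with UNK smoothing (illustrative sketch).

    reference: word ids indexed in all vocabs; unknown words appear as unk_id.
    log_probs: one list per position with natural-log probabilities over
               valid vocabs (length valid_size).
    """
    invalid_size = all_size - valid_size
    total, count = 0.0, 0
    for word, dist in zip(reference, log_probs):
        if word == unk_id:
            # Condition 3: the reference is UNK, so the word is ignored.
            continue
        if word < valid_size:
            # Condition 1: a valid word, use its probability unchanged.
            total += dist[word]
        else:
            # Condition 2: an invalid word, so the UNK probability is
            # divided evenly among the invalid vocabs.
            total += dist[unk_id] - math.log(invalid_size)
        count += 1
    if count == 0:
        raise ValueError("no scorable words in the reference")
    return math.exp(-total / count)
```

A test that covers all 3 conditions could use a reference containing one valid word, one invalid word and one UNK, with a uniform distribution so that the expected value is easy to compute by hand.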

[Models] CopyNet

Refer to Incorporating Copying Mechanism in Sequence-to-Sequence Learning.

[Model] HRED

Refer to Building end-to-end dialogue systems using generative hierarchical neural network models

[Model] CVAE

Refer to Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders

[Enhancement] gather download links of data

Gather the download links of the data and create a 'dataset_config.json' in ./contk/dataloader:

{
"MSCOCO": "https://XXXX"
}

It is best to reference the original link; the data can use gzip or another compressed format.
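
A minimal sketch of how such a config could be read, assuming it is a flat JSON object mapping dataset names to links as in the example above. The helper names `load_dataset_links` and `get_download_link` are hypothetical:

```python
import json


def load_dataset_links(config_text):
    """Parse the JSON text of a dataset config into a name -> link dict."""
    links = json.loads(config_text)
    if not isinstance(links, dict):
        raise ValueError("dataset config must be a JSON object")
    for name, url in links.items():
        if not isinstance(url, str):
            raise ValueError(f"link for {name!r} must be a string")
    return links


def get_download_link(links, name):
    """Look up the download link registered for a dataset name."""
    try:
        return links[name]
    except KeyError:
        raise KeyError(f"no download link registered for dataset {name!r}") from None
```

In practice `config_text` would be the contents of 'dataset_config.json'; parsing from a string keeps the sketch independent of the file layout.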

[Enhancement] Vocab List in Dataloader

To implement #8 (CopyNet), the dataloader should change its behaviour.

As we see it, there should be 3 vocab lists:

  • For model training, the smallest. It only includes words from the train set. Call it set $V.
  • For metrics, bigger. The model will be evaluated on this vocab list, which includes words from the train set and the test set. Call it set $M. Almost all models can't generate words from $M-$V, because they haven't seen them. However, CopyNet can generate words from $M-$V through the copy mechanism, so it is necessary to take these words into account when we implement metrics. For some models, $M-$V is expressed as the UNK token; the dataloader has to translate it into a uniform distribution over $M-$V.
  • The whole space of words, including those not seen anywhere in the data. Call it set $N. We don't care about the words in $N-$M and ignore them when evaluating models, as in #37. $N-$M is the TRUE UNK.

Requirements:

  • Change the behaviour of the dataloader and the metrics.
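
The three sets can be illustrated with a small sketch. It assumes that $V is exactly the set of train words and $M the union of train and test words, without any frequency threshold; the function names and the labels "valid", "invalid" and "unknown" are hypothetical:

```python
def build_vocab_sets(train_words, test_words):
    """Return (V, M): the model vocab and the metric vocab."""
    V = set(train_words)       # words the model is trained on
    M = V | set(test_words)    # words the metrics evaluate on
    return V, M


def classify(word, V, M):
    """Tell which region of the word space a word falls into."""
    if word in V:
        return "valid"      # in $V: the model can generate it directly
    if word in M:
        return "invalid"    # in $M-$V: only reachable via UNK or copying
    return "unknown"        # in $N-$M: the TRUE UNK, ignored by metrics
```

For example, with train words `["a", "b"]` and test words `["b", "c"]`, the word `"c"` falls into $M-$V and any other unseen word into $N-$M.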

[Maintenance] Refactor dataloader of SwitchBoard

  • _build_vocab has to use multi_ref data
  • rename to inference metric; the embedding should have a default implementation (use word vectors from GloVe)
  • add unit tests for the unique features of SwitchBoard

  • add a hash value

[Enhancement] Make unit test for models

Requirement

  • Run the model tests only in CPU mode
  • Just check the arguments and the connection with the main library
  • No need to check performance
  • Make the tests standalone, because they may need packages like TensorFlow or PyTorch.

[Feature] Use a stable link on github for data

Users may use the same id to download the same data from different sources:

like "glove": default, download from GitHub
"glove~github": explicitly download from GitHub
"glove~tsinghua": explicitly download from coai.tsinghua

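A minimal sketch of how such an id could be split into a resource name and a source, assuming `~` is the only separator and that a bare id falls back to the GitHub source. The function name `parse_resource_id` and the constant `DEFAULT_SOURCE` are hypothetical:

```python
DEFAULT_SOURCE = "github"  # assumption: a bare id downloads from GitHub


def parse_resource_id(resource_id):
    """Split an id like "glove~tsinghua" into (name, source)."""
    name, sep, source = resource_id.partition("~")
    if not name:
        raise ValueError("resource id must start with a resource name")
    if sep and not source:
        raise ValueError("source is missing after '~'")
    return name, (source if sep else DEFAULT_SOURCE)
```

With this sketch, "glove" and "glove~github" resolve to the same pair, so both ids would fetch the same data.
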
[BUG] bleu will crash

Describe the bug
BleuMetric crashes when len(hypothesis) == 1?
Possibly because of smoothingFunction?

It's an upstream bug; just add a comment and leave it as is.

To Reproduce

checked

[Enhancement] Metrics check whether models use the same data

Problems

It may be hard to ensure that 2 models are evaluated on the same test data in the same way.
So it is important for the metrics to be able to tell which data is used.

Proposal A

Bind the metrics to the dataloader. Data must be processed in the same order.

Drawback:

  • data must be in the same order

Proposal B

Compute a hash value of the data. It can tell whether the data differ.

Drawback:

  • hard to find bugs
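
Proposal B could look roughly like the sketch below. It assumes each sample is a list of string tokens and uses SHA-256 from Python's hashlib; making the hash independent of sample order is a design choice of this sketch, not something the proposal specifies, and the function names are hypothetical:

```python
import hashlib


def sample_digest(tokens):
    """Hash one sample (a list of string tokens)."""
    h = hashlib.sha256()
    for tok in tokens:
        data = tok.encode("utf-8")
        # Length-prefix each token so that ["ab", "c"] and ["a", "bc"]
        # do not produce the same byte stream.
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    return h.digest()


def dataset_hash(samples):
    """Hash a whole dataset; equal hashes suggest the same data was used."""
    h = hashlib.sha256()
    # Sorting the per-sample digests makes the result independent of the
    # order in which samples were processed.
    for digest in sorted(sample_digest(s) for s in samples):
        h.update(digest)
    return h.hexdigest()
```

Two runs can then compare their `dataset_hash` values. A mismatch shows that the data differ but not where, which matches the drawback noted above.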

[BUG] typo in metric.py

Describe the bug
PerlplexityMetric -> PerplexityMetric

Move ./tests/dataloader/test_metric to ./tests/metric/test_metric

[Model] LSTM language modelling

Write a model for the Language Generation Dataloader, in either TensorFlow or PyTorch.

If you write it in TensorFlow, please use a newer version such as 1.13.

Tests are required.
