Comments (10)
Hi @GeorgeS2019
You could use Seq2SeqSharp to train GPT-x models only if you have a training data set for it. They are all Transformer-based models, and the data set is masked text.
from seq2seqsharp.
Hi @axel578 ,
In the demo and release package, the SNT files are data sets for training and test, not vocab files.
A vocab file can either be generated from an SNT file, or external files can be used as vocab files.
In a vocab file, there is one token per line, and each line has two parts: [token] \t [weight]
[weight] can be any value you want; Seq2SeqSharp doesn't use these [weight] values for now.
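To make the format concrete, here is a toy sketch (not from the project) that parses lines in the `[token]\t[weight]` shape described above; the tokens and weights are made up for illustration:

```python
def parse_vocab(lines):
    """Parse '[token]\\t[weight]' lines into a dict.
    Seq2SeqSharp ignores the weights, so we just keep them as floats."""
    vocab = {}
    for line in lines:
        token, weight = line.rstrip("\n").split("\t")
        vocab[token] = float(weight)
    return vocab

# Illustrative vocab file contents: one token per line, tab-separated weight.
sample = ["the\t1000\n", "cat\t42\n", "</s>\t1\n"]
vocab = parse_vocab(sample)
print(vocab["cat"])  # 42.0
```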
Thanks
Zhongkai Fu
from seq2seqsharp.
Thanks for the answer !
I'd like to know what the two src and target models (enuSpm.model) are in fiction text generation.
.\bin\Seq2SeqConsole\Seq2SeqConsole.exe -Task Test -ModelFilePath .\model\seq2seq_fiction.model -InputTestFile .\data\test\test_fiction.txt -OutputPromptFile .\data\test\test_fiction.txt -OutputFile out_fiction.txt -MaxTestSrcSentLength 256 -MaxTestTgtSentLength 512 -ProcessorType CPU -SrcSentencePieceModelPath .\spm\enuSpm.model -TgtSentencePieceModelPath .\spm\enuSpm.model -BeamSearchSize 1 -DeviceIds 0,1,2,3 -DecodingStrategy Sampling -DecodingRepeatPenalty 10
from seq2seqsharp.
For this command line, "test_fiction.txt" is the input file. It's used as input for the encoder and as the prompt for the decoder. "out_fiction.txt" is the output file generated by the decoder.
from seq2seqsharp.
Different Text Generation Strategies: ArgMax, Beam Search, Top-P Sampling
Just curious, how similar or dissimilar is this language generation implementation to e.g. GPT-x?
from seq2seqsharp.
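As a rough illustration of two of the decoding strategies named above, here is a toy sketch over a hand-made next-token distribution (the tokens and probabilities are invented; this is not Seq2SeqSharp's implementation):

```python
# Hypothetical next-token probability distribution for illustration.
probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "xyzzy": 0.05}

def argmax_decode(probs):
    """ArgMax: always pick the single most likely token."""
    return max(probs, key=probs.get)

def top_p_filter(probs, p=0.9):
    """Top-P (nucleus) sampling keeps the smallest set of tokens whose
    cumulative probability reaches p, then samples from that set."""
    kept, total = {}, 0.0
    for tok, pr in sorted(probs.items(), key=lambda kv: -kv[1]):
        kept[tok] = pr
        total += pr
        if total >= p:
            break
    z = sum(kept.values())
    return {tok: pr / z for tok, pr in kept.items()}  # renormalize

print(argmax_decode(probs))         # the
print(sorted(top_p_filter(probs)))  # ['a', 'cat', 'the']
```

ArgMax is deterministic and tends to repeat itself; top-p sampling trades some determinism for diversity by cutting off the unlikely tail ("xyzzy" above) before sampling.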
Yes, but what is .\spm\enuSpm.model for? Is it for the vocabulary? In my scenario, the vocabulary is a bunch of code, different from the usual vocabulary.
from seq2seqsharp.
enuSpm.model is a SentencePiece model used to encode/decode subword-level tokens. Seq2SeqSharp can directly call the SentencePiece APIs for subword-level encoding and decoding. SentencePiece has its own subword-level vocabulary, which is different from your vocabulary.
I don't think you need to care about it, because with the parameters "-SrcSentencePieceModelPath" and "-TgtSentencePieceModelPath", Seq2SeqSharp can automatically encode words in your vocabulary to subwords in the model vocabulary, and decode subwords back to words. With these two parameters, if you don't have a subword-level vocabulary, you can set "SrcVocab" and "TgtVocab" to empty and ask Seq2SeqSharp to generate the vocabulary from the training set. For inference, the model itself already includes the vocabulary.
from seq2seqsharp.
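To illustrate the idea of subword encoding mentioned above, here is a toy greedy longest-match tokenizer. It is a stand-in for illustration only; SentencePiece's actual algorithms (BPE/unigram) and the vocabulary in enuSpm.model are different, and the subword set below is invented:

```python
# Made-up subword vocabulary for illustration only.
SUBWORDS = {"un", "break", "able", "a", "b", "l", "e"}

def to_subwords(word):
    """Greedily split a word into the longest known subwords,
    falling back to single characters for unknown spans."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try longest match first
            if word[i:j] in SUBWORDS:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append(word[i])  # unknown char passes through
            i += 1
    return pieces

print(to_subwords("unbreakable"))  # ['un', 'break', 'able']
```

The point is that a word outside your word-level vocabulary can still be represented as a sequence of known subwords, which is why the subword model's vocabulary does not need to match yours.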
You could use Seq2SeqSharp to train GPT-x models only if you have training data set for it
Is importing trained weights from e.g. a GPT-2 .onnx file into a Seq2SeqSharp model, to avoid training, still part of the long-term plan?
from seq2seqsharp.
Yes... it's still a long-term plan, but I don't have a specific timeline for it.
I actually chatted with the ONNX Runtime team last year, and operator translation between Seq2SeqSharp and ONNX is pretty straightforward. But it's not an urgent task for Seq2SeqSharp right now, because Seq2SeqSharp already supports large-model training and fine-tuning for my daily work, and my work is not based on GPT-X models.
Thanks
Zhongkai Fu
from seq2seqsharp.
I know there is a need for modular and reusable components, but it's not a high priority or urgent for me right now. That's why I say it's a long-term plan.
from seq2seqsharp.
Related Issues (20)
- Didn't save the model? HOT 7
- Error: C# 8.0 language feature HOT 1
- sentencepiece.dll problem in the API HOT 2
- SeqClassification Validation HOT 16
- Exception: 'The weight '.LayerNorm' has been released, you cannot access it.' HOT 10
- CPU_MKL Error converting value "CPU_MKL" to type 'Seq2SeqSharp.ProcessorTypeEnums HOT 6
- sqc.m_srcEmbedding_p.GetNetworkOnDevice(k).GetWeightAt() HOT 1
- GPTconsole HOT 4
- Target vocabulary size fixed to 45000 HOT 5
- Contextual embeddings HOT 22
- Train with general sequences of symbols HOT 2
- Moment of updating weights HOT 4
- Issues to get started with "Seq2SeqClassificationConsole" HOT 33
- Matrix initialization method HOT 4
- No requirement.txt in this repo HOT 2
- Sudden high increase in memory consumption while training a seq2seq model and validation happens HOT 11
- Setting FocalLossGamma = 2 causes weight corruption in the beginning of the seq2seq model training
- The checkpoint to save the model regularly should not depend on validation HOT 1
- Serialization of Seq2seq model is wrong
- SeqLabel model backward compatibility is broken by latest update HOT 3