nilboy / gaic_track3_pair_sim
Global AI Technology Innovation Competition, Track 3: champion solution
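Hello, could you share the test file gaiic_track3_round1_testB_20210317.tsv? Thank you very much!
Also, a few things puzzled me while reading your code. Following docker run, the flow is:
run.sh -> run_inner_2.sh -> pipeline/pipeline_d.py -> process_data_s1.sh, which then runs the following two .py scripts: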
convert_data.py --n_splits=8
process_oov_data.py
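convert_data: extracts a character table from train.tsv, saving character-to-frequency as normal_vocab.json and character-to-index as idmap.json; it then uses these two tables to convert train.tsv and test.tsv to id form and saves them.
convert_data.py: here the construct_vocab function builds another vocab.json (different from idmap.json), and the convert_record_style function then uses vocab.json to turn the previously saved train.tsv and test.tsv (both already converted to ids with idmap.json) back into text. The result looks like garbled text. What puzzles me is why a different vocabulary is used for this conversion. What is the reason for doing it this way?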
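For readers following the question, the two steps it describes can be sketched roughly as below. This is only an illustration of the described behaviour, not the repository's code: build_vocab, encode, decode and the toy data are hypothetical names and values, and the frequency-ordered indexing is an assumption.

```python
from collections import Counter


def build_vocab(lines):
    """Return (char -> frequency, char -> index) built from the given lines.

    Assumption: indices follow descending frequency; ties are broken by
    character so the mapping is deterministic.
    """
    freq = Counter(ch for line in lines for ch in line if not ch.isspace())
    ordered = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    idmap = {ch: i for i, (ch, _) in enumerate(ordered)}
    return dict(freq), idmap


def encode(line, idmap):
    """Convert a line of text to a list of ids using idmap."""
    return [idmap[ch] for ch in line if not ch.isspace()]


def decode(ids, vocab_chars):
    """Map ids back to characters using a (possibly different) vocabulary list."""
    return "".join(vocab_chars[i] for i in ids)


# Toy stand-in for train.tsv content (hypothetical data).
train = ["abc", "ab", "a"]
normal_vocab, idmap = build_vocab(train)
ids = encode("abc", idmap)

# Decoding with a different vocabulary yields text unrelated to the input,
# which is the "garbled" effect described in the question.
other_vocab = ["的", "是", "了"]
remapped = decode(ids, other_vocab)
```

Decoding the same ids with a table other than the one used for encoding cannot recover the original characters, which matches what the question observes; why the repository does this is not stated in the thread.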
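Thank you for open-sourcing the solution. I have learned a lot from reading and running the code, and I have two questions I would like to ask:
If you still remember the relevant details, I would appreciate your guidance. Thanks again.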
A question: how are the multiple models ensembled? I could not quite follow the code.
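In train.sh of plan 1, M models are pretrained first, then M*K k-fold classification models are trained, and these classification models are used to label the k-fold data, giving training data A with classification soft labels. Next, with the ensemble model, k-fold regression models are trained on training data A and used to label the k-fold data again, giving training data B with regression soft labels. Finally, again with the ensemble model, a regression model is trained on the full training data B.
Why are soft labels generated twice? Would it work to train the regression model just once, directly on the soft labels predicted by the classification models?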
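The data flow described in this question can be sketched as two rounds of out-of-fold labelling followed by a full-data fit. This is a minimal sketch under strong assumptions, not the repository's train.sh: a 1-D least-squares fit stands in for both the classification and the regression models, the M pretrained models are collapsed to one model per fold, and kfold_indices, fit_linear and oof_soft_labels are hypothetical helpers.

```python
def kfold_indices(n, k):
    """Split range(n) into k contiguous folds; return (train_idx, val_idx) pairs."""
    folds = [list(range(i * n // k, (i + 1) * n // k)) for i in range(k)]
    return [
        ([j for f in folds[:i] + folds[i + 1:] for j in f], folds[i])
        for i in range(k)
    ]


def fit_linear(xs, ys):
    """Closed-form 1-D least squares; a stand-in for a real model."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    return lambda x: my + slope * (x - mx)


def oof_soft_labels(xs, targets, k):
    """Label every sample with a model trained on the other k-1 folds."""
    soft = [0.0] * len(xs)
    for train_idx, val_idx in kfold_indices(len(xs), k):
        model = fit_linear([xs[i] for i in train_idx],
                           [targets[i] for i in train_idx])
        for i in val_idx:
            soft[i] = model(xs[i])
    return soft


# Toy data (hypothetical): one feature and hard 0/1 labels.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
hard = [0, 0, 0, 1, 0, 1, 1, 1]

soft_a = oof_soft_labels(xs, hard, k=4)    # round 1: training data A
soft_b = oof_soft_labels(xs, soft_a, k=4)  # round 2: training data B
final_model = fit_linear(xs, soft_b)       # full-data model trained on B
```

In this sketch each soft label comes from a model that never saw that sample during training. Whether the second round helps in the actual solution is the open question of this thread and is not answered by the sketch.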
The data can no longer be downloaded from the Tianchi website. Could you provide a link to the competition data?