flaginstruct's Issues

Great work! Just a few questions regarding the next step

Hi, I just read the paper and really admire its ambition. I have a few quick questions about it:

  1. In the abstract, the paper claims "it is still under-explored whether English-based foundation LLMs can perform similarly on multilingual tasks compared to English tasks with well-designed instruction tuning and how we can construct the corpora needed for the tuning." However, I did not see the paper define metrics for evaluating the models, or conduct any ablation study to demonstrate the effectiveness of the current data collection mechanism. Are these on the agenda for the next step?
  2. There is a recent paper from Microsoft, "AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models", that suggests using standard exams to evaluate the performance of LLMs. Although the authors have released some college entrance exam questions, these are far from exhaustive. Is open-sourcing the raw exam materials for the research community an option here? Or would it be interesting for BAAI to host a CLUE-style LLM leaderboard consisting of exam questions to evaluate LLM performance?

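Some data entries have missing content

Thanks for open-sourcing this. I found that some entries have missing content (e.g. the answer options are not given), for example: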
{"textbox_q_instruction": "请选择正确的选项", "textbox_q_context": "智齿是人的第三大臼齿,用于切割食物。 现代人的饮食比古人的软,也更容易咀嚼,颚部因此变小,智齿往往长不出来。 有科学家认为,智齿将随着时间的推移而最终消失", "textbox_question": "这一现象蕴含的哲学道理是", "textbox_answer": "A", "textbox_answer_analysis": "A材料中讲“智齿将随着时间的推移而最终消失”,体现了量变达到一定程度必然会引起质变,故符合题意,当选; B表述错误,联系具有普遍性,但事物之间的联系是由条件的,并不是任何两个事物之间都是相互联系的,故排除; C表述错误,物质的唯一特性是客观实在性,而不是运动; D表述错误,事物的内部矛盾才对事物的发展起决定作用,外部矛盾是事物发展的外因,外因是事物发展的条件,但不起决定作用,故排除。", "subject": "政治"}

I hope this can be fixed in an update.
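In case it helps with triage, here is a rough sketch of how such entries might be flagged automatically. It is only a heuristic, not the project's own tooling: it assumes the data is stored as JSON Lines with the field names seen in the sample record above, and it assumes that answer options, when present, appear in the question text as a letter A-D followed by punctuation. The function names `is_missing_options` and `find_incomplete` are made up for this illustration.

```python
import json
import re

# Fields assumed to carry the question text, taken from the sample record above.
TEXT_FIELDS = ("textbox_q_instruction", "textbox_q_context", "textbox_question")

# Assumed option markers: a letter A-D followed by ".", "、", ":", ":", ")" or ")".
OPTION_MARKER = re.compile(r"[A-D]\s*[.、::))]")


def is_missing_options(record: dict) -> bool:
    """Return True if the answer is a choice letter but no options appear in the text."""
    answer = str(record.get("textbox_answer", "")).strip()
    if not re.fullmatch(r"[A-D]+", answer):
        return False  # not a multiple-choice answer, so nothing to check
    text = " ".join(str(record.get(field, "")) for field in TEXT_FIELDS)
    labels = {match.group(0)[0] for match in OPTION_MARKER.finditer(text)}
    return len(labels) < 2  # fewer than two distinct option labels found


def find_incomplete(lines):
    """Yield (line_number, record) for JSON Lines entries that look incomplete."""
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if is_missing_options(record):
            yield number, record
```

Passing an open file handle to `find_incomplete` would list the suspicious line numbers. The marker pattern will likely need adjusting if the dataset formats options differently.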
