
Comments (4)

fmikaelian commented on May 24, 2024

The function read_squad_examples() takes a JSON file as input. The file content is converted to a list of dicts in SQuAD format with json.load(reader)["data"].
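For reference, a minimal sketch of that loading step. The file content is a made-up one-paragraph example in SQuAD v1.1 layout, and `load_squad_data` is a hypothetical helper name used only for illustration, not the actual reader function.

```python
import json
import os
import tempfile


def load_squad_data(path):
    """Return the list of article dicts stored under the top-level "data" key."""
    with open(path, "r", encoding="utf-8") as reader:
        return json.load(reader)["data"]


# Made-up example file in SQuAD layout (illustrative only).
sample = {
    "version": "1.1",
    "data": [
        {
            "title": "Example article",
            "paragraphs": [
                {
                    "context": "Paris is the capital of France.",
                    "qas": [
                        {
                            "id": "q1",
                            "question": "What is the capital of France?",
                            "answers": [{"text": "Paris", "answer_start": 0}],
                        }
                    ],
                }
            ],
        }
    ],
}

# Write the sample to a temporary file, then load it back the same way.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.json")
    with open(path, "w", encoding="utf-8") as writer:
        json.dump(sample, writer)
    articles = load_squad_data(path)

print(type(articles).__name__, len(articles))
print(articles[0]["paragraphs"][0]["qas"][0]["question"])
```

So anything that produces the same list of dicts in memory can stand in for the JSON file.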

We could therefore do the following at prediction time:

  • Get user question
  • Run document retriever based on question
  • Generate this list of dicts in SQuAD format, with the same question asked on every paragraph and an empty qas key.
  • Run the document reader on this list of dicts
  • Sort predictions
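The steps above could be wired together roughly as follows. This is only a sketch: `answer_question` and the three `toy_*` functions are hypothetical stand-ins, and it assumes the reader can return a comparable score next to each answer, which is not something the current reader output is known to provide.

```python
def answer_question(question, retrieve, build_examples, read, top_n=3):
    """Run the prediction-time steps with caller-supplied components.

    retrieve(question) -> article indices                (hypothetical)
    build_examples(question, indices) -> list of dicts   (hypothetical)
    read(examples) -> list of (answer, score) pairs      (hypothetical)
    """
    article_indices = retrieve(question)
    examples = build_examples(question, article_indices)
    scored_answers = read(examples)
    # Highest score first; assumes the reader exposes a comparable score.
    ranked = sorted(scored_answers, key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]


# Toy stand-ins so the sketch runs end to end.
def toy_retrieve(question):
    return [0, 2]


def toy_build_examples(question, article_indices):
    return [
        {"title": f"doc {i}",
         "paragraphs": [{"context": f"text of doc {i}", "qas": [], "question": question}]}
        for i in article_indices
    ]


def toy_read(examples):
    # Fake reader: one (answer, score) pair per example, later examples score higher.
    return [(example["title"], position + 1) for position, example in enumerate(examples)]


ranked = answer_question(
    "Who is the creator of Artificial Intelligence?",
    toy_retrieve,
    toy_build_examples,
    toy_read,
)
print(ranked)
```

Passing the components in as callables keeps the retriever and the reader swappable while the order of the steps stays fixed.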

from cdqa.

fmikaelian commented on May 24, 2024

Actually, the converter we built does something similar to this.


fmikaelian commented on May 24, 2024

import uuid

from tqdm import tqdm


def generate_squad_examples(question, article_indices, metadata):
    """Build SQuAD-like examples asking `question` on every paragraph of the selected articles."""
    squad_examples = []

    # Keep only the articles returned by the document retriever
    metadata_sliced = metadata.loc[article_indices]

    for index, row in tqdm(metadata_sliced.iterrows()):
        temp = {'title': row['title'],
                'paragraphs': []}

        for paragraph in row['paragraphs']:
            temp['paragraphs'].append({'context': paragraph,
                                       'qas': [],
                                       'question': question,
                                       'id': str(uuid.uuid1())})

        # Append each article once, after all of its paragraphs are collected
        squad_examples.append(temp)

    return squad_examples

Then we can call this in the example:

squad_examples = generate_squad_examples(question='Who is the creator of Artificial Intelligence?',
                               article_indices=article_indices,
                               metadata=df)

Outputs:

[{'title': 'Artificial Intelligence: more revolutionary than the Internet!',
  'paragraphs': [{'context': 'BNP Paribas launches the prototype AGORA, first online community for corporate clients',
    'qas': [],
    'question': 'Who is the creator of Artificial Intelligence?',
    'id': 'bec64330-3b40-11e9-8dad-0242ac110012'},
   {'context': 'Artificial Intelligence has progressed at lightning speed in recent years. Machines are now able to beat humans in Go matches, understand natural language, reason and learn. As a result, software and robots have something to offer in every field to make business more productive, profitable and innovative. Chronicle of a revolution foretold.',
    'qas': [],
    'question': 'Who is the creator of Artificial Intelligence?',
    'id': 'bec6701c-3b40-11e9-8dad-0242ac110012'},
   {'context': 'Artificial Intelligence refers to a set of technologies – machine learning, deep learning, language processing, etc. – that share one common feature in that they rely on a computer system capable of analyzing, understanding, learning and discovering connections between things, facts and events as well as manipulating concepts. It should come as no surprise that machines have acquired these extraordinary abilities. Just like flying cars, autonomous and hyper-intelligent humanoid robots have been a major part of science fiction for decades.',
    'qas': [],
    'question': 'Who is the creator of Artificial Intelligence?',
    'id': 'bec67102-3b40-11e9-8dad-0242ac110012'},
   {'context': '“Artificial Intelligence is a word that has been around for 60 years, but which ultimately refers to nothing more than software. Machines are very good at performing repetitive tasks and can help humans work more efficiently. But they cannot take their own initiatives and can only make progress by interacting with people”, explains Edouard d’Archimbaud, manager of the Data Science & Artificial Intelligence Lab at BNP Paribas CIB. ',
    'qas': [],
    'question': 'Who is the creator of Artificial Intelligence?',
    'id': 'bec67238-3b40-11e9-8dad-0242ac110012'}]},
 {'title': 'Sugiyama to lead Japan in France Fed Cup clash (AFP)',
  'paragraphs': [{'context': 'Machine learning, deep learning, artificial intelligence—Julien Dinh, Senior Research Lead at...',
    'qas': [],
    'question': 'Who is the creator of Artificial Intelligence?',
    'id': 'bec68e6c-3b40-11e9-8dad-0242ac110012'}]}]
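In the output above, question and id sit next to an empty qas list, whereas the context quoted further down in this thread carries them inside qas with an empty answers list. A minimal sketch of converting one shape into the other; `move_question_into_qas` is a hypothetical helper and the sample values are placeholders.

```python
import copy


def move_question_into_qas(squad_examples):
    """Return a copy where each paragraph's question and id live in a 'qas' entry."""
    converted = copy.deepcopy(squad_examples)  # leave the caller's list untouched
    for article in converted:
        for paragraph in article["paragraphs"]:
            question = paragraph.pop("question")
            example_id = paragraph.pop("id")
            paragraph["qas"] = [
                {"answers": [], "question": question, "id": example_id}
            ]
    return converted


# Placeholder data in the shape produced by generate_squad_examples().
examples = [
    {"title": "t",
     "paragraphs": [{"context": "c", "qas": [], "question": "q?", "id": "abc"}]}
]
converted = move_question_into_qas(examples)
print(converted[0]["paragraphs"][0])
```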


fmikaelian commented on May 24, 2024

Question: Who is the creator of Artificial Intelligence?

Predictions returned by predictions = model.predict(X=(test_examples, test_features)) are:

(OrderedDict([('2398202a-41b4-11e9-beaa-796013f1ec43', 'BNP Paribas'),
              ('239828b8-41b4-11e9-beaa-796013f1ec43',
               'Chronicle of a revolution'),
              ('2398294e-41b4-11e9-beaa-796013f1ec43',
               'machine learning, deep learning, language processing, etc.'),
              ('23983056-41b4-11e9-beaa-796013f1ec43', 'Edouard d’Archimbaud'),
              ('2398309c-41b4-11e9-beaa-796013f1ec43', 'AI'),
              ('239830e2-41b4-11e9-beaa-796013f1ec43', 'BNP Paribas'),
              ('23983128-41b4-11e9-beaa-796013f1ec43', 'Marvin Lee Minsky'),
              ('23983164-41b4-11e9-beaa-796013f1ec43',
               'Artificial Intelligence is in fact likely to surpass humans in performing tasks that require reasoning and learning.'),
              ('239831a0-41b4-11e9-beaa-796013f1ec43', 'Watson'),
              ('239831e6-41b4-11e9-beaa-796013f1ec43', 'Google'),
              ('2398322c-41b4-11e9-beaa-796013f1ec43', 'Accenture'),
              ('23983268-41b4-11e9-beaa-796013f1ec43', 'AI'),
              ('239832a4-41b4-11e9-beaa-796013f1ec43', 'Partnership on AI'),
              ('239832e0-41b4-11e9-beaa-796013f1ec43', 'BNP Paribas'),
              ('23983326-41b4-11e9-beaa-796013f1ec43', 'Edouard d’Archimbaud'),
              ('23983362-41b4-11e9-beaa-796013f1ec43', 'data scientists'),
              ('2398339e-41b4-11e9-beaa-796013f1ec43', 'Edouard d’Archimbaud'),
              ('239833e4-41b4-11e9-beaa-796013f1ec43',
               'AI system’s ability to learn “by example” or “by experience”.'),
              ('23983420-41b4-11e9-beaa-796013f1ec43',
               'Deep learning is a learning technology that uses artificial neural networks, which approximate human learning to process “raw data”.'),
              ('2398345c-41b4-11e9-beaa-796013f1ec43', 'Alan Turing'),
              ('23983498-41b4-11e9-beaa-796013f1ec43', 'TEDxParis'),
              ('239834d4-41b4-11e9-beaa-796013f1ec43', 'BNP Paribas'),
              ('23983510-41b4-11e9-beaa-796013f1ec43', 'BNP Paribas'),
              ('23983a60-41b4-11e9-beaa-796013f1ec43', 'change management'),
              ('23983ad8-41b4-11e9-beaa-796013f1ec43', 'BNP Paribas'),
              ('23983b1e-41b4-11e9-beaa-796013f1ec43', 'Julien Dinh'),
              ('23983f92-41b4-11e9-beaa-796013f1ec43', 'Julien Dinh')]),
 OrderedDict(),
 OrderedDict())

The ground truth is Marvin Lee Minsky, available in context 23983128-41b4-11e9-beaa-796013f1ec43:

{'context': 'One of the creators of Artificial Intelligence, Marvin Lee Minsky, notably defines it as “the construction of computer programs that engage in tasks that are, for now, more satisfactorily accomplished by humans because they require high-level mental processes”. ',
    'qas': [{'answers': [],
      'question': 'Who is the creator of Artificial Intelligence?',
      'id': '23983128-41b4-11e9-beaa-796013f1ec43'}]},

  • How do we get the best answer from the predictions (see #36)?
  • What is nbest_predictions.json (it is empty in my case)?
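To inspect results in the meantime, predictions can be joined back to their contexts by id. This is a sketch, assuming the first element of the returned tuple maps each id to its answer text (as the output above suggests) and that each paragraph stores its id inside qas. `attach_predictions` is a hypothetical helper, and the context string is shortened here.

```python
from collections import OrderedDict


def attach_predictions(squad_examples, predictions):
    """Return (id, predicted answer, context) triples, one per question."""
    answers_by_id = predictions[0]  # assumption: first element maps id -> answer text
    rows = []
    for article in squad_examples:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                # .get() yields None when the reader returned nothing for this id
                rows.append((qa["id"], answers_by_id.get(qa["id"]), paragraph["context"]))
    return rows


minsky_id = "23983128-41b4-11e9-beaa-796013f1ec43"
examples = [
    {"title": "Artificial Intelligence: more revolutionary than the Internet!",
     "paragraphs": [
         {"context": "One of the creators of Artificial Intelligence, Marvin Lee Minsky",  # shortened
          "qas": [{"answers": [],
                   "question": "Who is the creator of Artificial Intelligence?",
                   "id": minsky_id}]}
     ]}
]
predictions = (OrderedDict([(minsky_id, "Marvin Lee Minsky")]), OrderedDict(), OrderedDict())

rows = attach_predictions(examples, predictions)
for example_id, answer, context in rows:
    print(example_id, "|", answer, "|", context[:40])
```

This does not rank anything. It only makes it easier to read each predicted span next to the paragraph it came from.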
