
Comments (7)

guody5 commented on May 23, 2024

The bug has been fixed. The issue is caused by the latest PyTorch, in which / performs true (float) division on tensors. Change

prevK = bestScoresId / numWords

to

prevK = bestScoresId // numWords
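To see why the change matters, here is a minimal sketch (with made-up tensor values, not the actual beam-search state): in recent PyTorch, / on integer tensors performs true division and returns floats, which can no longer be used as beam indices, while // is floor division and keeps the integer dtype.

```python
import torch

# hypothetical flattened top-k ids from one beam search step
bestScoresId = torch.tensor([7, 12, 25])
numWords = 10

# recent PyTorch: '/' is true division, yielding a float tensor
float_result = bestScoresId / numWords
print(float_result.dtype)  # a floating dtype, unusable for indexing

# '//' is floor division and keeps the integer dtype,
# recovering which beam each flattened id came from
prevK = bestScoresId // numWords
print(prevK.tolist())
```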

from ea-vq-vae.

ontocord commented on May 23, 2024

Thank you @guody5. I'll check it out. If I understand the system properly, you are finding a vector that distinguishes the sense of an inference: "A goes to the bank => A catches a fish" is closer to vector X, vs. "A goes to the bank => A takes out some money", which is closer to vector Y. You then map background knowledge that is closer to vector X or Y to the inference. So given the input rules and the vector Y, you find a story snippet such as "John went to the teller" and then infer "X takes out some money". Because you are using a generator, the theory is that you will be able to generate new, more reasonable inferences given new types of rules not yet seen by the system? Is this the gist of the system?

My question is: why not just do clustering on context vectors? This would avoid the extra steps of computing the hidden vector X or Y, and you could just do a lookup by vector similarity ("A goes to the bank" (as in fish) matches "John went fishing"), then train the generator to infer "A catches a fish". This is similar to neural Q/A systems, and much simpler... Am I missing something?
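The lookup-by-similarity idea could be sketched like this (toy, hand-made vectors standing in for real context embeddings, and a hypothetical two-snippet store):

```python
import numpy as np

# hypothetical background-knowledge store: snippet text -> context vector
background = {
    "John went to the teller": np.array([0.9, 0.1, 0.0]),
    "John went fishing":       np.array([0.1, 0.9, 0.1]),
}

def nearest_snippet(query_vec):
    """Return the stored snippet whose vector has the highest cosine similarity."""
    best, best_sim = None, -1.0
    for text, vec in background.items():
        sim = float(vec @ query_vec) / (np.linalg.norm(vec) * np.linalg.norm(query_vec))
        if sim > best_sim:
            best, best_sim = text, sim
    return best

# a query vector near the "fishing" sense of "A goes to the bank"
print(nearest_snippet(np.array([0.05, 0.95, 0.05])))
```

The retrieved snippet (plus the original X) would then be fed to the generator as in retrieval-based Q/A systems.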


guody5 commented on May 23, 2024

Let me try to restate your meaning:

  1. In our method for training the generator, we first convert XY = "A goes to the bank => A catches a fish" into h_{xy}, then find the closest vector X. Finally, we find background knowledge according to X (by closest distance).
  2. In your method for training the generator, we directly convert "A goes to the bank => A catches a fish" into h_{xy}, then find background knowledge according to h_{xy} (by closest distance).

I am not sure whether I have understood you correctly.

The reason to use the VQ-VAE is that inferences are unseen in the inference phase (mentioned in Section 3.3.1). That is to say, we can only use "A goes to the bank" at inference time. In our method, we first convert X = "A goes to the bank" into h_x and find the top-k vectors in Equation 3, [X_1, X_2, X_3, ..., X_k], which help find background knowledge with different semantics (such as "John went to the teller" and "John went fishing") to infer from, because each vector X_i carries different semantics.
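As a toy sketch of that top-k step (with a random stand-in codebook and encoding, not the paper's trained ones): given h_x, rank the latent vectors by distance and keep the k closest.

```python
import torch

torch.manual_seed(0)
K, d, k = 8, 4, 3
Z = torch.randn(K, d)   # stand-in codebook of latent vectors X_1..X_K
h_x = torch.randn(d)    # stand-in encoding of "A goes to the bank"

# distances from h_x to every codebook entry, then keep the k closest;
# each selected X_i would retrieve semantically different background knowledge
dists = torch.cdist(h_x.unsqueeze(0), Z).squeeze(0)
topk = dists.topk(k, largest=False).indices
print(topk.tolist())
```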

For the clustering method, we would first convert "A goes to the bank" into h_x and then do a lookup by vector similarity. First, this is inconsistent between the training phase (h_{xy}) and the inference phase (h_x). Second, it is hard to find background knowledge with different semantics using only h_x, compared with [X_1, X_2, X_3, ..., X_k].


ontocord commented on May 23, 2024

Hi @guody5. Thank you for your succinct explanation. I understand that we don't want to feed in h_{xy} to find the background knowledge, because the task is: given h_x and a context vector, find background knowledge and infer Y. Sorry if I misunderstand.

If the goal is to find different reasonable inferences Y from ambiguous Xs, my proposed method is to find a contextual vector for X. So you would transform "A person goes to the bank" in context 1 to "A person goes to the river" (what I meant by "A goes to the bank (as in fish)" above), based on word sense disambiguation for low-frequency words, for example. This could be done neurally by running "A person goes to the bank => the person catches some fish" through a masked language model and taking the top-k fillers for the slot "bank" that don't contain the word "bank".

from transformers import AutoModelForMaskedLM, AutoTokenizer
import torch

model = AutoModelForMaskedLM.from_pretrained("distilroberta-base")
tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")
model = model.eval()
if torch.cuda.is_available():
    model = model.cuda().half()

input_txt = ["A person goes to the bank => A person catches a fish",
             "A person goes to the bank => A person takes out some money"]
masked_word = "bank"
all_outputs = []
with torch.no_grad():
    for txt in input_txt:
        # replace the low-frequency word with the model's mask token
        masked_txt = txt.replace(masked_word, tokenizer.mask_token)
        inputs = tokenizer(masked_txt, return_tensors='pt', add_special_tokens=True)
        input_ids = inputs.input_ids.to(model.device)
        logits = model(input_ids=input_ids, return_dict=True).logits[0]

        # locate the masked position and rank candidate fillers for it
        mask_pos = (input_ids[0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0].item()
        sorted_idx = logits[mask_pos].argsort(descending=True)

        output = []
        for k in range(10):
            predicted_token = tokenizer.convert_ids_to_tokens([sorted_idx[k].item()])[0].replace('Ġ', '')
            # skip predictions that just echo the original word
            if masked_word.lower() in predicted_token.lower():
                continue
            output.append(txt.split("=>")[0].replace(masked_word, predicted_token).strip())
        all_outputs.append(output)

print(all_outputs)

Produces: [['A person goes to the water', 'A person goes to the river', 'A person goes to the back', 'A person goes to the lake', 'A person goes to the corner'], ['A person goes to the branch', 'A person goes to the money', 'A person goes to the trust', 'A person goes to the account', 'A person goes to the check']]

Convert the above to context vectors h_x that are contextual to h_y, and match background knowledge to h_x. Given the background knowledge (and perhaps the X), train the generator to output "The person catches some fish". This is similar to Q/A systems that use background knowledge: "A person goes to the river" is basically the question, and the answer is "The person catches fish", using background knowledge. Using this paradigm, you can do other things explained in the literature for these types of Q/A systems, like mapping the background knowledge and question to (smaller) vectors that are close to each other for better retrieval, using clustering to refine the mapping, etc.

What I think would also be interesting from your work (and potentially the above) is generating a dataset of commonsense inferences from the background knowledge itself. So given your system, you could create a dataset such as "Tom went to the river" infers "A person goes to the bank => the person catches some fish", and then see what new inferences the system makes with new background knowledge, like "Jane went to the corner market" infers "A person goes to the store => the person buys some food."


guody5 commented on May 23, 2024

@ontocord Your method is great, and I think it's also a good way to solve the task. Thanks for the promising method and the code.


ontocord commented on May 23, 2024

Thanks for the input @guody5. I'll play around with this and report back.


guody5 commented on May 23, 2024

I will close this issue. If you have any problems, feel free to ask.

