Comments (3)
I ran into the same problem when using a different LLM. The issue you are seeing is related to equation (14) of the MEMIT paper: in my case, the "aggregate statistic" matrix contained rows/columns that were entirely zero, so it was singular and its inverse could not be computed.
I found two real solutions:

- Easy solution: do not train the layers whose matrices exhibit the problem. If you go to "hparams/MEMIT/EleutherAI_gpt-j-6B.json" you will see that the layers being trained are:

  "layers": [ 3, 4, 5, 6, 7, 8 ],

  Swap out the layers whose matrices cause the problem; if you look at the causal trace, you will see that you have some freedom in choosing between them.
- Hard solution: remove the rows/columns that are full of zeros, compute the inverse of the reduced matrix, and then add the zero rows/columns back. Note that this is not a true inverse, since some columns are zero, but it is an approximation that does not add noise. The problem I experienced with this solution is that, even after removing the zero rows/columns, some "unimportant" coordinates were still inflating the norm of my delta matrix, which makes me cautious.
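To see why the hard solution does not add noise, here is a small standalone sketch (the 3x3 matrix is a made-up toy, not MEMIT's actual statistic): for a symmetric matrix whose only defect is all-zero rows/columns, stripping them, inverting the remaining block, and padding the zeros back reproduces the Moore-Penrose pseudoinverse.

```python
import torch

# Hypothetical 3x3 stand-in for the singular "aggregate statistic":
# row/column 1 is all zeros, so det(C) == 0 and torch.linalg.inv fails.
C = torch.tensor([[4.0, 0.0, 1.0],
                  [0.0, 0.0, 0.0],
                  [1.0, 0.0, 3.0]], dtype=torch.float64)

# Remove row/column 1, invert the nonsingular 2x2 block, pad zeros back.
keep = [0, 2]
block_inv = torch.linalg.inv(C[keep][:, keep])
C_inv = torch.zeros_like(C)
for a, i in enumerate(keep):
    for b, j in enumerate(keep):
        C_inv[i, j] = block_inv[a, b]

# For a symmetric matrix whose only defect is zero rows/columns, this
# matches the Moore-Penrose pseudoinverse.
print(torch.allclose(C_inv, torch.linalg.pinv(C)))  # True
```

In other words, on this class of matrices the trick computes the same thing as `torch.linalg.pinv`, just without running a full SVD.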
To implement this, go to memit_main and add these helper functions at the beginning:
```python
import torch  # already imported at the top of memit_main


def make_null_i(matrix, i):
    """Zero out row i and column i of a copy of `matrix`."""
    new_matrix = matrix.clone()
    new_matrix[:, i] = 0
    new_matrix[i, :] = 0
    return new_matrix


def identify_null_cols(matrix):
    """Return the count and indices of all-zero rows (equal to the all-zero
    columns here, since the matrix is symmetric). Using abs() avoids a row
    of cancelling positive/negative entries being mistaken for zeros."""
    row_sums = matrix.abs().sum(dim=1)
    # squeeze(dim=1) keeps a 1-D result even when only one row is zero,
    # so .tolist() always returns a list.
    zero_rows = torch.nonzero(row_sums == 0).squeeze(dim=1)
    return zero_rows.numel(), zero_rows.tolist()


def remove_column(matrix, i):
    """Drop row i and column i."""
    new_matrix = torch.cat((matrix[:i], matrix[i + 1:]), dim=0)
    return torch.cat((new_matrix[:, :i], new_matrix[:, i + 1:]), dim=1)


def add_zero_column(matrix, i):
    """Re-insert an all-zero row and an all-zero column at index i."""
    new_row = torch.zeros(1, matrix.shape[1], device=matrix.device, dtype=matrix.dtype)
    new_matrix = torch.cat((matrix[:i], new_row, matrix[i:]), dim=0)
    new_col = torch.zeros(matrix.shape[0] + 1, 1, device=matrix.device, dtype=matrix.dtype)
    return torch.cat((new_matrix[:, :i], new_col, new_matrix[:, i:]), dim=1)


def compute_pseudoinverse_matrix(matrix):
    """Invert `matrix` after stripping its all-zero rows/columns, then pad the zeros back."""
    n, ids = identify_null_cols(matrix)
    print(f"There are {n} columns with zeros")
    if n == 0:
        return torch.linalg.inv(matrix)
    # Remove the zero rows/columns that make the matrix singular,
    # from the highest index down so the remaining indices stay valid.
    new_matrix = matrix.clone()
    for id_ in ids[::-1]:
        new_matrix = remove_column(new_matrix, id_)
    # Invert the reduced (nonsingular) matrix.
    new_matrix = torch.linalg.inv(new_matrix)
    # Pad the zero rows/columns back, lowest index first.
    for id_ in ids:
        new_matrix = add_zero_column(new_matrix, id_)
    return new_matrix
```
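One subtlety in compute_pseudoinverse_matrix is the iteration order: indices must be removed from highest to lowest (so the lower indices stay valid) and re-added from lowest to highest. A standalone round-trip check of that bookkeeping, using throwaway slicing helpers rather than the functions above:

```python
import torch

def drop(m, i):
    # Remove row i and column i.
    m = torch.cat((m[:i], m[i + 1:]), dim=0)
    return torch.cat((m[:, :i], m[:, i + 1:]), dim=1)

def pad(m, i):
    # Re-insert an all-zero row, then an all-zero column, at index i.
    row = torch.zeros(1, m.shape[1], dtype=m.dtype)
    m = torch.cat((m[:i], row, m[i:]), dim=0)
    col = torch.zeros(m.shape[0], 1, dtype=m.dtype)
    return torch.cat((m[:, :i], col, m[:, i:]), dim=1)

# Toy 4x4 matrix with zero rows/columns at indices 1 and 3.
m = torch.arange(16, dtype=torch.float64).reshape(4, 4)
m[1, :] = 0
m[:, 1] = 0
m[3, :] = 0
m[:, 3] = 0
ids = [1, 3]

out = m.clone()
for i in ids[::-1]:   # remove from the highest index down
    out = drop(out, i)
for i in ids:         # re-add from the lowest index up
    out = pad(out, i)

print(torch.equal(out, m))  # True
```

If either loop ran in the opposite order, the later indices would shift and the zeros would land in the wrong positions.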
and then change lines 196-199 to:
```python
matrix = (
    hparams.mom2_update_weight * cov.double().detach().cpu()
    + layer_ks.detach().cpu() @ layer_ks.T.detach().cpu()
)
n_null_cols, _ = identify_null_cols(matrix)
if n_null_cols != 0:
    # Singular: fall back to the zero-stripping pseudo-inverse above.
    adj_k = compute_pseudoinverse_matrix(matrix) @ layer_ks.detach().cpu()
else:
    # Nonsingular: the original solve path.
    adj_k = torch.linalg.solve(matrix, layer_ks.detach().cpu())
```
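A lighter-weight alternative to patching in the zero-stripping helpers (a different technique, not what the snippet above does) is torch.linalg.lstsq, which tolerates rank-deficient systems and returns a least-squares solution where torch.linalg.solve would raise:

```python
import torch

# Hypothetical singular 2x2 system: row/column 1 is all zeros.
matrix = torch.tensor([[2.0, 0.0],
                       [0.0, 0.0]], dtype=torch.float64)
b = torch.tensor([[4.0],
                  [0.0]], dtype=torch.float64)

# torch.linalg.solve(matrix, b) would raise here; lstsq returns the
# minimum-norm least-squares solution instead.
x = torch.linalg.lstsq(matrix, b).solution
print(x)  # [[2.0], [0.0]]
```

On CPU the default `gelsy` driver handles rank-deficient inputs; on CUDA the default driver assumes full rank, so this route needs the matrices on CPU (as in the patch above).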
- A possible extra solution: increase the number of edits.
I hope it helps. Good luck!
from memit.
I found your response to be quite valuable. Thank you very much!
Related Issues (17)
- Applying to other models
- Distributing the update across multiple layer
- NotImplementedError for GPT-J-6b
- Missing `data` folder in root directory
- Multi-GPU support for MEMIT
- CUDA out of memory
- what is the difference between multicounterfact and counterfact?
- Discussion: About Knowledge Editing
- IndexError: tuple index out of range at cur_repr processing stage
- Paraphrase prompts' format not compatible with the sample from ROME paper
- muti-counterfact and counterfact
- No optimization after first step
- GPU not big enough? I'm using A5500 24GB RAM
- IndexError: tuple index out of range
- Can it work with Llama 3 / other 7b models?
- How to detemine which hparam layers