Hi, thanks for your amazing work! How can I extract representations

Extracting EVO representations rather than logits about evo HOT 5 OPEN

amoskalev commented on August 27, 2024 5

Extracting EVO representations rather than logits

Comments (5)

davidkell commented on August 27, 2024 8

Here's how I'm currently solving this (adapted from the usage example in the README):

from evo import Evo
import torch

device = 'cuda:0'

evo_model = Evo('evo-1-131k-base')
model, tokenizer = evo_model.model, evo_model.tokenizer
model.to(device)
model.eval()

# monkey patch the unembed function with an identity function
# this removes the final projection from the embedding space back to token logits,
# so the "logits" returned by the model are now the final layer embeddings
# see source for unembed - https://huggingface.co/togethercomputer/evo-1-131k-base/blob/main/model.py#L339

from torch import nn

class CustomEmbedding(nn.Module):
  def unembed(self, u):
    return u

model.unembed = CustomEmbedding()

# end custom code

sequence = 'ACGT'
input_ids = torch.tensor(
    tokenizer.tokenize(sequence),
    dtype=torch.int,
).to(device).unsqueeze(0)

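# inference only: no_grad avoids keeping activations for backprop, which lowers memory use
with torch.no_grad():
    embed, _ = model(input_ids) # (batch, length, embed dim)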

print('Embed: ', embed)
print('Shape (batch, length, embed dim): ', embed.shape)

# you can now use embedding for downstream classification tasks
# you probably want to aggregate over position dimension
# e.g. mean value = embed.mean(dim=1) or final token embedding = embed[:, -1, :]

Note that this is for the model object returned by evo-model, which is an instance of StripedHyena. If you are using Hugging Face directly, the model is wrapped in StripedHyenaModelForCausalLM, so you need to set model.backbone.unembed = CustomEmbedding() instead.
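
For the aggregation step mentioned in the code comments, here is a minimal sketch of mean pooling and last-token pooling over the position dimension. The helper name `pool_embeddings` and the `lengths` argument are hypothetical and not part of evo. The sketch assumes that, if you batch sequences of different lengths, they are right-padded and you know each sequence's true length. For a single unpadded sequence, `embed.mean(dim=1)` and `embed[:, -1, :]` give the same result.

```python
import torch

def pool_embeddings(embed, lengths):
    """Pool per-token embeddings into one vector per sequence.

    embed:   (batch, length, dim) tensor of per-token embeddings
    lengths: (batch,) integer tensor with the number of real (non-pad) tokens,
             assuming right padding (an assumption, not evo behaviour)
    """
    batch, max_len, _ = embed.shape
    embed = embed.float()  # pool in float32 to avoid bf16 rounding in the sum
    positions = torch.arange(max_len, device=embed.device).unsqueeze(0)  # (1, length)
    mask = (positions < lengths.unsqueeze(1)).to(embed.dtype)            # (batch, length)
    # mean over real tokens only
    mean = (embed * mask.unsqueeze(-1)).sum(dim=1) / lengths.unsqueeze(1).to(embed.dtype)
    # embedding of the last real token of each sequence
    last = embed[torch.arange(batch, device=embed.device), lengths - 1]
    return mean, last

# toy example: batch of 2, length 3, dim 2; the second sequence has one pad position
toy = torch.tensor([[[1., 2.], [3., 4.], [5., 6.]],
                    [[1., 1.], [3., 3.], [100., 100.]]])
toy_lengths = torch.tensor([3, 2])
toy_mean, toy_last = pool_embeddings(toy, toy_lengths)
```

Which pooling works better is task dependent, so it is probably worth trying both on your downstream classifier.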

seyonechithrananda commented on August 27, 2024 1

Thanks @davidkell !

davidkell commented on August 27, 2024 1

I had a similar experience. I was able to get inference working for 2k sequences on an A100 80GB (e.g. available on Paperspace), although at around 2.5-3k I would get OOM. I haven't looked in depth at what is driving the memory requirement.

davidkell commented on August 27, 2024 1

Quoting from this issue #24:

Prompting with longer sequences requires sharding for the model, which is currently not supported

So I think if you want to generate embeddings for longer sequences, you will need to manually shard the model across GPUs, set up CPU offloading, or something like that.
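
If sharding or offloading is not an option, a different and lossy workaround is to split the sequence into windows that fit in memory and embed each window separately. This is not the sharding referred to in the quoted issue: each window is embedded independently, so context beyond a window is lost. A minimal sketch of the splitting step follows. The helper `split_into_windows` is hypothetical, and the default of 2000 is only a guess based on the 2k length reported to work on an A100 80GB.

```python
def split_into_windows(sequence, window=2000, stride=2000):
    """Split a sequence into (start, subsequence) windows.

    window: maximum window length (assumed to fit in GPU memory)
    stride: step between window starts; stride < window gives overlapping windows
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    windows = []
    start = 0
    while start < len(sequence):
        windows.append((start, sequence[start:start + window]))
        if start + window >= len(sequence):
            break  # this window already reaches the end of the sequence
        start += stride
    return windows

# toy example with a tiny window so the result is easy to inspect
toy_windows = split_into_windows("ACGTACGTAC", window=4, stride=4)
```

Each window can then be tokenized and passed through the patched model as in the snippet earlier in the thread, and the per-window embeddings pooled or concatenated depending on the task.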

zhongwang commented on August 27, 2024

@davidkell I tried your code on an A100 40GB using the evo-8k model. Embedding the 4-letter sequence in the example costs over 400MB of GPU RAM, and the model itself needs 13GB. The embedding dimension is 4096. I don't understand why it costs so much memory: 4x4096 BF16 should only take 32KB, right? I tried to embed a 2kb sequence but always ran out of CUDA memory. Has anyone had a similar problem?
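
The 32KB figure can be checked with a quick calculation. The sketch below only sizes the final output tensor, using a hypothetical helper `tensor_bytes`. It says nothing about intermediate activations or other buffers allocated during the forward pass, which this estimate does not cover.

```python
import math

def tensor_bytes(shape, bytes_per_element):
    """Size in bytes of a dense tensor with the given shape."""
    return math.prod(shape) * bytes_per_element

BF16_BYTES = 2  # bfloat16 uses 2 bytes per element

# output embedding for the 4-letter example: (batch=1, length=4, dim=4096)
small = tensor_bytes((1, 4, 4096), BF16_BYTES)      # 32768 bytes = 32 KiB
# output embedding for a 2048-token sequence
large = tensor_bytes((1, 2048, 4096), BF16_BYTES)   # 16777216 bytes = 16 MiB

print(small / 1024, "KiB")
print(large / 1024**2, "MiB")
```

So the output tensor itself is small in both cases, and the observed memory use must come from elsewhere in the forward pass.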
