
Comments (5)

wawpaopao commented on May 12, 2024

thanks!

wawpaopao commented on May 12, 2024

And what's the difference from the Hugging Face trainer provided in the Colab? When I use the Colab to fine-tune HyenaDNA on the Nucleotide Transformer datasets, the performance is a bit low.

exnx commented on May 12, 2024

In your first post, the correct flag is: dataset.dataset_name. You forgot the prefix dataset before dataset_name.
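
For illustration, a Hydra-style launch line with the corrected flag might look like the following; the experiment config and dataset value here are placeholders, not taken from this thread:

```bash
python -m train wandb=null experiment=hg38/nucleotide_transformer dataset.dataset_name=enhancers
```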

Regarding your second post, the main differences are (a minimal sketch of the first three follows this list):

  • lack of a cosine decay learning rate scheduler (very important)
  • lack of optimizer parameter groups, which allow different hyperparameters per layer (e.g., the Hyena layer requires weight decay to be 0). Also very important!
  • use of gradient clipping (with a value of 1.0)
  • automatic mixed precision handling in PyTorch Lightning (not available in the Colab)
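
For concreteness, here is a minimal sketch of those first three pieces in plain PyTorch. The name-based selection of Hyena parameters below is an assumption for illustration only; the main repo attaches per-parameter hyperparameters (such as zero weight decay) through its own optimizer-group machinery rather than by matching names.

```python
import torch

def build_optimizer_and_scheduler(model, lr=6e-4, weight_decay=0.1, num_epochs=100):
    # Split parameters into groups so Hyena layers get weight_decay = 0.
    # NOTE: substring matching is a stand-in; adapt it to however your
    # model actually names (or tags) its Hyena filter parameters.
    decay, no_decay = [], []
    for name, param in model.named_parameters():
        if 'hyena' in name.lower() or 'filter' in name.lower():
            no_decay.append(param)
        else:
            decay.append(param)
    optimizer = torch.optim.AdamW(
        [{'params': decay, 'weight_decay': weight_decay},
         {'params': no_decay, 'weight_decay': 0.0}],
        lr=lr,
    )
    # Cosine decay of the learning rate over the whole run.
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_epochs)
    return optimizer, scheduler

# In the per-batch training step, clip gradients before stepping:
#   loss.backward()
#   torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
#   optimizer.step()
# and call scheduler.step() once per epoch.
```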

wawpaopao commented on May 12, 2024

Thanks. I used the Colab file to fine-tune on a DNABERT-2 dataset, handled the same way as the Nucleotide Transformer dataset, and found the MCC was a bit low. I don't know where the error is... could you help me take a look?
```python
import os

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader

# HyenaDNAPreTrainedModel, HyenaDNAModel, CharacterTokenizer,
# SupervisedDataset, train, and test are defined earlier in the
# Colab / standalone script.


def run_train():
    # experiment settings:
    num_epochs = 80  # ~100 seems fine
    max_length = 500  # max len of sequence of dataset (of what you want)
    use_padding = True
    data_path = './DNABERT_2/eval/GUE/EMP/H3'

    batch_size = 256
    learning_rate = 6e-4  # good default for Hyena
    rc_aug = True  # reverse complement augmentation
    add_eos = False  # add end of sentence token
    weight_decay = 0.1

    # for fine-tuning, only the 'tiny' model can fit on colab
    pretrained_model_name = 'hyenadna-tiny-1k256d-seqlen'  # use None if training from scratch

    # we need these for the decoder head, if using
    use_head = True
    n_classes = 1  # single logit; BCEWithLogitsLoss expects float labels

    # you can override with your own backbone config here if you want,
    # otherwise we'll load the HF one by default
    backbone_cfg = None

    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    print("Using device:", device)

    # instantiate the model (pretrained here)
    if pretrained_model_name in ['hyenadna-tiny-1k256d-seqlen']:
        # use the pretrained Huggingface wrapper instead
        model = HyenaDNAPreTrainedModel.from_pretrained(
            './checkpoints',
            pretrained_model_name,
            download=False,
            config=backbone_cfg,
            device=device,
            use_head=use_head,
            n_classes=n_classes,
        )
    # from scratch
    else:
        model = HyenaDNAModel(**backbone_cfg, use_head=use_head, n_classes=n_classes)

    # create tokenizer
    tokenizer = CharacterTokenizer(
        characters=['A', 'C', 'G', 'T', 'N'],  # DNA characters; N is uncertain
        model_max_length=max_length + 2,  # to account for special tokens, like EOS
        add_special_tokens=False,  # we handle special tokens elsewhere
        padding_side='left',  # since HyenaDNA is causal, we pad on the left
    )

    # create datasets
    ds_train = SupervisedDataset(tokenizer=tokenizer,
                                 data_path=os.path.join(data_path, 'train.csv'),
                                 kmer=-1)
    ds_test = SupervisedDataset(tokenizer=tokenizer,
                                data_path=os.path.join(data_path, 'test.csv'),
                                kmer=-1)
    train_loader = DataLoader(ds_train, batch_size=batch_size, shuffle=True)
    test_loader = DataLoader(ds_test, batch_size=batch_size, shuffle=False)

    # loss function
    loss_fn = nn.BCEWithLogitsLoss()
    # loss_fn = nn.MSELoss()

    model.to(device)

    # create optimizer (after moving the model to the device)
    optimizer = optim.AdamW(model.parameters(), lr=learning_rate, weight_decay=weight_decay)

    for epoch in range(num_epochs):
        # train() steps the optimizer per batch, so no extra
        # optimizer.step() is needed at the epoch level
        train(model, device, train_loader, optimizer, epoch, loss_fn)
        test(model, device, test_loader, loss_fn)


if __name__ == "__main__":
    run_train()
```

exnx commented on May 12, 2024

As mentioned, the Colab is missing a lot of what's needed for competitive results; it's mainly for educational purposes. To get good results, you'll need to use the main repo for fine-tuning. The hyperparameters will also matter a lot, and you'll only find the right ones by running sweeps on the actual datasets.

Unfortunately, we're not able to support fine-tuning on your own datasets; we mainly support reproducing the results from the paper.

But maybe this new Docker image will help: it has the Nucleotide Transformer datasets and the exact launch commands and hyperparameters used to get the best performance. The environment, pretrained weights, datasets, and launch commands are all inside the image; you just need to pull and launch it. Perhaps you can "reverse engineer" those settings to find what works best on your own datasets.

```bash
docker pull hyenadna/hyena-dna-nt6:latest
docker run --gpus all -it -p 80:3000 hyenadna/hyena-dna-nt6 /bin/bash
```

This will land you inside /wdr, which contains a file named launch_commands_nucleotide_transformer with the launch commands for all 18 NT datasets.
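
For example, once inside the container, you could print those commands with something like:

```bash
cat /wdr/launch_commands_nucleotide_transformer
```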
