Comments (5)
a good template would be how the logs are accumulated here: https://github.com/lucidrains/tf-bind-transformer/blob/main/tf_bind_transformer/training_utils.py. additional helper functions can be brought in for "maybe transforms" on certain keys in the log, i.e. transforms that are applied only when the key is present
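A minimal sketch of the two ideas above, assuming logs are plain dicts of numeric values. The names `accum_log` and `maybe_transform` are hypothetical helpers chosen for illustration and are not claimed to match the linked file.

```python
def accum_log(log, new_logs):
    """Add each value in new_logs onto the running total stored in log (in place)."""
    for key, value in new_logs.items():
        log[key] = log.get(key, 0.0) + value
    return log


def maybe_transform(log, key, fn):
    """Return a copy of log with fn applied to log[key], only if the key is present."""
    if key not in log:
        return log
    return {**log, key: fn(log[key])}


# accumulate a loss over two gradient accumulation steps
log = {}
accum_log(log, {'loss': 0.5})
accum_log(log, {'loss': 0.25})

# 'sample' is absent, so the log passes through unchanged
log = maybe_transform(log, 'sample', str)
```

Keeping the transform conditional means the same logging code works on steps that produce a sample and on steps that do not.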
from dalle2-pytorch.
What's the overall idea behind being experiment tracker agnostic?
Do you want to support other trackers, or do you mostly want to be able to disable tracking?
Regarding distributed training, I figure there are 2 things to support:
- Logging only on node 0
- Not logging from the compute nodes at all, but letting them report through some custom channel (e.g. the disk), so that some other node (e.g. a login node) can retrieve that information and log it to the tracker (this is needed, for example, on JUWELS, where compute nodes don't have access to the internet)
How would you want to implement this? What's the main goal?
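The two distributed modes listed above could be sketched roughly as follows. This is only an illustration under stated assumptions: the rank is read from a `RANK` environment variable (as set by launchers such as `torchrun`), logs are JSON-serializable dicts, and the function names and the `rank-N.jsonl` file layout are hypothetical.

```python
import json
import os
from pathlib import Path


def get_rank():
    # assumption: the launcher exports RANK; default to 0 for single-process runs
    return int(os.environ.get('RANK', 0))


def make_rank_zero_log_fn(log_fn, rank=None):
    """Mode 1: forward logs to log_fn only on rank 0."""
    rank = get_rank() if rank is None else rank

    def inner(log):
        if rank == 0:
            log_fn(log)

    return inner


def make_disk_log_fn(log_dir, rank=None):
    """Mode 2: every rank appends its logs to its own JSON-lines file."""
    rank = get_rank() if rank is None else rank
    path = Path(log_dir) / f'rank-{rank}.jsonl'
    path.parent.mkdir(parents=True, exist_ok=True)

    def inner(log):
        with path.open('a') as f:
            f.write(json.dumps(log) + '\n')

    return inner


def read_disk_logs(log_dir):
    """Run on a node with internet access: yield every log record found on disk."""
    for path in sorted(Path(log_dir).glob('rank-*.jsonl')):
        with path.open() as f:
            for line in f:
                if line.strip():
                    yield json.loads(line)
```

On the login node, each record yielded by `read_disk_logs` can then be passed to whatever tracker call is in use.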
from dalle2-pytorch.
@rom1504 both: support other trackers and be able to disable it. i've done this successfully for some other projects by now - here is an example of what i have for https://github.com/lucidrains/video-diffusion-pytorch:
import wandb

wandb.init(project = 'video-diffusion')
wandb.run.name = 'resnet'
wandb.run.save()

trainer = Trainer(
    diffusion,
    '/home/phil/dl/nuwa-pytorch/gif-moving-mnist/',
    results_folder = './results-new-focus-present',
    train_batch_size = 4,
    train_lr = 2e-5,
    save_and_sample_every = 1000,
    max_grad_norm = 0.5,
    train_num_steps = 700000,         # total training steps
    gradient_accumulate_every = 8,    # gradient accumulation steps
    ema_decay = 0.995,                # exponential moving average decay
    amp = True                        # turn on mixed precision
)

trainer.load(-1)

def log_fn(log):
    if 'sample' in log:
        log['sample'] = wandb.Video(log['sample'])

    wandb.log(log)

trainer.train(log_fn = log_fn, prob_focus_present = 0.)
from dalle2-pytorch.
the log_fn can be made more composable for sure, as you may want to exclude certain keys from being logged, wrap other ones, derive other keys from available ones in the set, etc
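One possible way to make it composable, as a rough sketch: each step is a small function from log dict to log dict, and the steps are chained in front of a final sink. All names here (`exclude_keys`, `wrap_key`, `derive_key`, `compose_log_fn`) are hypothetical and only illustrate the three cases mentioned above.

```python
def exclude_keys(*keys):
    """Drop the given keys from the log."""
    def transform(log):
        return {k: v for k, v in log.items() if k not in keys}
    return transform


def wrap_key(key, wrapper):
    """Wrap the value at key (if present) with wrapper."""
    def transform(log):
        if key not in log:
            return log
        return {**log, key: wrapper(log[key])}
    return transform


def derive_key(new_key, fn):
    """Add new_key, computed from the current log by fn."""
    def transform(log):
        return {**log, new_key: fn(log)}
    return transform


def compose_log_fn(*transforms, sink):
    """Build a log_fn that applies the transforms in order, then calls sink."""
    def log_fn(log):
        for transform in transforms:
            log = transform(log)
        sink(log)
    return log_fn


# tracker-agnostic usage: the sink here is just a list
collected = []
log_fn = compose_log_fn(
    exclude_keys('debug'),
    wrap_key('sample', str.upper),
    derive_key('loss_x2', lambda log: log['loss'] * 2),
    sink = collected.append
)
log_fn({'loss': 1.0, 'debug': 1, 'sample': 'a'})
```

With wandb, the same pattern would use `wrap_key('sample', wandb.Video)` and `sink = wandb.log`; disabling tracking is then just a matter of passing a no-op sink.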
from dalle2-pytorch.
started 89de5af
from dalle2-pytorch.