Overall goal: Standardise PyTorch code across deep learning research so that researchers can focus on developing models. A piece of advice I received from a PhD student at CMU on becoming a good deep learning researcher: first learn to write good deep learning code.
Important aspects for clean code
- Reproducibility
- Expand section on distributed programming
- setting up the environment section
- virtual environment
- requirements.txt
- allowing users to install easily
- improving the development process
- commit messages
- code styles
- Create an auto-generator for new NLP projects with PyTorch
- Write down the steps for starting a new NLP project
- Find a repository to use as an example
- Write an article on Medium about the GLIP implementation and code structure
- List possible books on good code structure
- Read up on PyTorch 2.0 features
- Find ways to optimize the GitHub repository
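For the reproducibility aspect listed above, a common starting point is to seed every random number generator a training run touches. The following is only a minimal sketch: the helper name seed_everything and the default seed are assumptions for illustration, and NumPy and PyTorch are seeded only if they happen to be installed. Seeding alone does not guarantee bit-identical results on GPU.

```python
import os
import random


def seed_everything(seed: int = 42) -> None:
    """Seed the common random number generators used in a training run.

    Hypothetical helper for illustration, not part of any library.
    """
    # Python's built-in RNG (used by random.shuffle, random.random, ...).
    random.seed(seed)

    # Only affects subprocesses started after this point; hash randomisation
    # of the current interpreter is already fixed at startup.
    os.environ["PYTHONHASHSEED"] = str(seed)

    # NumPy is optional here: seed it only if it is installed.
    try:
        import numpy as np

        np.random.seed(seed)
    except ImportError:
        pass

    # PyTorch is optional here too: seed the default generators if present.
    try:
        import torch

        torch.manual_seed(seed)
    except ImportError:
        pass


if __name__ == "__main__":
    seed_everything(42)
    first = random.random()
    seed_everything(42)
    second = random.random()
    # The same seed reproduces the same draw from Python's RNG.
    print(first == second)
```

Calling the helper once at the top of the training script, with the seed stored in the run's config, keeps the value visible alongside the other hyperparameters.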
This is a repository documenting clean code guidelines for developing PyTorch models.
Links to documentation and repositories with clean code:
- PyTorch documentation: https://pytorch.org/docs/stable/index.html
- PyTorch template: https://github.com/victoresque/pytorch-template
- Dataset template card: https://github.com/huggingface/datasets/blob/main/templates/README_guide.md
- config folder: Contains a set of yaml files for your dataset configurations
- examples
- use distributed data parallelism: https://pytorch.org/tutorials/intermediate/ddp_tutorial.html#initialize-ddp-with-torch-distributed-run-torchrun
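- at the time of writing, the launch command is torchrun --nnodes=2 --nproc_per_node=8 followed by the training script and its arguments (2 nodes with 8 processes each, 16 processes in total)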
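When a script is launched with torchrun, each process receives its identity through environment variables such as RANK, LOCAL_RANK and WORLD_SIZE. The sketch below shows one way a training script might read them; the DistributedEnv class and read_distributed_env function are hypothetical names, and the single-process defaults are an assumption so the same script still runs without torchrun.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class DistributedEnv:
    """Process identity as provided by torchrun (hypothetical container)."""

    rank: int        # global index of this process across all nodes
    local_rank: int  # index of this process on its own node
    world_size: int  # total number of processes in the job

    @property
    def is_main_process(self) -> bool:
        # By convention, rank 0 handles logging and checkpointing.
        return self.rank == 0


def read_distributed_env(environ: Optional[Mapping[str, str]] = None) -> DistributedEnv:
    """Read RANK, LOCAL_RANK and WORLD_SIZE, defaulting to a single process."""
    env = os.environ if environ is None else environ
    return DistributedEnv(
        rank=int(env.get("RANK", 0)),
        local_rank=int(env.get("LOCAL_RANK", 0)),
        world_size=int(env.get("WORLD_SIZE", 1)),
    )


if __name__ == "__main__":
    dist_env = read_distributed_env()
    if dist_env.is_main_process:
        print(f"running with {dist_env.world_size} process(es)")
```

In a real DDP script, local_rank would typically be passed to torch.cuda.set_device before calling torch.distributed.init_process_group, as described in the linked tutorial.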
If your repository uses secrets such as API keys, store them in a .env file and add that file to your .gitignore so it is never committed. You should also provide a .env.example file as a template that shows which variables the real .env file needs, without the actual values. Reference: https://dev.to/edgar_montano/how-to-setup-env-in-python-4a83#:~:text=How%20to%20setup%20a%20.env%20file%201%201.To,file%20using%20the%20following%20format%3A%20...%20More%20items
Steps:
- pip install python-dotenv
- create a .env file and a .env.example file
- add .env to your .gitignore
from dotenv import load_dotenv, find_dotenv
load_dotenv(find_dotenv())
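Once load_dotenv has run, the values from .env are available through os.environ. A small helper that fails early when a secret is missing can make setup problems easier to spot. This is only a sketch: get_required_env and the variable name API_KEY are hypothetical and not part of python-dotenv.

```python
import os


def get_required_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message.

    Hypothetical helper; it relies only on os.environ, so it works whether the
    variable came from a .env file or from the shell.
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Missing environment variable {name!r}; "
            "copy .env.example to .env and fill it in."
        )
    return value


# Example usage after load_dotenv(find_dotenv()) has been called
# (API_KEY is an illustrative name, not something this repository defines):
# api_key = get_required_env("API_KEY")
```

Raising at startup, rather than when the key is first used deep inside a training loop, avoids losing a long run to a missing variable.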
- Colorama
examples:
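from colorama import Fore, Style
# Concatenate the colour code with the message, then reset so later output is not red
print(Fore.RED + "text message here in red" + Style.RESET_ALL)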