
megdiffusion's Introduction

MegEngine

MegEngine is a fast, scalable, and user-friendly deep learning framework with 3 key features.

  • Unified framework for both training and inference
    • Quantization, dynamic shape/image pre-processing, and even derivation with a single model.
    • After training, put everything into your model and run inference on any platform with speed and precision. Check here for a quick guide.
  • The lowest hardware requirements
    • GPU memory usage can be reduced to one-third of the original when the DTR algorithm is enabled (see the sketch after this list).
    • Inference models with the lowest memory usage by leveraging our Pushdown memory planner.
  • Inference efficiently on all platforms
    • Inference with high speed and precision on x86, Arm, CUDA, and ROCm.
    • Supports Linux, Windows, iOS, Android, TEE, etc.
    • Optimize performance and memory usage by leveraging our advanced features.
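For example, enabling DTR before building the model and training loop is typically a two-line change (a minimal sketch, assuming the mge.dtr interface documented for recent MegEngine releases; the 5 GB threshold is just an example value):

import megengine as mge

# Assumption: recent MegEngine versions expose DTR through mge.dtr.
mge.dtr.eviction_threshold = "5GB"  # example threshold; tune for your GPU
mge.dtr.enable()                    # activations beyond the threshold may be evicted and recomputed

# ... build the model, GradManager, and training loop as usual ...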

Installation

NOTE: MegEngine now supports Python installation on Linux-64bit/Windows-64bit/MacOS(CPU-Only)-10.14+/Android 7+(CPU-Only) platforms with Python from 3.6 to 3.9. On Windows 10 you can either install the Linux distribution through Windows Subsystem for Linux (WSL) or install the Windows distribution directly. Many other platforms are supported for inference.

Binaries

To install the pre-built binaries via pip wheels:

python3 -m pip install --upgrade pip
python3 -m pip install megengine -f https://megengine.org.cn/whl/mge.html
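To verify the installation (prints the installed version):

python3 -c "import megengine; print(megengine.__version__)"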

Building from Source

How to Contribute

We strive to build an open and friendly community. We aim to power humanity with AI.

How to Contact Us

Resources

License

MegEngine is licensed under the Apache License, Version 2.0

Citation

If you use MegEngine in your publication, please cite it using the following BibTeX entry.

@Misc{MegEngine,
  institution = {megvii},
  title = {MegEngine: A fast, scalable and easy-to-use deep learning framework},
  howpublished = {\url{https://github.com/MegEngine/MegEngine}},
  year = {2020}
}

Copyright (c) 2014-2021 Megvii Inc. All rights reserved.

megdiffusion's People

Contributors

asthestarsfalll, chaibyte


megdiffusion's Issues

About padding in Downsample

I'm willing to upload my conversion code, but it doesn't work well after converting: the error between the MegEngine and PyTorch implementations is large for the same input.
The cause is that the convolution padding in Downsample differs, since the PyTorch implementation uses asymmetric padding.
After I modified the MegEngine implementation, the result:

import megengine.functional as F
import megengine.module as M
from megengine.module import init


class DownSample(M.Module):
    """A downsampling layer with an optional convolution.

    Args:
        in_ch: channels in the inputs and outputs.
        with_conv: if ``True``, apply a strided convolution to downsample; otherwise use average pooling.
    """

    def __init__(self, in_ch, with_conv=True):
        super().__init__()
        self.with_conv = with_conv
        if with_conv:
            self.main = M.Conv2d(in_ch, in_ch, 3, stride=2)
        else:
            self.main = M.AvgPool2d(2, stride=2)

    def _initialize(self):
        for module in self.modules():
            if isinstance(module, M.Conv2d):
                init.xavier_uniform_(module.weight)
                init.zeros_(module.bias)

    def forward(self, x, temb):  # unused temb param kept for interface compatibility
        if self.with_conv:
            # Asymmetric padding (right/bottom only), matching the PyTorch implementation.
            x = F.nn.pad(x, [*[(0, 0) for i in range(x.ndim - 2)], (0, 1), (0, 1)])
        return self.main(x)

[image: comparison result after the modification]
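As a quick sanity check of the modified module (a sketch; the channel count and input size are arbitrary):

import megengine as mge

down = DownSample(64)
x = mge.random.normal(size=(1, 64, 32, 32))
y = down(x, None)   # temb is unused
print(y.shape)      # (1, 64, 16, 16): pad right/bottom by 1, then 3x3 conv with stride 2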

BTW, I'm also a beginner in DDPM; your blog helps me a lot!

Originally posted by @Asthestarsfalll in #5 (comment)

eps for GroupNorm

Great work!
The parameter 'eps' in GroupNorm is initialized to 1e-5 by default.
However, the GroupNorm in TensorFlow is slightly different: it is initialized with 1e-6.
Maybe it doesn't have any influence on training results, but could you change this (for every GroupNorm in the code) for alignment?
Since I want to convert trained models from torch or tf to megengine, the smaller the error, the better.
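For reference, aligning the epsilon is a one-argument change wherever the normalization layer is constructed (a minimal sketch; the group and channel counts are arbitrary example values):

import megengine.module as M

# Pass eps explicitly so it matches the TensorFlow default of 1e-6.
norm = M.GroupNorm(32, 128, eps=1e-6)  # 32 groups over 128 channels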

Handling checkpoint-saving failures

If the machine is preemptible, it might be scheduled for preemption (or hit other situations that bring the machine down). If a checkpoint is being saved at that exact moment, the original data will be corrupted. Therefore it is reasonable to keep multiple backups locally. Considering disk space usage, it would be even better to support cloud storage, such as AWS S3.
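One way to make the local save robust is to write to a temporary file first, atomically replace the previous checkpoint, and rotate a few numbered backups (a sketch, not the project's actual saving code; the paths and backup count are illustrative):

import os
import shutil

import megengine


def save_checkpoint(state, path="ckpt.pkl", keep=3):
    # Write to a temp file so a crash mid-save never corrupts the existing checkpoint.
    tmp_path = path + ".tmp"
    megengine.save(state, tmp_path)

    # Rotate old copies: ckpt.pkl -> ckpt.pkl.1 -> ckpt.pkl.2 -> ...
    for i in range(keep - 1, 0, -1):
        newer = f"{path}.{i - 1}" if i > 1 else path
        if os.path.exists(newer):
            shutil.copy2(newer, f"{path}.{i}")

    os.replace(tmp_path, path)  # atomic on POSIX filesystems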

Gradient clipping issues in MegEngine v1.9.x

Description

Training with a single GPU and using gradient clipping in this codebase causes an error on MegEngine 1.9.x. After one iteration with autodiff and a parameter update, the next forward pass of the model breaks. Error message:

RuntimeError: assertion `filter.ndim == img_ndim + 2 || filter.ndim == img_ndim + 4' failed at ../../../../../../imperative/src/impl/ops/convolution.cpp:61: megdnn::TensorLayout mgb::imperative::{anonymous}::convolution::do_shape_infer(const mgb::imperative::OpDef&, size_t, megdnn::TensorLayout, megdnn::TensorLayout)
extra message: bad filter ndim for dense convolution: spatial_ndim=2 filter_ndim=0

Here is the simplest example to reproduce this problem:

import megengine
import megengine.functional as F
import megengine.module as M
import megengine.optimizer as optim
import megengine.autodiff as autodiff

megengine.async_level = 0  # run synchronously so the error surfaces at the failing op

class SimpleModel(M.Module):
    def __init__(self, in_ch):
        super().__init__()
        self.conv1 = M.Conv2d(in_ch, in_ch, 3, stride=1, padding=1)
        self.conv2 = M.Conv2d(in_ch, in_ch, 3, stride=1, padding=1)

    def forward(self, x):
        x = self.conv1(x)
        x = F.nn.interpolate(x, scale_factor=1, mode="nearest")  # see Solution 4 below
        x = self.conv2(x)
        return x

if __name__ == "__main__":
    x = F.ones((1, 1, 2, 2))
    model = SimpleModel(in_ch=1)

    optimizer = optim.SGD(model.parameters(), lr=1e-3)
    gm = autodiff.GradManager()
    gm.attach(model.parameters())

    with gm:
        loss = model(x) + 0  # see Solution 3 below
        gm.backward(loss)

    optim.clip_grad_norm(model.parameters(), max_norm=1.)
    optimizer.step()
    y = model(x)  # breaks here on MegEngine 1.9.x

Workaround

  • Solution 1: Comment out this line in megdiffusion.scripts.train:

    optim.clip_grad_norm(model.parameters(), FLAGS.grad_clip)

    Then we can train the model without gradient clipping. (But that's not what we want... 😣)

  • Solution 2: The problem does not occur when using distributed training.

  • Solution 3: Try changing loss = model(x) + 0 to loss = model(x) 🤔🤔🤔

  • Solution 4: Try deleting x = F.nn.interpolate(x, scale_factor=1, mode="nearest") 🤔🤔🤔

Issue Track

This problem was fixed in MegEngine/MegEngine@df5ebd3, so you can wait for the MegEngine v1.10 release or build MegEngine from source at a commit newer than that one.
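Until v1.10 is available, one stopgap is to guard the clipping call on the installed version (a hypothetical sketch around the line from megdiffusion.scripts.train; it simply skips clipping on the affected 1.9.x builds):

import megengine
import megengine.optimizer as optim

# Hypothetical guard: skip gradient clipping on the affected 1.9.x releases.
if not megengine.__version__.startswith("1.9"):
    optim.clip_grad_norm(model.parameters(), FLAGS.grad_clip)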
