
hw4's Introduction

Homework 4

Public repository and stub/testing code for Homework 4 of 10-714.

hw4's People

Contributors

arav-agarwal2, ashertrockman, zwang86, leslierice1

Stargazers

Bewuw

Watchers

Zico Kolter, Tianqi Chen

hw4's Issues

Wrong accuracy and loss returned by `one_iter_of_cifar10_training`

return correct/(y.shape[0]*niter), total_loss/(y.shape[0]*niter)

    for batch in dataloader:
        opt.reset_grad()
        X, y = batch
        X,y = ndl.Tensor(X, device=device), ndl.Tensor(y, device=device)
        out = model(X)
        correct += np.sum(np.argmax(out.numpy(), axis=1) == y.numpy())
        loss = loss_fn(out, y)
        total_loss += loss.data.numpy() * y.shape[0]
        loss.backward()
        opt.step()
        if i >= niter:
            break
        i += 1
    return correct/(y.shape[0]*niter), total_loss/(y.shape[0]*niter)

`correct/(y.shape[0]*niter), total_loss/(y.shape[0]*niter)` is only correct if every batch has the same size as the last one processed (`y` holds only the final batch at this point), which is not guaranteed for every dataset.
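A possible fix, sketched below under the same assumptions as the test (numpy imported as np, needle as ndl), is to accumulate the number of examples actually processed and divide by that instead of by `y.shape[0]*niter`:

def one_iter_of_cifar10_training(dataloader, model, niter=1, loss_fn=ndl.nn.SoftmaxLoss(), opt=None, device=None):
    np.random.seed(4)
    model.train()
    correct, total_loss, total_examples = 0, 0, 0
    i = 1
    for batch in dataloader:
        opt.reset_grad()
        X, y = batch
        X, y = ndl.Tensor(X, device=device), ndl.Tensor(y, device=device)
        out = model(X)
        correct += np.sum(np.argmax(out.numpy(), axis=1) == y.numpy())
        loss = loss_fn(out, y)
        total_loss += loss.data.numpy() * y.shape[0]
        loss.backward()
        opt.step()
        total_examples += y.shape[0]  # count the samples in this batch, whatever its size
        if i >= niter:
            break
        i += 1
    # average over the examples actually seen, not an assumed uniform batch size
    return correct / total_examples, total_loss / total_examples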

Type weirdness in ResNet9 submission code

hw4/tests/hw4/test_conv.py, lines 471 to 483 at commit 1ef8c3b:

def one_iter_of_cifar10_training(dataloader, model, niter=1, loss_fn=ndl.nn.SoftmaxLoss(), opt=None, device=None):
    np.random.seed(4)
    model.train()
    correct, total_loss = 0, 0
    i = 1
    for batch in dataloader:
        opt.reset_grad()
        X, y = batch
        X, y = ndl.Tensor(X, device=device), ndl.Tensor(y, device=device)
        out = model(X)
        correct += np.sum(np.argmax(out.numpy(), axis=1) == y.numpy())
        loss = loss_fn(out, y)
        total_loss += loss.data.numpy() * y.shape[0]

Is it intended that `correct` is a float scalar while `total_loss` is a NumPy array? They are heterogeneous, and feeding them into the `ndl.Tensor` constructor later raises an error.
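A minimal self-contained sketch of the mismatch and one way around it, assuming (as the issue suggests) that `loss.data.numpy()` yields a NumPy array while the comparison sum yields a NumPy scalar: cast both accumulators to plain Python numbers so the pair later handed to the `ndl.Tensor` constructor is homogeneous.

import numpy as np

# Stand-ins for the values produced inside the training loop (hypothetical shapes).
batch_preds = np.argmax(np.array([[0.1, 0.9], [0.8, 0.2]]), axis=1)
batch_labels = np.array([1, 1])
batch_loss = np.array(0.7, dtype=np.float32)  # stands in for loss.data.numpy()

correct, total_loss = 0, 0.0
correct += int(np.sum(batch_preds == batch_labels))       # plain Python int
total_loss += batch_loss.item() * batch_labels.shape[0]   # plain Python float

print(type(correct), type(total_loss))  # <class 'int'> <class 'float'>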

The initialization order of modules is crucial in resnet9

The initialization order of modules is crucial and should be explicitly stated.

Because of the random seed set in the test file, the order in which modules are constructed matters when building the ResNet9 network. It would be worth stating this explicitly in the homework documentation; a small sketch after the two examples below illustrates why.

For ResNet9:
Correct:

class ResNet9(ndl.nn.Module):
    def __init__(self, device=None, dtype="float32"):
        super().__init__()

        self.module = nn.Sequential(
            ConvBN(3, 16, 7, 4, device=device, dtype=dtype),
            ConvBN(16, 32, 3, 2, device=device, dtype=dtype),
            ndl.nn.Residual(nn.Sequential(
                ConvBN(32, 32, 3, 1, device=device, dtype=dtype),
                ConvBN(32, 32, 3, 1, device=device, dtype=dtype),
            )),

            ConvBN(32, 64, 3, 2, device=device, dtype=dtype),
            ConvBN(64, 128, 3, 2, device=device, dtype=dtype),
            ndl.nn.Residual(nn.Sequential(
                ConvBN(128, 128, 3, 1, device=device, dtype=dtype),
                ConvBN(128, 128, 3, 1, device=device, dtype=dtype),
            )),

            nn.Flatten(),
            nn.Linear(128, 128, device=device, dtype=dtype),
            nn.ReLU(),
            nn.Linear(128, 10, device=device, dtype=dtype),
        )

Incorrect (ResNet9; the train_cifar10 test will fail):

class ResNet9(ndl.nn.Module):
    def __init__(self, device=None, dtype="float32"):
        super().__init__()

        residual1 = ndl.nn.Residual(nn.Sequential(
            ConvBN(32, 32, 3, 1, device=device, dtype=dtype),
            ConvBN(32, 32, 3, 1, device=device, dtype=dtype),
        ))

        residual2 = ndl.nn.Residual(nn.Sequential(
            ConvBN(128, 128, 3, 1, device=device, dtype=dtype),
            ConvBN(128, 128, 3, 1, device=device, dtype=dtype),
        ))

        self.module = nn.Sequential(
            ConvBN(3, 16, 7, 4, device=device, dtype=dtype),
            ConvBN(16, 32, 3, 2, device=device, dtype=dtype),
            residual1,

            ConvBN(32, 64, 3, 2, device=device, dtype=dtype),
            ConvBN(64, 128, 3, 2, device=device, dtype=dtype),
            residual2,

            nn.Flatten(),
            nn.Linear(128, 128, device=device, dtype=dtype),
            nn.ReLU(),
            nn.Linear(128, 10, device=device, dtype=dtype),
        )
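The reason is that every module draws its initial weights from the same seeded RNG stream at construction time, so building the residual blocks before the surrounding ConvBN layers consumes the random numbers in a different order than the reference solution expects. A small needle-independent sketch of the effect (the shapes are made up):

import numpy as np

# Two "modules" whose weights are drawn from a shared, seeded RNG stream.
np.random.seed(4)
small_first = (np.random.randn(2), np.random.randn(3))  # construct the small module, then the large one

np.random.seed(4)
large_first = (np.random.randn(3), np.random.randn(2))  # construct the large module, then the small one

# The same module ends up with different weights depending on construction order.
print(np.allclose(small_first[0], large_first[1]))  # False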

It seems that the subtraction operator for tensors has an issue.

In autograd.py, tensor subtraction is implemented by defining `__sub__` and then setting `__rsub__ = __sub__`. When a tensor is subtracted from a scalar (for example `2 - t`), this clearly produces the negation of the expected result.

    def __pow__(self, other):
        if isinstance(other, Tensor):
            return needle.ops.EWisePow()(self, other)
        else:
            return needle.ops.PowerScalar(other)(self)

    def __sub__(self, other):
        if isinstance(other, Tensor):
            return needle.ops.EWiseAdd()(self, needle.ops.Negate()(other))
        else:
            return needle.ops.AddScalar(-other)(self)

    __radd__ = __add__
    __rmul__ = __mul__
    __rsub__ = __sub__
    __rmatmul__ = __matmul__
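A possible fix, sketched here as a method that would live in the Tensor class in autograd.py (not necessarily how the course staff would resolve it), is to give `__rsub__` its own definition that evaluates `other - self`:

    def __rsub__(self, other):
        # Reflected subtraction: evaluate other - self rather than self - other.
        if isinstance(other, Tensor):
            return needle.ops.EWiseAdd()(other, needle.ops.Negate()(self))
        else:
            # scalar - self == (-self) + scalar
            return needle.ops.AddScalar(other)(needle.ops.Negate()(self))

With this in place, `2 - t` evaluates to `2 - t` rather than `t - 2`, while `t - 2` still goes through `__sub__` unchanged.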
