
rocmsoftwareplatform / pytorch


This project forked from pytorch/pytorch

221.0 26.0 51.0 856.31 MB

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Home Page: http://pytorch.org

License: Other

Shell 0.38% Python 55.87% CMake 0.71% Makefile 0.01% C++ 36.22% C 1.51% Cuda 3.04% Objective-C 0.03% Objective-C++ 1.26% CSS 0.01% HTML 0.01% Batchfile 0.02% Dockerfile 0.05% PowerShell 0.01% Java 0.12% Assembly 0.30% Ruby 0.01% Starlark 0.29% GLSL 0.18% GDB 0.01%
pytorch rocm

pytorch's Issues

[Caffe2] Docker image and Documentation

The goal is to create an all-in-one docker image and documentation for Pytorch/Caffe2.

The requirement:

Target audience: noob user who does not know anything.
Coverage: Include build and install instructions for C2 as well as its dependencies.
Assumption: that the user will have ROCm 1.8 installed on their system on Vega10.

[Pytorch] Tensor tutorial examples hang

🐛 Bug

Executing these examples from the PyTorch tutorial (https://pytorch.org/tutorials/beginner/pytorch_with_examples.html#pytorch-tensors, https://pytorch.org/tutorials/beginner/pytorch_with_examples.html#pytorch-defining-new-autograd-functions) hangs after the first iteration. Others, such as https://pytorch.org/tutorials/beginner/pytorch_with_examples.html#pytorch-tensors-and-autograd, work.

# -*- coding: utf-8 -*-

import torch


dtype = torch.float
device = torch.device("cuda:0")
# device = torch.device("cuda:0") # Uncomment this to run on GPU

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random input and output data
x = torch.randn(N, D_in, device=device, dtype=dtype)
y = torch.randn(N, D_out, device=device, dtype=dtype)

# Randomly initialize weights
w1 = torch.randn(D_in, H, device=device, dtype=dtype)
w2 = torch.randn(H, D_out, device=device, dtype=dtype)

learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute predicted y
    h = x.mm(w1)
    h_relu = h.clamp(min=0)
    y_pred = h_relu.mm(w2)

    # Compute and print loss
    loss = (y_pred - y).pow(2).sum().item()
    print(t, loss)

    # Backprop to compute gradients of w1 and w2 with respect to loss
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(w2.t())
    grad_h = grad_h_relu.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)

    # Update weights using gradient descent
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2

Environment

- pytorch build on top of docker image rocm/pytorch:rocm1.9.2
- PyTorch version: 1.0.0a0+ee1f7b8
- OS: Ubuntu 18.04.1 LTS
- GPU VegaFE, amdgpu, 1.9-307, 4.15.0-42-generic, x86_64: installed
- CMake version: version 3.6.3
- Python version: 2.7
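
One way to narrow down where a script like this stalls is to arm a watchdog that dumps the Python stack when a step takes too long. The sketch below is a debugging aid under stated assumptions, not part of the report: it relies on Python 3's standard `faulthandler` module (the environment above is Python 2.7, where it is only available as a separate backport), and the helper name `run_with_watchdog` and the 30-second default are hypothetical.

```python
import faulthandler
import sys


def run_with_watchdog(step, timeout_s=30.0, file=sys.stderr):
    """Run step(); if it takes longer than timeout_s, dump all thread stacks to file.

    The process is not killed (exit=False), so a hung step stays hung, but the
    dump shows which Python line was executing when the timeout fired.
    """
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=file)
    try:
        return step()
    finally:
        # Disarm the watchdog once the step has returned (or raised).
        faulthandler.cancel_dump_traceback_later()


# Hypothetical usage inside the training loop above:
#     loss = run_with_watchdog(lambda: (y_pred - y).pow(2).sum().item())
```

Wrapping the `.item()` call, the masked assignment, and the matrix products one at a time should show which operation never returns.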

[Caffe2] upstream tracker

The goal is to create a tracker on the upstream project.
Whenever anything CUDA-related changes (a potential break to the pyHipify process), we will get notified.

Landing Zone

  • must-have: daily track
  • nice-to-have: real-time tracking
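
A daily tracker could start from something as small as filtering the list of files changed upstream. The following is only a sketch: it assumes that "CUDA-related" can be approximated by the `.cu` and `.cuh` suffixes, that the changed paths come from something like `git diff --name-only`, and the paths in the sample list are illustrative.

```python
import posixpath

# Assumption: these suffixes mark CUDA sources that the hipify step rewrites.
CUDA_SUFFIXES = (".cu", ".cuh")


def cuda_related(changed_paths):
    """Return the changed paths that look like CUDA sources, in input order."""
    return [p for p in changed_paths if posixpath.splitext(p)[1] in CUDA_SUFFIXES]


# Sample input, e.g. the output of `git diff --name-only <old> <new>`.
changed = [
    "caffe2/operators/relu_op.cu",
    "caffe2/operators/relu_op.cc",
    "aten/src/THC/THCAtomics.cuh",
    "README.md",
]
flagged = cuda_related(changed)
print(flagged)
```

A non-empty result would be the trigger for a notification.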

[Caffe2] rocprim ops

The goal is to enable rocprim ops on the project. Also, create the new rule in pyhipify.

Better Provisioning for AMD CI node

The goal is to develop a strategy for properly provisioning AMD CI nodes for PyTorch/Caffe2.

Recently, one of the CI workers was silently updated to Linux kernel 4.15, which crippled the ROCm stack.
One of the unit tests is failing because of this and is gating upstream PR merges. As we add more AMD nodes to the CI pool, we need a protocol for provisioning them.
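
Part of such a protocol could be a pre-flight check that refuses to run tests on a node whose kernel has drifted. This is a minimal sketch, assuming the set of validated kernel series is maintained by hand; the `VALIDATED_KERNELS` value below is hypothetical and the real set would have to come from the ROCm release notes.

```python
import platform
import re

# Hypothetical allowlist of (major, minor) kernel series validated for the
# ROCm stack on these nodes. Not taken from any official compatibility table.
VALIDATED_KERNELS = {(4, 13)}


def kernel_series(release):
    """Parse a release string such as '4.15.0-42-generic' into (4, 15)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def kernel_is_validated(release=None, validated=VALIDATED_KERNELS):
    """Return True if the given (or currently running) kernel is in the allowlist."""
    if release is None:
        release = platform.release()
    return kernel_series(release) in validated
```

Run at the start of a CI job, a check like this would turn a silent kernel update into an explicit provisioning failure instead of a unit-test failure.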

Minor format error terminates the unit-test run.

🐛 Bug

To Reproduce

  1. Use option 3 from here https://github.com/ROCmSoftwarePlatform/pytorch/wiki/Building-PyTorch-for-ROCm

  2. In step 5, there's a choice (not sure why) to use one of two repos. Use the first one.
    git clone https://github.com/pytorch/pytorch.git or
    git clone https://github.com/ROCmSoftwarePlatform/pytorch.git

  3. Once built, run the unit-tests with the command below.

PYTORCH_TEST_WITH_ROCM=1 python test/run_test.py --verbose

======================================================================
FAIL: test_print (test_torch.TestTorch)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/data/pytorch/test/test_torch.py", line 8906, in test_print
    self.assertExpectedInline(str(x), '''tensor([1.0000e+28, 1.0000e-28])''')
  File "/data/pytorch/test/expecttest.py", line 195, in assertExpectedInline
    self.assertMultiLineEqual(expect, actual, msg=help_text)
AssertionError: 'tensor([1.0000e+28, 1.0000e-28])' != 'tensor([1.00000e+28, 1.00000e-28])'
- tensor([1.0000e+28, 1.0000e-28])
+ tensor([1.00000e+28, 1.00000e-28])
?               +        +
 : To accept the new output, re-run test with envvar EXPECTTEST_ACCEPT=1 (we recommend staging/committing your changes before doing this)
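
The two strings in the assertion differ only in the number of digits after the decimal point. The sketch below reproduces that difference with plain Python string formatting; it is not PyTorch's tensor printing code, and `format_like_tensor` is a hypothetical helper used only to show how a 4-digit and a 5-digit precision yield exactly the expected and actual strings.

```python
import difflib


def format_like_tensor(values, digits):
    """Format values in scientific notation with `digits` fractional digits."""
    body = ", ".join("{:.{d}e}".format(v, d=digits) for v in values)
    return "tensor([" + body + "])"


expected = format_like_tensor([1e28, 1e-28], 4)
actual = format_like_tensor([1e28, 1e-28], 5)

# ndiff produces the same kind of "-", "+", "?" lines that unittest prints.
diff_lines = list(difflib.ndiff([expected], [actual]))
print("\n".join(line.rstrip("\n") for line in diff_lines))
```

So the failure points at a one-digit precision difference in how the values are printed rather than at a wrong numeric result.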

[PyTorch] Docker Image and Documentation

Create documentation for the open source community on how to start working with the ROCm PyTorch Docker Image. Provide a clear summary on what's supported at the moment and what's on the agenda.

Distributed Training

The goal is to evaluate a high-level approach to enabling PyTorch/Caffe2 distributed training.

Build error: THCStorage.cu:4:10: fatal error: 'thrust/device_ptr.h' file not found

🐛 Bug

I'm trying to build pytorch for ROCm, and it fails with this log:

[ 67%] Building CXX object modules/module_test/CMakeFiles/caffe2_module_test_dynamic.dir/module_test_dynamic.cc.o
[ 67%] Building CXX object caffe2/CMakeFiles/caffe2_pybind11_state.dir/python/pybind_state.cc.o
[ 67%] Building CXX object caffe2/CMakeFiles/caffe2_pybind11_state.dir/python/pybind_state_dlpack.cc.o
[ 67%] Building HIPCC object caffe2/CMakeFiles/caffe2_hip.dir/__/aten/src/THC/caffe2_hip_generated_THCStorage.cu.o
/home/john/git/pytorch-rocmfork/aten/src/THC/THCStorage.cu:4:10: fatal error: 'thrust/device_ptr.h' file not found
#include <thrust/device_ptr.h>
^~~~~~~~~~~~~~~~~~~~~
1 error generated.
/home/john/git/pytorch-rocmfork/aten/src/THC/THCStorage.cu:4:10: fatal error: 'thrust/device_ptr.h' file not found
#include <thrust/device_ptr.h>
^~~~~~~~~~~~~~~~~~~~~
1 error generated.
CMake Error at /opt/rocm/hip/cmake/FindHIP/run_make2cmake.cmake:18 (file):
file failed to open for reading (No such file or directory):

/home/john/git/pytorch-rocmfork/build/caffe2/CMakeFiles/caffe2_hip.dir/__/aten/src/THC/caffe2_hip_generated_THCStorage.cu.o.depend.pre

CMake Error at caffe2_hip_generated_THCStorage.cu.o.cmake:134 (message):
Error generating
/home/john/git/pytorch-rocmfork/build/caffe2/CMakeFiles/caffe2_hip.dir/__/aten/src/THC/./caffe2_hip_generated_THCStorage.cu.o

make[2]: *** [caffe2/CMakeFiles/caffe2_hip.dir/build.make:22908: caffe2/CMakeFiles/caffe2_hip.dir/__/aten/src/THC/caffe2_hip_generated_THCStorage.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:3395: caffe2/CMakeFiles/caffe2_hip.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....

To Reproduce

Steps to reproduce the behavior:

  1. Install ROCm / libraries from apt repo
  2. install rocmSPARSE and hipSPARSE from source
  3. clone rocm pytorch repo
  4. python3 tools/amd_build/build_pytorch_amd.py
  5. USE_ROCM=1 python3 setup.py build

Expected behavior

I expect pytorch to compile.

Environment

Ubuntu 18.10.
-- Using python found in /usr/bin/python3
HIP VERSION: 1.5.18353

***** Library versions from dpkg *****

hsakmt-roct VERSION: 1.0.9-8-g238782c
hsakmt-roct-dev VERSION: 1.0.9-8-g238782c
hsa-ext-rocr-dev VERSION: 1.1.9-9-ge4ab040
hsa-rocr-dev VERSION: 1.1.9-9-ge4ab040
hcc VERSION: 1.2.18354
hip_base VERSION: 1.5.18353
hip_hcc VERSION: 1.5.18353

***** Library versions from cmake find_package *****

rocrand VERSION: 1.8.1
hiprand VERSION: 1.8.1
rocblas VERSION: 0.14.2.4
miopen VERSION: 1.5.0-e1f0433
miopengemm VERSION: 1.1.5-9547fb9
rocfft VERSION: 0.8.6.0
hipsparse VERSION: 0.1.3.2
rocsparse VERSION: 0.1.3.2
ROCm is enabled.

NOTE: I do NOT have the Ubuntu package libthrust-dev installed. If I install it, the build fails with other errors saying CUDA isn't installed.

Additional context

I have at least some ROCm demos working on my HP EliteBook 745 G5 (Ryzen 5 PRO 2500U), so I thought I'd see whether I can use PyTorch with ROCm for my ML projects.
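
The fatal error means the compiler found no `thrust/device_ptr.h` on its include path. A quick way to confirm which directory, if any, provides the header is sketched below. The directory list is an assumption about where headers might be installed, not something stated in the report, and `find_header` is a hypothetical helper.

```python
import os


def find_header(relative_path, include_dirs):
    """Return the first include dir that contains relative_path, else None."""
    for directory in include_dirs:
        if os.path.isfile(os.path.join(directory, relative_path)):
            return directory
    return None


# Hypothetical search list; adjust to where your ROCm packages install headers.
CANDIDATE_DIRS = ["/opt/rocm/include", "/usr/include", "/usr/local/include"]
print(find_header("thrust/device_ptr.h", CANDIDATE_DIRS))
```

If this prints `None`, a HIP-compatible Thrust is probably missing from the system; if it prints `/usr/include`, the header likely comes from the CUDA-oriented distribution package mentioned in the note above.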

Building PyTorch with ROCm

❓ Questions and Help

Please note that this issue tracker is not a help form and this issue will be closed.

I'm trying to build PyTorch to run on ROCm (Ubuntu 18.04) and am having issues. I tried the following.

  1. I followed https://github.com/ROCmSoftwarePlatform/pytorch/wiki/Building-PyTorch-for-ROCm but it seems to have failed at pyyaml (https://gist.github.com/briansp2020/114bd75ff0182197cf7efc7af265e89c)
    I got past that error by installing wheel. However, the build still failed later (https://gist.github.com/briansp2020/2719353d626968082410011dc36608cf)

  2. I tried building it in the TensorFlow docker image and got https://gist.github.com/briansp2020/2a109c0f1d40b45299cb73a76a255767

It seems the wiki is out of date and I needed the latest rocSPARSE (https://github.com/ROCmSoftwarePlatform/rocSPARSE/releases) to get past the CMake phase. Unfortunately, the build still failed (https://gist.github.com/briansp2020/52047cf73d8d59ddd72f730d779b952c).

Do you have up-to-date instructions on how to build PyTorch with ROCm? My goal is to run fast.ai on a Vega FE with ROCm.

Thanks!

[Caffe2] GPU memory access fault while running OverFeat benchmark for batch size 1 and 4

python ../../caffe2/python/convnet_benchmarks.py --batch_size 1 --model OverFeat --net_type simple --layer_wise_benchmark True 2>&1 | tee caffe2_overfeat_bs1.txt
I0801 15:38:58.072324 30959 net_simple.cc:101] Starting benchmark.
I0801 15:38:58.072343 30959 net_simple.cc:102] Running warmup runs.
Memory access fault by GPU node-1 on address 0x524400000. Reason: Page not present or supervisor privilege.
OverFeat: running forward-backward.
*** Aborted at 1533163139 (unix time) try "date -d @1533163139" if you are using GNU date ***
PC: @ 0x7f307508d428 gsignal
*** SIGABRT (@0x3e8000078ef) received by PID 30959 (TID 0x7f3023c7c700) from PID 30959; stack trace: ***
@ 0x7f3075433390 (unknown)
@ 0x7f307508d428 gsignal
@ 0x7f307508f02a abort
@ 0x7f30423e5155 (unknown)
@ 0x7f30423ebafd (unknown)
@ 0x7f30423b6817 (unknown)
@ 0x7f30754296ba start_thread
@ 0x7f307515f41d clone
@ 0x0 (unknown)

[Pytorch] AMD GPUs benchmarks

Hi. Thanks for this work, guys.

I was curious whether you have been able to benchmark the framework on AMD GPUs. I've successfully built PyTorch with ROCm support following your instructions, and the benchmarks I got don't seem right. I'm testing with a Radeon 580, which should deliver roughly half the performance of a 1080 Ti, but I'm seeing more like a 9-10x drop in convolution performance. The TensorFlow benchmarks already show that the gap shouldn't be that wide.

Is this expected for the moment?

Building PyTorch w/o Docker ?

Hi, I'm trying to get my AMD system set up to run some Torch software. I'd prefer not to have to mess with Docker. Is there a reason to use it?

Is there a way to build this without Docker?

"torch._C._cuda_getDevice()" fails in Python3 but succeeds in Python2

Python3

root@0e76836e0bcf:/data/pytorch/examples/rl_a3c_pytorch# python3
Python 3.5.2 (default, Nov 12 2018, 13:43:14)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch._C._cuda_getDevice()
THCudaCheck FAIL file=/pytorch/torch/csrc/cuda/Module.cpp line=53 error=35 : CUDA driver version is insufficient for CUDA runtime version
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: cuda runtime error (35) : CUDA driver version is insufficient for CUDA runtime version at /pytorch/torch/csrc/cuda/Module.cpp:53
>>> quit()

Python2

root@0e76836e0bcf:/data/pytorch/examples/rl_a3c_pytorch# python
Python 2.7.12 (default, Nov 12 2018, 14:36:49)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch._C._cuda_getDevice()
0L
>>> quit()

Python3 was built as follows:

First installed the following:

apt-get install python3-dev
apt-get install -y python3-pip
alias python=python3

then built with no changes

.jenkins/pytorch/build.sh
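
One thing worth ruling out is that the two interpreters import different `torch` installations. Shell aliases are normally not expanded inside non-interactive scripts, so `alias python=python3` may not have affected what `.jenkins/pytorch/build.sh` built. The sketch below reports where a module would be loaded from without importing it; it assumes Python 3.4 or later, and `module_origin` is a hypothetical helper.

```python
import importlib.util
import sys


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


print("interpreter:", sys.executable)
print("torch origin:", module_origin("torch"))
```

If the Python 3 origin points to a site-packages copy rather than the local source tree, the error may come from a separately installed CUDA build instead of the ROCm build.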

Multi-GPU support

Currently, we do not support multi-GPU on ROCm, nor do we assume it works right now. This issue is tracking insights and progress as we are trying to enable this.

[Caffe2] Cudnn operator update

The goal is to review cudnn implementation of ops and implement miopen version if applicable

  • affine_channel
  • sigmoid/tanh
  • transpose
  • dropout
  • depthwise_3x3_conv

Need recipe for integrating custom CUDA kernels

📚 Documentation

The PyTorch-based Faster R-CNN model uses a few special CUDA kernels such as NMS, ROI_Pooling, ROI_Align and ROI_Crop.

The integration steps under CUDA are available here

For ROCm integration, I'm guessing the first step is hipification.

/opt/rocm/hip/bin/hipify-perl nms_cuda_kernel.cu > nms_hip_kernel.cpp

What's next? PyTorch 1.0 related packaging?
Kindly provide instructions for the remaining steps for PyTorch 1.0.

In particular, instructions that could replace the code snippets below.

from torch.utils.cpp_extension import CUDAExtension

I've tried a few things, but the above import seems hard-wired to CUDA.

if torch.cuda.is_available() and CUDA_HOME is not None:
    extension = CUDAExtension
    sources += source_cuda
    define_macros += [("WITH_CUDA", None)]
    extra_compile_args["nvcc"] = [
        "-DCUDA_HAS_FP16=1",
        "-D__CUDA_NO_HALF_OPERATORS__",
        "-D__CUDA_NO_HALF_CONVERSIONS__",
        "-D__CUDA_NO_HALF2_OPERATORS__",
    ]

ext_modules = [
    extension(
        "model._C",
        sources,
        include_dirs=include_dirs,
        define_macros=define_macros,
        extra_compile_args=extra_compile_args,
    )
]
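
For the hipification step alone, the manual command above could be repeated for every kernel source with a small helper. This sketch covers only that first step and does not answer the packaging question. The output naming rule (replacing "cuda" with "hip" and using a `.cpp` suffix) is an assumption generalized from the single example in this issue, and both function names are hypothetical.

```python
import os

HIPIFY = "/opt/rocm/hip/bin/hipify-perl"  # path taken from the command above


def hip_output_name(cuda_source):
    """Map 'nms_cuda_kernel.cu' to 'nms_hip_kernel.cpp' (naming is an assumption)."""
    stem, _ = os.path.splitext(os.path.basename(cuda_source))
    new_name = stem.replace("cuda", "hip") + ".cpp"
    return os.path.join(os.path.dirname(cuda_source), new_name)


def hipify_commands(cuda_sources):
    """Return one shell command string per source, mirroring the manual step."""
    return ["%s %s > %s" % (HIPIFY, src, hip_output_name(src)) for src in cuda_sources]


for command in hipify_commands(["nms_cuda_kernel.cu"]):
    print(command)
```

The printed commands can be reviewed and then run by hand or from a build script.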

Cannot train on gfx803

🐛 Bug

Compiling PyTorch in the rocm/pytorch:rocm2.1 docker image, I'm getting a ton of "warning: loop not unrolled" messages. I don't see them in any of your CI output or other snippets posted here, so I wonder whether they might be the reason for my problems. I have three tests failing, two with errors similar to another open issue, and neural network training isn't working for me.

In the PyTorch beginning tutorial, there are no errors, but the network is clearly not being trained:

[1,  2000] loss: 2.304
[1,  4000] loss: 2.303
[1,  6000] loss: 2.303
[1,  8000] loss: 2.303
[1, 10000] loss: 2.303
[1, 12000] loss: 2.304
[2,  2000] loss: 2.303
[2,  4000] loss: 2.303
[2,  6000] loss: 2.303
[2,  8000] loss: 2.304
[2, 10000] loss: 2.304
[2, 12000] loss: 2.303
Finished Training

Just to be clear, the loss function should converge towards 1.0, and does when run via CPU.
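
The value the loss is stuck at is itself informative. Assuming the tutorial in question is a 10-class classifier trained with cross-entropy loss (the report does not name it, so this is an assumption), a model that assigns equal probability to every class has a loss of ln(10):

```python
import math


def chance_level_cross_entropy(num_classes):
    """Cross-entropy (in nats) of a uniform prediction over num_classes."""
    return -math.log(1.0 / num_classes)


print(round(chance_level_cross_entropy(10), 3))  # 2.303
```

Under that assumption, a loss pinned at 2.303 means the network is still predicting at chance level, which is consistent with the weights not being updated at all rather than with slow learning.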

My PyTorch is at least partly working - I've been using it to run https://github.com/xinntao/ESRGAN, and the results are clearly superior to running via CPU. I have no idea if I'm doing something wrong with the compile or there's a bug somewhere, but it seems to be training rather than executing that is broken.

Environment

rocm/pytorch:rocm2.1 docker after apt full-update. Host: Ubuntu 18.10, Ryzen 5 1600x, 16GB RAM. I've tried both lowering MAX_JOBS and creating a large swap file to avoid memory issues, but none of that affects the errors.

Here's everything from your environment script that got a value:

PyTorch version: 1.1.0a0+c751cf8
Is debug build: No

OS: Ubuntu 16.04.5 LTS
CMake version: version 3.6.3

Python version: 2.7
Is CUDA available: Yes

Versions of relevant libraries:
[pip] numpy==1.15.4
[pip] torch==1.1.0a0+c751cf8
[pip] torchvision==0.2.1

GPU

R9 Fury, target gfx803. I wonder if using an older, non-default target may be part of my problem. I understand older GPUs naturally receive less focus, though I hope you'll be able to look at it if there is a gfx803 issue.

Output

Example warning:

In file included from /data/development/rocm-pytorch/aten/src/THH/THHTensorSort.cuh:8:
/data/development/rocm-pytorch/aten/src/THH/THHSortUtils.cuh:141:1: 
warning: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Wpass-failed=transform-warning]

Test Output:

======================================================================
FAIL: test_broadcast_batched_matmul (test_cuda.TestCuda)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/data/development/rocm-pytorch/test/common_utils.py", line 296, in wrapper
    method(*args, **kwargs)
  File "/data/development/rocm-pytorch/test/test_cuda.py", line 2218, in test_broadcast_batched_matmul
    _TestTorchMixin._test_broadcast_batched_matmul(self, lambda t: t.cuda())
  File "/data/development/rocm-pytorch/test/test_torch.py", line 3760, in _test_broadcast_batched_matmul
    verify_batched_matmul(*indices)
  File "/data/development/rocm-pytorch/test/test_torch.py", line 3752, in verify_batched_matmul
    self.assertEqual(truth, maybe_squeeze_result(l, r, out))
  File "/data/development/rocm-pytorch/test/common_utils.py", line 427, in assertEqual
    assertTensorsEqual(x, y)
  File "/data/development/rocm-pytorch/test/common_utils.py", line 408, in assertTensorsEqual
    self.assertTrue(torch.equal(nan_mask, torch.isnan(b)), message)
AssertionError: False is not true : 

======================================================================
FAIL: test_broadcast_fused_matmul (test_cuda.TestCuda)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/data/development/rocm-pytorch/test/common_utils.py", line 296, in wrapper
    method(*args, **kwargs)
  File "/data/development/rocm-pytorch/test/test_cuda.py", line 2215, in test_broadcast_fused_matmul
    _TestTorchMixin._test_broadcast_fused_matmul(self, lambda t: t.cuda())
  File "/data/development/rocm-pytorch/test/test_torch.py", line 3689, in _test_broadcast_fused_matmul
    self.assertEqual(r0, r1)
  File "/data/development/rocm-pytorch/test/common_utils.py", line 427, in assertEqual
    assertTensorsEqual(x, y)
  File "/data/development/rocm-pytorch/test/common_utils.py", line 419, in assertTensorsEqual
    self.assertLessEqual(max_err, prec, message)
AssertionError: tensor(9., device='cuda:0', dtype=torch.float32) not less than or equal to 1e-05 : 

======================================================================
FAIL: test_randperm_cuda (test_cuda.TestCuda)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/data/development/rocm-pytorch/test/common_utils.py", line 296, in wrapper
    method(*args, **kwargs)
  File "/data/development/rocm-pytorch/test/test_cuda.py", line 2513, in test_randperm_cuda
    self.assertEqual(res1, res2, 0)
  File "/data/development/rocm-pytorch/test/common_utils.py", line 427, in assertEqual
    assertTensorsEqual(x, y)
  File "/data/development/rocm-pytorch/test/common_utils.py", line 419, in assertTensorsEqual
    self.assertLessEqual(max_err, prec, message)
AssertionError: tensor(9223372036854775492, device='cuda:0') not less than or equal to 0 : 

----------------------------------------------------------------------
Ran 150 tests in 7.430s

FAILED (failures=3, skipped=92)

[Detectron] Incorrect use of max and abs

After enabling Detectron files hipification and building in PR #295, there are warnings while building the project from the following files:
sigmoid_focal_loss_op_hip.cc
ps_roi_pool_op_hip.cc
smooth_l1_loss_op_hip.cc
The warnings are because HIP does not overload max and abs functions.

Please add appropriate checks, #if defined(__HIP_PLATFORM_HCC__), and use more specific HIP functions such as fmaxf and fabsf in the corresponding CUDA files.

[PyTorch] Investigate early runtime error.

Currently, when using PyTorch with ROCm, you'll notice the following error:

import torch
torch.Tensor(1).cuda()

RuntimeError: torch.cuda.sparse.FloatTensor is not enabled.

However, the error disappears by executing torch.cuda._lazy_init() very early.

import torch
torch.cuda._lazy_init()
torch.Tensor(1).cuda()
tensor([ 0], device='cuda:0')

Why don't we have a PyTorch-ROCM pip package like tensorflow?

🚀 Feature

PyTorch ROCM package as a pip package like tensorflow-rocm would be great.

Motivation

While it is already a pain for newer users to get things up and running, pytorch installation for rocm platform is just a lot for newer users. Since there is a tensorflow-rocm package for new users to easily download and install, I think PyTorch should have it too for the users who prefer pytorch over tensorflow.

Pitch

PyTorch ROCM package as a pip package like tensorflow-rocm would be great.

Alternatives

A conda package would be great too

Aten missing files for caffe2_hip

Issue description

ATen appears to be missing files for caffe2_hip, which prevents the installation of torch.

In file included from /home/user/dev/pytorch/aten/src/THC/THCTensorIndex.cu:12:
/home/user/dev/pytorch/aten/src/THC/THCAtomics.cuh:145:35: error: static declaration of 'atomicAdd' follows non-static declaration
static inline __device__ void atomicAdd(double *address, double val) { }
^
/opt/rocm/hip/include/hip/hcc_detail/hip_atomic.h:73:8: note: previous definition is here
double atomicAdd(double* address, double val)
^
1 error generated.
[100%] Linking HIP shared library ../lib/libcaffe2_hip.so
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THC/caffe2_hip_generated_THCTensorIndex.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THC/caffe2_hip_generated_THCTensorScatterGather.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_FeatureLPPooling.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_IndexLinear.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_SpatialAdaptiveAveragePooling.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_SpatialAdaptiveMaxPooling.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_SpatialClassNLLCriterion.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_SpatialFractionalMaxPooling.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_SpatialGridSamplerBilinear.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_SpatialReflectionPadding.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_SpatialReplicationPadding.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_SpatialSubSampling.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_SpatialUpSamplingBilinear.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_SpatialUpSamplingNearest.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_TemporalMaxPooling.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_TemporalReflectionPadding.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_TemporalReplicationPadding.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_TemporalUpSamplingLinear.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_TemporalUpSamplingNearest.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_VolumetricAdaptiveAveragePooling.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_VolumetricAdaptiveMaxPooling.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_VolumetricAveragePooling.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_VolumetricDilatedMaxPooling.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_VolumetricFractionalMaxPooling.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_VolumetricGridSamplerBilinear.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_VolumetricReplicationPadding.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_VolumetricUpSamplingNearest.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/THCUNN/caffe2_hip_generated_VolumetricUpSamplingTrilinear.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/ATen/native/cuda/caffe2_hip_generated_Activation.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/ATen/native/cuda/caffe2_hip_generated_Distributions.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/ATen/native/cuda/caffe2_hip_generated_EmbeddingBag.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/ATen/native/cuda/caffe2_hip_generated_Gesv.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/ATen/native/cuda/caffe2_hip_generated_SummaryOps.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/ATen/native/cuda/caffe2_hip_generated_TensorCompare.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/ATen/native/sparse/cuda/caffe2_hip_generated_SparseCUDATensor.cu.o'
clang-7.0: error: no such file or directory: 'CMakeFiles/caffe2_hip.dir/__/aten/src/ATen/native/sparse/cuda/caffe2_hip_generated_SparseCUDATensorMath.cu.o'
[100%] Built target caffe2_hip
Install the project...

CMake Error at caffe2/cmake_install.cmake:69 (file):
file INSTALL cannot find
"/home/user/dev/pytorch/build/lib/libcaffe2_hip.so".
Call Stack (most recent call first):
cmake_install.cmake:86 (include)

System Info

Ubuntu 16.04
Kernel 4.13.0-45-generic

build from source inside docker does not work.

🐛 Bug

build inside docker does not work.

To Reproduce

Steps to reproduce the behavior:
Build from sources section of ./rocm-docs/caffe2-build.md

Dump Gist

error:

caffe2/CMakeFiles/caffe2.dir/build.make:4182: recipe for target 'caffe2/CMakeFiles/caffe2.dir/contrib/aten/aten_op.cc.o' failed
make[2]: *** [caffe2/CMakeFiles/caffe2.dir/contrib/aten/aten_op.cc.o] Error 254

PyTorch Performance Drop for Resnet50 and Resnet101

🐛 Bug

We are observing consistent performance drops for Resnet50 and Resnet101 with PyTorch on both Vega20 and MI25. MIOpen commit details below.

MIOpen Commit Details
commit 74782da0cf9b1dff8ea6dcfe14e450a3531359d1
Author: Daniel Lowell
Date: Mon Dec 17 16:53:33 2018 -0600

Removed redundant else condition.

To Reproduce

Steps to reproduce the behavior:

  1. Load up docker image lcskrishna/rocm-pytorch pfl-1.9.2
  2. Build and install MIOpen in the docker as per the commit details provided above
  3. Run Resnet50 with Batch-size 64
  4. Re-build MIOpen with a commit 1 week old (say Dec 12). Try running the same benchmark again

GPUs observed: MI25, Vega20
ROCm Version: 1.9.307, 1.9.211

[PyTorch] Complete rocRAND Integration.

Currently, rocRAND is only partially integrated with PyTorch and remains nonfunctional for the time being. This PR is for ensuring that the rocRAND integration passes 100% of the tests. This will require coordination with rocRAND lead developers @jszuppe & @ex-rzr.
