builder's Introduction

pytorch builder

Scripts to build pytorch binaries and do end-to-end integration tests.

Folders:

  • conda : files to build conda packages of pytorch, torchvision and other dependencies and repos
  • manywheel : scripts to build linux wheels
  • wheel : scripts to build OSX wheels
  • windows : scripts to build Windows wheels
  • cron : scripts to drive all of the above scripts across multiple configurations together
  • analytics : scripts to pull wheel download count from our AWS s3 logs

Testing

To test builds triggered by the PyTorch repo's GitHub Actions, see these instructions.

builder's People

Contributors

alband, anderspapitto, andysamfb, atalman, bertmaher, driazati, ezyang, hughperkins, huydhn, izaitsevfb, janeyx99, jeffdaily, jithunnair-amd, kyleczh, langong347, malfet, mingbowan, mszhanyi, nweidia, peterjc123, pjh5, ptrblck, pytorchbot, seemethere, snadampal, soumith, syed-ahmed, zasdfgbnm, zhuhong61, zou3519

builder's Issues

cudnn with conda

Hi,
I'm trying to use the scripts to build PyTorch from source (with an MPI library)
and then upload it as an Anaconda package for easy use.
(I first tried doing it with a .whl, but it was >60MB, which is too big for PyPI, so I'm now trying conda and your scripts.)

I've noticed that instead of using a container from NVIDIA with cuDNN installed (e.g. this one),
or one similar to what the main pytorch repo uses for building from source,
you create a custom environment by installing cuDNN with wget (which does not work unless you are logged in to NVIDIA, though there are workarounds)
and then cp -P it into some /usr/local/cuda dir.

Why not take a ready-made container and work from there?
(Or even cp -P from /usr/, if that works?)
This would make the scripts here more usable for people who are familiar with installing from the normal pytorch repo.

I assume that building from source not only allows special libraries (like in my case), but can also speed up runtime (and allow more specific optimization flags during compilation), so it would be excellent if users could try compiling in different ways and then share the optimized result via conda.

I currently run into too many bugs trying to do this myself for a PyTorch nightly build on conda.
Here is what I did:

  1. Used the Dockerfile from the pytorch main repo.
  2. git clone this repo to /opt/builder.
  3. Had to change the anaconda username variable, play with some channels, and apply the aforementioned trick (install the tar with wget, then cp -P),
    then:
    ./build_pytorch.sh 101 nightly 1

Any help would be appreciated :)

Telemetry for all Pytorch repos

Some improvements I'd like to make to https://github.com/pytorch/builder/blob/master/analytics/github_analyze.py

I want to get a raw dump of all emails that contributed to any PyTorch repo and the number of contributions. Once I have that as a CSV, it's easy to generate a report using a pivot table in Excel.

To do this easily, I'd like to make the following changes:

  1. Make now = datetime.now() configurable so that custom year-long ranges can be picked
  2. Add all PyTorch repos by first iterating over all repos in the pytorch org using the GitHub CLI (this could be done outside the script, so no API authentication is required for the script to work): gh repo list pytorch | awk '{print $1}'
  3. Add repo_name as a third column in the contributor stats
  4. Change the main function to loop over all repos in a repo-paths directory (see the sketch below)

In a later PR I'd really like to also add support to count the number of comments people made on issues.
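
A minimal standalone sketch of what changes 1, 3, and 4 could look like. This is not the actual github_analyze.py interface; it is just an illustration that shells out to git over a directory of local clones, and the flag names here are made up:

import argparse
import csv
import subprocess
from collections import Counter
from datetime import datetime, timedelta
from pathlib import Path

def count_contributions(repo_path: Path, since: datetime, until: datetime) -> Counter:
    # Count commits per author email in a local clone, within [since, until).
    log = subprocess.run(
        ["git", "-C", str(repo_path), "log",
         f"--since={since.isoformat()}", f"--until={until.isoformat()}", "--pretty=%ae"],
        check=True, capture_output=True, text=True,
    ).stdout
    return Counter(line for line in log.splitlines() if line)

def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--repo-paths", type=Path, required=True,
                        help="directory containing one local clone per repo")
    parser.add_argument("--until", default=datetime.now().isoformat(),
                        help="end of the window (ISO date) instead of a hardcoded datetime.now()")
    parser.add_argument("--days", type=int, default=365, help="window length in days")
    parser.add_argument("--out", default="contributors.csv")
    args = parser.parse_args()

    until = datetime.fromisoformat(args.until)
    since = until - timedelta(days=args.days)

    with open(args.out, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["email", "commit_count", "repo_name"])  # repo_name as the third column
        for repo_path in sorted(p for p in args.repo_paths.iterdir() if p.is_dir()):
            for email, count in count_contributions(repo_path, since, until).items():
                writer.writerow([email, count, repo_path.name])

if __name__ == "__main__":
    main()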

add conda recipe for torchaudio

There are many issues from users having difficulty building the torchaudio extension. Maybe it would be good to provide conda builds?

I could create a recipe if that would be helpful.

PyPI Packaging for C++ Extension

I have developed a C++ pytorch extension that I want to deploy on PyPI.

It currently isn't possible to build a manylinux-compatible wheel that depends on pytorch using the semi-official instructions (https://github.com/pypa/manylinux), because pytorch requires a newer glibc version than manylinux allows (or at least that's the problem I ran into first).

Since the pytorch wheels work well and are not completely manylinux compliant, I think a similar procedure for extensions should be fine. I currently just build on my machine and rename the wheel from linux to manylinux, but then I can't run auditwheel to include the dependencies.

Is there a known procedure for doing this?

PyTorch 1.8.1 on conda is 1.27GB

Considering that on conda we link against the conda-provided CUDA toolkit and the conda-provided MKL, it seems like there's a bug somewhere if our pytorch binaries on conda are this large.
Please check.

Can I build the libtorch DLLs and LIBs from source with build_libtorch.bat, without CAFFE2?

This is a question, not a bug report.

I want to build the DLLs and LIBs for libtorch v1.4 from source.
I was able to get libtorch-win-shared-with-deps-1.4.0.zip with build_pytorch.bat in builder/windows.
But I don't need CAFFE2.
Is there a way to build libtorch's DLLs and LIBs without CAFFE2?

[Environment]
Windows 10 Pro 64-bit, version 1909
Microsoft Visual Studio 2017 Community
CPU: Core-i5 4670

[Target]
I wanted to build libtorch's DLLs, LIBs, etc. for v1.4 with CUDA 10.1 from source myself.

[Procedure]
1. Open Command Prompt as Administrator
2. cd /
3. C:> chcp 65001
4. C:> git clone https://github.com/pytorch/builder.git
5. C:> cd builder\windows
6. C:> set BUILD_PYTHONLESS=1
7. C:> set PYTORCH_REPO=pytorch
8. C:> set PYTORCH_BRANCH=v1.4.0
9. C:> build_libtorch.bat 101 1.4.0 0.5.0

After about 8 hours:
10. I got libtorch-win-shared-with-deps-1.4.0.zip in the builder/windows/output/cuda101 folder.

If I set BUILD_CAFFE2_OPS=0 before step 9, can I get libtorch's DLLs and LIBs without CAFFE2?
If I can skip building CAFFE2, I think the build time will be much shorter.

Regards,

BASE_CUDA_VERSION missing in cuda_final stage of Dockerfile

Unlike other stages in the multi-stage Dockerfile (e.g. cpu_final stage), the cuda_final stage's BASE_CUDA_VERSION variable lacks a re-declaration and thus falls out of the ARG's scope, at least in Podman 3.4.x. This results in a failure on this line of the cuda_final build stage because BASE_CUDA_VERSION is not defined.

A minimal example of the scoping issue in Docker 19.03 is demonstrated below; it is based on the pytorch/builder Dockerfile_2014 stages and arguments:

$ docker --version
Docker version 19.03.12, build 48a66213fe

$ cat Dockerfile 
# syntax = docker/dockerfile:experimental
ARG BASE_CUDA_VERSION=10.2
ARG GPU_IMAGE=docker.io/nvidia/cuda:${BASE_CUDA_VERSION}-devel-centos7
FROM docker.io/alpine:3.14 as base
RUN echo "base stage does not use BASE_CUDA_VERSION"

FROM ${GPU_IMAGE} as common
RUN echo "common stage BASE_CUDA_VERSION $BASE_CUDA_VERSION"

FROM common as cpu_final
ARG BASE_CUDA_VERSION=10.2
RUN echo "cpu_final stage BASE_CUDA_VERSION $BASE_CUDA_VERSION"

FROM cpu_final as cuda_final
RUN echo "cuda_final stage BASE_CUDA_VERSION $BASE_CUDA_VERSION"

$ docker build -t test:test -f Dockerfile .
Sending build context to Docker daemon   2.56kB
Step 1/11 : ARG BASE_CUDA_VERSION=10.2
Step 2/11 : ARG GPU_IMAGE=docker.io/nvidia/cuda:${BASE_CUDA_VERSION}-devel-centos7
Step 3/11 : FROM docker.io/alpine:3.14 as base
 ---> 0a97eee8041e
Step 4/11 : RUN echo "base stage does not use BASE_CUDA_VERSION"
 ---> Running in 5ee943513795
base stage does not use BASE_CUDA_VERSION
Removing intermediate container 5ee943513795
 ---> bef23c9a951d
Step 5/11 : FROM ${GPU_IMAGE} as common
 ---> 40f12626d012
Step 6/11 : RUN echo "common stage BASE_CUDA_VERSION $BASE_CUDA_VERSION"
 ---> Running in 21f013b675e8
common stage BASE_CUDA_VERSION 
Removing intermediate container 21f013b675e8
 ---> 5bc92cd1b4dd
Step 7/11 : FROM common as cpu_final
 ---> 5bc92cd1b4dd
Step 8/11 : ARG BASE_CUDA_VERSION=10.2
 ---> Running in 4e876c5a312f
Removing intermediate container 4e876c5a312f
 ---> 82a72f8f9705
Step 9/11 : RUN echo "cpu_final stage BASE_CUDA_VERSION $BASE_CUDA_VERSION"
 ---> Running in 3b324b0812db
cpu_final stage BASE_CUDA_VERSION 10.2
Removing intermediate container 3b324b0812db
 ---> 5da620f3c17a
Step 10/11 : FROM cpu_final as cuda_final
 ---> 5da620f3c17a
Step 11/11 : RUN echo "cuda_final stage BASE_CUDA_VERSION $BASE_CUDA_VERSION"
 ---> Running in 6590e6bebf82
cuda_final stage BASE_CUDA_VERSION 
Removing intermediate container 6590e6bebf82
 ---> e38568c11715
Successfully built e38568c11715
Successfully tagged test:test

Note how BASE_CUDA_VERSION is empty in the echo of the cuda_final stage, while cpu_final (which re-declares the ARG) sees 10.2. Re-declaring ARG BASE_CUDA_VERSION at the start of the cuda_final stage would bring it back into scope.

Package in conda-forge?

Any chance to get the conda recipes you have here moved to conda-forge?

That would allow creating conda-forge packages that depend on pytorch 1.0 or greater.

I can make a PR from the recipes you have in conda/ if you want.

Building Libtorch via an IDE on Windows 10: Adding -INCLUDE:?warp_size@cuda@at@@YAHXZ to the linker flags raises unresolved external symbol error.

I built libtorch "libtorch-win-shared-with-deps-debug-1.9.1+cu111" in Visual Studio.

The build runs fine on CPU, but when I try to link against torch_cuda.dll by adding -INCLUDE:?warp_size@cuda@at@@YAHXZ to the linker flags, I get the error:

LNK2001 unresolved external symbol ""int __cdecl at::cuda::warp_size(void)" (?warp_size@cuda@at@@YAHXZ)"

I include all *.dll files from the libtorch lib folder, including torch_cuda.dll, torch_cuda_cpp.dll and torch_cuda_cu.dll. As I said, the build runs as expected on CPU.

Where are the windows CUDA builds?

I see that Windows is built CPU-only on CI and the nightly crons don't build Windows at all. I'm curious where the Windows release builds come from.

Conda nightly gcc>=5

Is it possible to have a pytorch-nightly conda package built with gcc>=5, or at least with -D_GLIBCXX_USE_CXX11_ABI=1?
Currently the default is 0, which creates a dual-ABI problem when linking with gcc>=5 apps and libraries that don't set -D_GLIBCXX_USE_CXX11_ABI=0 explicitly.
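
For reference, a quick way to check which ABI an installed build was compiled with, using PyTorch's public helper:

import torch

# True means the build used -D_GLIBCXX_USE_CXX11_ABI=1, False means 0.
print(torch.compiled_with_cxx11_abi())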

magma + devtoolset7 bumps binary sizes by unreasonable amounts

In the last round of base library upgrades, our manylinux nightlies bumped up significantly in binary size, for what seemed to be no reason at all. The conda builds were doing fine.

It was really puzzling, and I dug into many dimensions as to why this regression happened.

I concluded that the fix is to compile magma with devtoolset3, whereas the (larger) pytorch builds had been upgraded to devtoolset7.

magma[devtoolset3] + pytorch[devtoolset3] = 709MB
magma[devtoolset7] + pytorch[devtoolset7] = 806MB
magma[devtoolset3] + pytorch[devtoolset7] = ~730MB (final size is 713MB, but that was because I combined it with another fix pytorch/pytorch#23776)

Surprisingly, libmagma_static.a and the other static binaries in magma didn't change in size; it's just the PyTorch binary that grew by 100MB.

This is still puzzling and needs to be dug into.

Eliminate manual step (docker repo creation) for manylinux-cuda workflow for new CUDA

Currently we have a manual step when releasing a new CUDA version: we have to manually create the following Docker repo and give the bots read and write permission to it.

This is an example of a repo we create:
https://hub.docker.com/r/pytorch/manylinux-cuda115

We want to either automate this step (repo creation),

or host all manylinux CUDA images in one place, similar to the following NVIDIA repo:
https://hub.docker.com/r/nvidia/cuda/tags?page=1, where all images are hosted in the same repository.

`torch_stable.html` wheel index is not PEP 503 compliant

The torch_stable.html index of wheels does not match the spec for a "simple" index:

https://download.pytorch.org/whl/cu100/torch_stable.html

I propose modifying the HTML generation script here (https://github.com/pytorch/builder/blob/master/cron/update_s3_htmls.sh) so that a /simple directory with the correct file structure is generated in addition to the top-level torch_stable.html, linking back to the top-level wheel files.

https://www.python.org/dev/peps/pep-0503/
https://packaging.python.org/guides/hosting-your-own-index/

If pip were aware of a valid torch index, builds with local version identifiers could be installed without having to pass the HTML URL explicitly.
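
For illustration, here is a rough sketch of what generating such a PEP 503 layout could look like, starting from a flat list of wheel URLs like the entries already in torch_stable.html. This is not the existing update_s3_htmls.sh logic, just a minimal illustration:

import re
from collections import defaultdict
from pathlib import Path

def normalize(name):
    # PEP 503 name normalization: runs of -, _, . collapse to a single dash.
    return re.sub(r"[-_.]+", "-", name).lower()

def write_simple_index(wheel_urls, out_dir):
    projects = defaultdict(list)
    for url in wheel_urls:
        filename = url.rsplit("/", 1)[-1]
        project = filename.split("-", 1)[0]  # e.g. "torch" from "torch-1.10.1-..."
        projects[normalize(project)].append(url)

    simple = Path(out_dir) / "simple"
    simple.mkdir(parents=True, exist_ok=True)
    # Root index lists one link per project.
    root_links = "\n".join(
        f'<a href="{name}/">{name}</a><br/>' for name in sorted(projects)
    )
    (simple / "index.html").write_text(f"<html><body>\n{root_links}\n</body></html>\n")
    # Each project page links back to the top-level wheel files.
    for name, urls in projects.items():
        page = simple / name
        page.mkdir(exist_ok=True)
        links = "\n".join(f'<a href="{u}">{u.rsplit("/", 1)[-1]}</a><br/>' for u in sorted(urls))
        (page / "index.html").write_text(f"<html><body>\n{links}\n</body></html>\n")

# Example:
# write_simple_index(["https://download.pytorch.org/whl/cu100/torch-1.3.0%2Bcu100-cp36-cp36m-linux_x86_64.whl"], ".")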

Trigger CI builds for pytorch/pytorch and domain libs in pytorch/builder via inter repository dispatch

This is the first step to synchronize and consolidate pytorch repository builds in the builder repo:

  • pytorch/builder will send a dispatch event to one of pytorch/pytorch, pytorch/vision, pytorch/text, pytorch/audio
  • Each repository will listen on the dispatch event to trigger their own CI build

Status:

  1. Enabled cross-repo triggering of binary builds from pytorch/builder to pytorch/pytorch through the CircleCI API:
    • Pull request: #874
    • Triggering the workflow on the master branch of pytorch/pytorch still needs to resolve a concurrency issue.
    • Collect results from pytorch/pytorch and display them inside pytorch/builder?
    • Should we roll back #860, which previously enabled binary builds, since both aim at testing changes in pytorch/builder from the workflow of a dependent repo? Or should we carve out binary builds and migrate them standalone into pytorch/builder?
  2. Triggering through GHA is blocked at the moment due to limitations of the GitHub API:
    • Approach 1: reusable workflow (concurrency is not supported in a called workflow, as in the case of the binary build; example GHA run)
    • Approach 2: repository dispatch event trigger (only the default branch of the dependent repo can be dispatched onto, which means we won't be able to use a test branch in pytorch/pytorch to test the triggering; see ref)
    • Pull request: #877

Parent issue tracker: pytorch/pytorch#66656

add perplexity / convergence checks for word_language_model integration tests

We have integration tests that we run before every release, for end-to-end workflows.

Ref: http://ossci-integration-test-results.s3-website-us-east-1.amazonaws.com/test-results.html
Scripts are located here: https://github.com/pytorch/builder/tree/master/test_community_repos

There are some things to be fixed in these tests.

In this issue, we want to fix the word_language_model example: https://github.com/pytorch/builder/tree/master/test_community_repos/examples/word_language_model

Currently the script checks that the example ran without error.
We want not just to see that it ran successfully, but also that it achieved a particular minimum validation perplexity.

So you have to modify the shell scripts in the folder above to parse the validation perplexity out of stdout/stderr and then check that it meets a particular threshold (a sketch of such a check follows below).
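
A minimal sketch of that check, written as a small Python helper the shell script could call. It assumes the example's output contains lines of the form "... valid ppl <number>"; the exact log format and a sensible threshold would need to be confirmed against a real word_language_model run:

import re
import subprocess
import sys

MAX_ALLOWED_PPL = 150.0  # illustrative threshold, to be tuned against real runs

def run_and_check(cmd):
    result = subprocess.run(cmd, capture_output=True, text=True)
    output = result.stdout + result.stderr
    print(output)
    if result.returncode != 0:
        sys.exit("example exited with a non-zero status")
    # Pull every reported validation perplexity out of the logs.
    ppls = [float(m.group(1)) for m in re.finditer(r"valid ppl\s+([\d.]+)", output)]
    if not ppls:
        sys.exit("could not parse any validation perplexity from the output")
    best = min(ppls)
    if best > MAX_ALLOWED_PPL:
        sys.exit(f"validation perplexity {best:.2f} exceeds threshold {MAX_ALLOWED_PPL}")
    print(f"OK: best validation perplexity {best:.2f} <= {MAX_ALLOWED_PPL}")

if __name__ == "__main__":
    # Illustrative invocation; the real test script may drive the example differently.
    run_and_check([sys.executable, "main.py", "--epochs", "1"])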

run_tests.sh cannot be found

When building a manylinux wheel from the manywheel folder, run_tests.sh cannot be found by build_common.sh

It is necessary to mount the root of the repo into the container (-v "$(dirname "${PWD}")":/remote) and adjust the call to build.sh accordingly.

BLAS library build for AArch64 wheels

There are a number of issues with the current AArch64 whl build in https://github.com/pytorch/builder/blob/master/build_aarch64_wheel.py which appear to be impacting the performance of the finished whl.

  1. OpenBLAS has not been built with USE_OPENMP=1.
    The finished PyTorch build is not using a multithreaded BLAS backend as a result. This impacts performance and results in the following warning (printed OMP_NUM_THREADS times) for a simple TorchVision ResNet50 inference example: "OpenBLAS Warning : Detect OpenMP Loop and this application may hang. Please rebuild the library with USE_OPENMP=1 option."

  2. OpenBLAS is built for a NeoverseN1 target, but with a version of GCC that does not support -mtune=neoverse-n1.
    OpenBLAS correctly identifies the t6g (Neoverse N1) platform it is being built on, but GCC only supports -mtune=neoverse-n1 from v9 onwards, so the build proceeds with -march=armv8.2-a -mtune=cortex-a72 instead. Note: targeting the v8.2 ISA risks generating a binary that is not portable; a "generic" build would need to be provided for portability, although this would impact performance.

  3. The build has USE_EIGEN_FOR_BLAS set.
    This can be seen in the output of print(*torch.__config__.show().split("\n"), sep="\n"). As I understand it this should not be required if a BLAS library like OpenBLAS is provided.

  4. -march and -mtune do not appear to have been set for the PyTorch build.
    Building with -mcpu=native will choose the appropriate -march and -mtune for the host system (again, this has implications for portability).

Updating build_aarch64_wheel.py so that the OpenBLAS build uses:

LDFLAGS=-lgfortran make TARGET=NEOVERSEN1 USE_OPENMP=1 NO_SHARED=1 -j8

and the PyTorch build uses:

build_vars += f"OpenBLAS_HOME='/opt/OpenBLAS' BLAS='OpenBLAS' USE_MKLDNN=0 USE_OPENMP=1 USE_LAPACK=1 USE_CUDA=0 USE_FBGEMM=0 USE_DISTRIBUTED=0 CXXFLAGS='-mcpu=native -O3'"

Results in:

  • the disappearance of the OpenBLAS Warning : Detect OpenMP Loop and this application may hang. Please rebuild the library with USE_OPENMP=1 option. warning.
  • 30% speedup in a simple ResNet50 inference example
  • 70% fall in latency for a simple BERT example.

Would it be possible to update the AArch64 build to support multi-threaded OpenBLAS, disabling of Eigen BLAS, and the correct Neoverse optimisations throughout? This would ensure the .whl gives performance consistent with what you would get when building from source.
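
For illustration, the PyTorch-side variables proposed above could be exercised locally with something like the following. The real build_aarch64_wheel.py assembles an equivalent build_vars string and runs it on a remote host; the flags here are simply the proposal above, not a verified recipe:

import os
import subprocess

def build_pytorch_wheel(src_dir="pytorch"):
    # Mirror the proposed build_vars: point the build at OpenBLAS and disable
    # the components the proposal turns off, tuning for the host CPU.
    env = dict(os.environ,
               OpenBLAS_HOME="/opt/OpenBLAS",
               BLAS="OpenBLAS",
               USE_MKLDNN="0",
               USE_OPENMP="1",
               USE_LAPACK="1",
               USE_CUDA="0",
               USE_FBGEMM="0",
               USE_DISTRIBUTED="0",
               CXXFLAGS="-mcpu=native -O3")
    subprocess.run(["python3", "setup.py", "bdist_wheel"], cwd=src_dir, env=env, check=True)

if __name__ == "__main__":
    build_pytorch_wheel()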

manylinux wheel has incorrect hash for METADATA file

The hash in torch-1.10.1.dist-info/RECORD for the file torch-1.10.1.dist-info/METADATA doesn't match that of the actual file and so wheel3 unpack torch-1.10.1-cp39-cp39-manylinux1_x86_64.whl fails with:

Hash mismatch for file 'torch-1.10.1.dist-info/METADATA'

I've also tried on 1.9.0 and get the same.

The hash is correct in the OSX wheel. For 1.9.0, the content of the METADATA file is exactly the same in both OSX and manylinux wheels (verified by recomputing the hashes), but the hashes in the record files are different. The size is reported as 6 bytes more in the manylinux RECORD file compared to OSX. For 1.10.1, the diff between OSX and Linux METADATA is:

< License-File: LICENSE
< License-File: NOTICE

AFAICT, for the manylinux wheel the hashes are re-computed manually here, as I think several files are added/modified after the initial build with setup.py bdist_wheel. It may be that the METADATA file is modified somehow after it is extracted, a new hash is computed on the modified file, but then the original file is added back into the wheel. I haven't verified that, though, and can't see anything obvious in the code.

I see the same for torchvision (have tried on 0.10.0 and 0.11.2).

To repro:

#!/bin/bash

OSX_URL="https://files.pythonhosted.org/packages/0f/fe/15c5eb63e5b0d92343469123ecc26579a4bb30df0a2813fe68aca4d36880/torch-1.10.1-cp39-none-macosx_10_9_x86_64.whl"
MANYLINUX_URL="https://files.pythonhosted.org/packages/2c/c8/dcef19018d2fe730ecacf47650d3d6e8d6fe545f02fbdbde0174e0279f02/torch-1.10.1-cp39-cp39-manylinux1_x86_64.whl"
METADATA_FILENAME="torch-1.10.1.dist-info/METADATA"
RECORD_FILENAME="torch-1.10.1.dist-info/RECORD"
 
# OSX_URL="https://files.pythonhosted.org/packages/0c/e9/6ca380d925b3a834f0cb1cac75e4bb53c74a6170bb9b6ec40315501cbdfa/torch-1.9.0-cp39-none-macosx_10_9_x86_64.whl"
# MANYLINUX_URL="https://files.pythonhosted.org/packages/18/dc/364619ec35762f0fda9b1ac5bc73e4372a0e451f15e38ef601d3ef006a17/torch-1.9.0-cp39-cp39-manylinux1_x86_64.whl"
# METADATA_FILENAME="torch-1.9.0.dist-info/METADATA"
# RECORD_FILENAME="torch-1.9.0.dist-info/RECORD"

# OSX_URL="https://files.pythonhosted.org/packages/73/f2/b360ddc86b4b0e10750f1dcf95542c978f6a7893220af245a8e04a0d5664/torchvision-0.11.2-cp39-cp39-macosx_10_9_x86_64.whl"
# MANYLINUX_URL="https://files.pythonhosted.org/packages/92/8b/cbbcf8e055b074ea55ae07d1f3d8fb2888d47f2944ee5ea286243d4d2ba2/torchvision-0.11.2-cp39-cp39-manylinux1_x86_64.whl"
# METADATA_FILENAME="torchvision-0.11.2.dist-info/METADATA"
# RECORD_FILENAME="torchvision-0.11.2.dist-info/RECORD"

# OSX_URL="https://files.pythonhosted.org/packages/76/e6/7be72bbc5fa95a3c5f9690576bed1de45570f5f550a869bb19772f82d4c5/torchvision-0.10.0-cp39-cp39-macosx_10_9_x86_64.whl"
# MANYLINUX_URL="https://files.pythonhosted.org/packages/00/03/88edb6f9f7f17ce264a01209ac550878713a66a99d9d8e25747b15d6aadb/torchvision-0.10.0-cp39-cp39-manylinux1_x86_64.whl"
# METADATA_FILENAME="torchvision-0.10.0.dist-info/METADATA"
# RECORD_FILENAME="torchvision-0.10.0.dist-info/RECORD"


OSX_WHEEL_NAME=$(basename "${OSX_URL}")
MANYLINUX_WHEEL_NAME=$(basename "${MANYLINUX_URL}")

function download_and_extract_wheel() {
  url=$1
  wheel_name=$( basename "${url}" )
  echo "----------------------------------------"
  echo "Fetching ${wheel_name}"

  mkdir -p "${wheel_name}"
  pushd ""${wheel_name}"" > /dev/null
    wget -nc "${url}"
    wheel3 unpack "${wheel_name}"
    unzip -qo "${wheel_name}"
  popd > /dev/null
}


function print_hashes() {
  wheel_name=$1
  echo "----------------------------------------"
  echo "printing hashes"
  echo "${wheel_name}"
  computed_digest=`openssl dgst -sha256 -binary ${wheel_name}/${METADATA_FILENAME}  | openssl base64 | sed -e 's/+/-/g' | sed -e 's/\//_/g' | sed -e 's/=//g'`
  computed_size=`ls -nl ${wheel_name}/${METADATA_FILENAME}  | awk '{print $5}'`
  new_entry="${METADATA_FILENAME},sha256=${computed_digest},${computed_size}"
  echo "RECORD entry:           " $( grep METADATA ${wheel_name}/${RECORD_FILENAME} )
  echo "manually computed entry:" $new_entry
}

function diff_metadata() {
  wheel_name1=$1
  wheel_name2=$2  
  echo "----------------------------------------"
  echo "record diff:"
  diff "${wheel_name1}/${RECORD_FILENAME}" "${wheel_name2}/${RECORD_FILENAME}" | grep METADATA
  
  echo "metadata diff:"
  diff "${wheel_name1}/${METADATA_FILENAME}" "${wheel_name2}/${METADATA_FILENAME}"

}

download_and_extract_wheel "${OSX_URL}"
download_and_extract_wheel "${MANYLINUX_URL}"

print_hashes "${OSX_WHEEL_NAME}"
print_hashes "${MANYLINUX_WHEEL_NAME}"

diff_metadata "${OSX_WHEEL_NAME}" "${MANYLINUX_WHEEL_NAME}"



Output for torch-1.10.1 is:

----------------------------------------
Fetching torch-1.10.1-cp39-none-macosx_10_9_x86_64.whl
File ‘torch-1.10.1-cp39-none-macosx_10_9_x86_64.whl’ already there; not retrieving.

Unpacking to: ./torch-1.10.1...OK
----------------------------------------
Fetching torch-1.10.1-cp39-cp39-manylinux1_x86_64.whl
File ‘torch-1.10.1-cp39-cp39-manylinux1_x86_64.whl’ already there; not retrieving.

Unpacking to: ./torch-1.10.1...Hash mismatch for file 'torch-1.10.1.dist-info/METADATA'
----------------------------------------
printing hashes
torch-1.10.1-cp39-none-macosx_10_9_x86_64.whl
RECORD entry:            torch-1.10.1.dist-info/METADATA,sha256=edF_FwahbS66wyEe6zSnnQ5RuWIN03mwSkwtfeQcqT0,24871
manually computed entry: torch-1.10.1.dist-info/METADATA,sha256=edF_FwahbS66wyEe6zSnnQ5RuWIN03mwSkwtfeQcqT0,24871
----------------------------------------
printing hashes
torch-1.10.1-cp39-cp39-manylinux1_x86_64.whl
RECORD entry:            torch-1.10.1.dist-info/METADATA,sha256=y_NV0MNN6tjiqKPWDm9w7-hVFwP6XDTRVvci-guOPII,24834
manually computed entry: torch-1.10.1.dist-info/METADATA,sha256=wKopP9RszkBTQzmqb6hjYD2o43BlBmy7OdEVxSG1wF4,24828
----------------------------------------
record diff:
< torch-1.10.1.dist-info/METADATA,sha256=edF_FwahbS66wyEe6zSnnQ5RuWIN03mwSkwtfeQcqT0,24871
> torch-1.10.1.dist-info/METADATA,sha256=y_NV0MNN6tjiqKPWDm9w7-hVFwP6XDTRVvci-guOPII,24834
metadata diff:
31,32d30
< License-File: LICENSE
< License-File: NOTICE

And for torch-1.9.0:

----------------------------------------
Fetching torch-1.9.0-cp39-none-macosx_10_9_x86_64.whl
File ‘torch-1.9.0-cp39-none-macosx_10_9_x86_64.whl’ already there; not retrieving.

Unpacking to: ./torch-1.9.0...OK
----------------------------------------
Fetching torch-1.9.0-cp39-cp39-manylinux1_x86_64.whl
File ‘torch-1.9.0-cp39-cp39-manylinux1_x86_64.whl’ already there; not retrieving.

Unpacking to: ./torch-1.9.0...Hash mismatch for file 'torch-1.9.0.dist-info/METADATA'
----------------------------------------
printing hashes
torch-1.9.0-cp39-none-macosx_10_9_x86_64.whl
RECORD entry:            torch-1.9.0.dist-info/METADATA,sha256=dGmSiwyVg6BFdUNA20hi3kq6l9skJniczgd281qjq80,25152
manually computed entry: torch-1.9.0.dist-info/METADATA,sha256=dGmSiwyVg6BFdUNA20hi3kq6l9skJniczgd281qjq80,25152
----------------------------------------
printing hashes
torch-1.9.0-cp39-cp39-manylinux1_x86_64.whl
RECORD entry:            torch-1.9.0.dist-info/METADATA,sha256=hQJhAIZCgeUw4lHR6-TbBb2JlqQFtioPNhDz-mRNzTQ,25158
manually computed entry: torch-1.9.0.dist-info/METADATA,sha256=dGmSiwyVg6BFdUNA20hi3kq6l9skJniczgd281qjq80,25152
----------------------------------------
record diff:
< torch-1.9.0.dist-info/METADATA,sha256=dGmSiwyVg6BFdUNA20hi3kq6l9skJniczgd281qjq80,25152
> torch-1.9.0.dist-info/METADATA,sha256=hQJhAIZCgeUw4lHR6-TbBb2JlqQFtioPNhDz-mRNzTQ,25158
metadata diff:
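
(For 1.9.0 the METADATA files are byte-identical, so the metadata diff above is empty.) For reference, the RECORD-style digest can also be recomputed with a few lines of Python instead of the openssl pipeline in the repro script; a minimal helper:

import base64
import hashlib
from pathlib import Path

def record_entry(path):
    # RECORD entries use an unpadded urlsafe base64 sha256 digest plus the file size.
    data = Path(path).read_bytes()
    digest = base64.urlsafe_b64encode(hashlib.sha256(data).digest()).rstrip(b"=").decode()
    return f"{path},sha256={digest},{len(data)}"

if __name__ == "__main__":
    print(record_entry("torch-1.10.1.dist-info/METADATA"))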

Install git 2.16+ instead of 1.8.3.1 to avoid some bugs with 1.4.0

I had this issue (pytorch/pytorch#35149) when using the pytorch/builder docker image (pytorch/manylinux-cuda102) while building a manywheel fork of LMS (pytorch/pytorch#35633).

It seems the issue only appears when using git 1.8.3.1. After installing git 2.16, it works just fine:

yum -y remove git*
yum -y install https://centos7.iuscommunity.org/ius-release.rpm
yum -y install git2u-all

Maybe git 2.16 should be installed by default? :)

Replace dropbox and google drive links with s3 links

if not exist "%SRC_DIR%\temp_build\NvToolsExt.7z" (
curl -k -L https://www.dropbox.com/s/9mcolalfdj4n979/NvToolsExt.7z?dl=1 --output "%SRC_DIR%\temp_build\NvToolsExt.7z"
if errorlevel 1 exit /b 1
)
if not exist "%SRC_DIR%\temp_build\gpu_driver_dlls.zip" (
curl -k -L "https://drive.google.com/u/0/uc?id=1injUyo3lnarMgWyRcXqKg4UGnN0ysmuq&export=download" --output "%SRC_DIR%\temp_build\gpu_driver_dlls.zip"
if errorlevel 1 exit /b 1
)

We should replace these dependencies with S3 assets. @peterjc123, do you know what's hosted at these links?

cc @mszhanyi

OpenBLAS OpenMP support for AArch64 builds

Currently, AArch64 builds rely on a single-threaded build of OpenBLAS, see:

host.run_cmd("pushd OpenBLAS; make NO_SHARED=1 -j8; sudo make NO_SHARED=1 install; popd")

Inclusion of OpenMP is marked as a TODO.

Enabling support should be a matter of adding the USE_OPENMP=1 flag to:

host.run_cmd("pushd OpenBLAS; make NO_SHARED=1 -j8; sudo make NO_SHARED=1 install; popd")
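
A self-contained local equivalent of that change (the real script drives a remote host through host.run_cmd; here plain subprocess is used instead, with USE_OPENMP=1 added to the make invocation quoted above):

import subprocess

def build_openblas(src_dir="OpenBLAS", jobs=8):
    # Same flags as the existing command, plus USE_OPENMP=1 for a multithreaded build.
    subprocess.run(["make", "USE_OPENMP=1", "NO_SHARED=1", f"-j{jobs}"], cwd=src_dir, check=True)
    subprocess.run(["sudo", "make", "USE_OPENMP=1", "NO_SHARED=1", "install"], cwd=src_dir, check=True)

if __name__ == "__main__":
    build_openblas()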

Note: when building PyTorch against OpenBLAS myself, I also explicitly set OpenBLAS_HOME='/opt/OpenBLAS' BLAS='OpenBLAS' USE_MKLDNN=0 USE_OPENMP=1 USE_LAPACK=1 when running setup.py.

Can OpenMP support be enabled in build_aarch64_wheel.py?
Are there any issues currently blocking this change?

Note: this mirrors an issue raised in #679; however, it is unrelated to the questions there about the choice of -mcpu, -mtune, and -march, so I felt it would be beneficial to separate it out and address any specific issues separately.

Replicate binary linux builds of pytorch/pytorch in pytorch/builder

Enable running binary builds of pytorch/pytorch inside pytorch/builder. Resolving this issue will solve two problems:

  • Run CI in situ for changes made to pytorch/builder vs. the existing way of sending a test PR to pytorch/pytorch which clones changes from pytorch/builder
  • Run CI ex situ for changes made to pytorch/pytorch (this may seem trivial at this point but it is a preparation step for the consolidation of CI for pytorch repositories)

Create backup tags for binary build images

Now that we are building and pushing the binary build images within CI, we should create automatic backup tags for each image we build, based on the commit SHA and the branch reference.

Example:

  1. We build and push pytorch/conda-builder:cpu
  2. We tag pytorch/conda-builder:cpu as both pytorch/conda-builder:cpu-${GIT_BRANCH_NAME} and pytorch/conda-builder:cpu-${GIT_COMMIT_SHA}
  3. Push all tagged images to Docker Hub (see the sketch below)

Tags should look somewhat similar to:

pytorch/conda-builder:cpu
pytorch/conda-builder:cpu-master
pytorch/conda-builder:cpu-jkfdl1234j113uy341hj

This should be applied to all three of:

  • pytorch/conda-builder
  • pytorch/manylinux-builder
  • pytorch/libtorch-cxx11-builder
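
An illustrative helper (not an existing CI script) showing what applying the backup tags to an already-built image could look like:

import subprocess

def tag_and_push(image, base_tag, branch, commit_sha):
    # e.g. image="pytorch/conda-builder", base_tag="cpu"
    primary = f"{image}:{base_tag}"
    backups = [f"{image}:{base_tag}-{branch}", f"{image}:{base_tag}-{commit_sha}"]
    for tag in backups:
        subprocess.run(["docker", "tag", primary, tag], check=True)
    for tag in [primary, *backups]:
        subprocess.run(["docker", "push", tag], check=True)

if __name__ == "__main__":
    tag_and_push("pytorch/conda-builder", "cpu", "master", "jkfdl1234j113uy341hj")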

Updating the Documentation

I was following the documentation to generate libtorch binaries, but the documents in manywheel, cron, etc. are out of date. How do I build libtorch 1.7.0? Is there any updated documentation available?

Update to recent MKL version and Zen performance

This is a follow-up to my comments in #460, but I thought it would be better to make an issue out of this rather than adding more comments to a merged pull request ;).

The question in short: MKL_DEBUG_CPU_TYPE cannot be used anymore in recent MKL versions. Is it still possible to use AVX2-optimized kernels on AMD Zen CPUs?

I did some more investigation. The good news is that Intel is apparently integrating Zen support into MKL. The bad news is that Zen kernels haven't been implemented for every BLAS function, and if no Zen kernel is available, MKL falls back to a slow SSE kernel.

The following is using MKL 2020.2.254 and a Ryzen 3700X

First, I use the standard ACES DGEMM benchmark:

$ ./mt-dgemm 4000 | grep GF
GFLOP/s rate:         227.809737 GF/s

Hotspot in perf:

65.34%  mt-dgemm  libmkl_def.so       [.] mkl_blas_def_dgemm_kernel_zen

Clearly a Zen-optimized kernel. Then I made an SGEMM version of the same benchmark:

$ ./mt-sgemm 4000 | grep GF
GFLOP/s rate:         151.946679 GF/s

Not so stellar. perf reveals code path using SSE (I checked the instructions used):

74.26%  mt-sgemm  libmkl_def.so     [.] LM_LOOPgas_1

Next, we use LD_PRELOAD to override the function that detects Intel CPUs to always return true:

$ LD_PRELOAD=libfakeintel.so ./mt-sgemm 2000 | grep GF
GFLOP/s rate:         382.358381 GF/s

Much better! Top function in perf:

59.73%  mt-sgemm  libmkl_avx2.so          [.] mkl_blas_avx2_sgemm_kernel_0

tl;dr: MKL seems to be moving towards supporting AMD Zen. However, it seems that Zen kernels haven't been implemented for every BLAS function yet. Possible (hopefully temporary) workaround: add a DT_NEEDED entry to a library or program's ELF dynamic section to override the CPU detection.

Where are Python 3.8 manywheel binaries built for nightlies?

Hi!

Thanks for making this repo public; it's very helpful. I've noticed that PyTorch is pushing pip manywheel nightlies for Python 3.8, but it seems like 3.8 is missing from the configuration in cron/ (e.g., 3.8 isn't defined here: https://github.com/pytorch/builder/blob/master/cron/build_multiple.sh#L55, among other places).

Am I missing something, or is the cron configuration to build 3.8 pip manywheels just not pushed yet?

Thanks in advance!

Standardize GHA binary linux build of a domain lib

Scope

  • Binary linux build

This should solve the problems below:

  • Create a reusable CI template for domain libs
  • Currently each domain lib has its own copy of the build scripts, which creates a maintenance problem

Execution strategy:

  • Pick a candidate domain lib
  • Extract the build scripts out of the domain lib and test
  • Prototype the template based off pytorch/pytorch and test

manywheel/build_docker.sh failed to execute

Hi there,

It seems that manywheel/build_docker.sh currently fails to execute. On GitHub Actions, the build shows an error: https://github.com/pytorch/builder/actions/workflows/build-pytorch-wheels.yml

and on my local workstation, the build shows:

 => CACHED [jni 3/3] RUN bash ./install_jni.sh && rm install_jni.sh                                                         0.0s
 => CANCELED [openssl 2/2] RUN bash ./install_openssl.sh && rm install_openssl.sh                                         127.8s
 => CANCELED [cuda 2/2] RUN bash ./install_cuda.sh 11.1 && rm install_cuda.sh                                             135.3s
------
 > [common  4/19] RUN yum swap -y git git224-core:
#13 1.900 Loaded plugins: fastestmirror, ovl
#13 5.687 Loading mirror speeds from cached hostfile
#13 9.584  * base: us.mirror.nsec.pt
#13 9.585  * epel: packages.oit.ncsu.edu
#13 9.598  * extras: mirrors.tripadvisor.com
#13 9.604  * updates: mirrors.tripadvisor.com
#13 99.86 No package git224-core available.
#13 105.5 Error: swap install git224-core
------
executor failed running [/bin/sh -c yum swap -y git git224-core]: exit code: 1
(base) ligeng@Lgs-Mac-mini➜  builder git:(main) GPU_ARCH_TYPE=cuda GPU_ARCH_VERSION=11.1 manywheel/build_docker.sh

Looks like an issue with the CentOS repositories (the git224-core package is no longer available).

Add conda recipe for fairseq

It seems like torchvision's recipe is handled here, so I thought I might add a feature request to also handle fairseq here. Like torchvision, it should be a relatively light lift and I would be happy to help if there's some guidance.

Context here.

Cleanup old Cuda versions from builder project

This issue is related to: Stop CUDA-11.1 binary builds/tests in CI (#73377).

We've migrated to CUDA 11.3 as the default toolkit in 1.9; it's time to stop these builds (especially considering the forward-compatibility guarantee across CUDA 11.x drivers).

Hence we are removing CUDA 11.1 support. We should also clean up old CUDA-related code from the builder and pytorch repos, making the scripts a little cleaner.

We have code that references CUDA 9.2, 10.1, 11.0, 11.1, and 11.2, and none of these are currently used.

There will be a regression in PR 950

#950 will trigger an exception at the very beginning of the Windows conda build:

+ cp -R 'C:\actions-runner\_work\pytorch\pytorch/pytorch' 'C:\actions-runner\_work\pytorch\pytorch/pytorch'
cp: cannot copy a directory, 'C:\actions-runner\_work\pytorch\pytorch/pytorch', into itself, 'C:\actions-runner\_work\pytorch\pytorch/pytorch/pytorch'

https://github.com/pytorch/pytorch/runs/5313907799?check_suite_focus=true#step:10:249

It could be fixed by https://github.com/mszhanyi/builder/blob/bb6d651214c92c4e70a581ed6327fed04cc5f112/conda/build_pytorch.sh#L165

I don't know the exact background of this PR.
Please take a look. @seemethere

pytorch-cpu 1.0.1 on win64 depends on cudatoolkit=None, which can't be satisfied

When I try to install pytorch-cpu 1.0.1 on this Win64 image, I run into the following issue:

C:\Users\cpbotha\work\code\somewhere (master -> origin)
(somewhere) λ conda install -c pytorch pytorch-cpu=1.0.1
Collecting package metadata: done
Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

  - pytorch-cpu=1.0.1 -> cudatoolkit=None

It looks like this package depends on the presence of the cudatoolkit package with version "None", which is of course tricky to satisfy.

Add support for CUDA 11.4.x & 11.5.x

Hello. I would like to ask for new CUDA versions.
According to the documentation, adding a new minor version of CUDA is a relatively simple matter.
As CUDA 11.4 was released some time ago and 11.5 has just been released, I think adding them would be beneficial.
Thank you in advance! 😀

Build PyTorch >=1.9 with OpenSSL 1.1.1 headers on Linux

To release PyTorch with the TLS_TCP Gloo transport, which is dynamically loaded at runtime, we need to build PyTorch with OpenSSL 1.1.1 headers available at build time. The problem is that some of our Linux CI environments only have OpenSSL 1.0.2, which is not compatible with the Gloo TLS_TCP transport.

Also, we need checks for the PyTorch Linux release binaries to verify that they were actually built with the OpenSSL headers but are not linked, statically or dynamically, against OpenSSL.

CMAKE_ARGS no longer used

CMAKE_ARGS is set in conda/build_pytorch.sh, but it looks like it is no longer used by the PyTorch build.

Stop hardcoding macos version for libtorch wheel builds

I don't want to have to issue commits like this:

commit c12fbb2d493613c263a0cb189debf06f9504127b (HEAD -> master, origin/master, origin/HEAD)
Author: Edward Z. Yang <[email protected]>
Date:   Wed Aug 14 10:05:01 2019 -0400

    Fix 3.7 wheel build too
    
    Signed-off-by: Edward Z. Yang <[email protected]>

diff --git a/wheel/build_wheel.sh b/wheel/build_wheel.sh
index 0d936c6..cbc1aae 100755
--- a/wheel/build_wheel.sh
+++ b/wheel/build_wheel.sh
@@ -97,7 +97,9 @@ mkdir -p "$whl_tmp_dir"
 # update this!
 # An example of this happened on Aug 13, 2019, when osx-64/python-2.7.16-h97142e2_2.tar.bz2
 # was uploaded to https://anaconda.org/anaconda/python/files
-if [[ "$desired_python" == 3.5 ]]; then
+if [[ "$desired_python" == 3.7 ]]; then
+    mac_version='macosx_10_9_x86_64'
+elif [[ "$desired_python" == 3.5 ]]; then
     mac_version='macosx_10_6_x86_64'
 else
     mac_version='macosx_10_7_x86_64'
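
One possible direction, sketched here purely as an illustration: derive the platform tag from the interpreter the wheel is being built for instead of switching on the Python version. Whether this matches the deployment-target requirements of the real build_wheel.sh would need checking:

import sysconfig

def mac_wheel_platform_tag():
    # e.g. "macosx-10.9-x86_64" -> "macosx_10_9_x86_64"
    return sysconfig.get_platform().replace("-", "_").replace(".", "_")

if __name__ == "__main__":
    print(mac_wheel_platform_tag())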

building libtorch inside docker

I am wondering if there is any docker image that can be used for building the standard libtorch binaries. I am using nvidia/cuda:10.2-cudnn8-devel-centos7 with gcc 7.4, but my build performs much more slowly than the prebuilt libtorch.

Also, when using static linking,

export TH_BINARY_BUILD=1
export USE_STATIC_CUDNN=1
export USE_STATIC_NCCL=1
export ATEN_STATIC_CUDA=1
export USE_CUDA_STATIC_LINK=1

I see further performance degradation. It would be great if you could provide a docker image that reproduces the standard binaries.

Thanks

[conda] Should ninja be a runtime dependency?

Right now, if you conda install -c pytorch pytorch, you also pull in ninja as a run-time dependency. I don't think this should be the case though -- Ninja is a build tool, so shouldn't it just be a build-time dependency?

I believe we can fix this by removing ninja from the run dependencies:

run:
- python
- numpy >=1.11
- mkl >=2018
- ninja
