
TorchSharp Examples

This repo holds examples and tutorials related to TorchSharp, .NET-only bindings to libtorch, the engine behind PyTorch. If you are trying to familiarize yourself with TorchSharp, rather than contributing to it, this is the place to go.

Currently, the examples are the same as those found in the TorchSharp repo. Unlike the setup in that repo, where the examples are part of the overall VS solution file and use project references to pick up the TorchSharp dependencies, the example solution in this repo uses the publicly available TorchSharp packages from NuGet.

The examples and tutorials assume that you are on the latest version of TorchSharp, which currently is 0.97.5.

System / Environment Requirements

In order to use TorchSharp, you will need both the most recent TorchSharp package and one of the several libtorch-* packages that are available. The most basic one, which is used in this repository, is the libtorch-cpu package. As the name suggests, it uses a CPU backend for training and inference.

There is also support for CUDA 11.3 on both Windows and Linux, and each of these combinations has its own NuGet package. If you want to train on CUDA, you need to replace references to libtorch-cpu in the solution and projects.

Note: Starting with NuGet release 0.93.4, we have simplified the package structure, so you only need to reference one of these three bundled packages, each of which pulls in the TorchSharp and libtorch packages it depends on (see the project-file fragment after the list):

TorchSharp-cpu
TorchSharp-cuda-windows
TorchSharp-cuda-linux
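
As an illustrative project-file fragment (not copied from the solution; match the version to whatever the solution currently pins, e.g. 0.97.5 as mentioned above):

    <ItemGroup>
      <!-- Swap for TorchSharp-cuda-windows or TorchSharp-cuda-linux to train on CUDA. -->
      <PackageReference Include="TorchSharp-cpu" Version="0.97.5" />
    </ItemGroup>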

The examples solution should build without any modifications, either with Visual Studio or using 'dotnet build'. All of the examples train on an NVIDIA GPU with 8GB of memory, while only a subset fit on a GPU with 6GB. Running more than a few epochs while training on a CPU will take a very long time, especially on the CIFAR10 examples. MNIST is the most reasonable example to train on a CPU.
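
Which device a run actually uses is decided in code. A minimal sketch of the usual pattern (illustrative only, not lifted verbatim from the examples) looks like this:

    // Prefer CUDA when a CUDA backend package is referenced and a GPU is present.
    using System;
    using TorchSharp;

    var device = torch.cuda.is_available() ? torch.CUDA : torch.CPU;
    Console.WriteLine($"Training on {device}");
    // exampleModel.to(device);   // move whichever model you construct to the chosen device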

Structure

There are variants of all models in both C# and F#. For C#, there is a 'Models' library and an 'XXXExamples' console app, which is used for batch training of the models. For F#, the models are bundled with the training code (we may restructure this in the future). There is also a utility library, written in C# only, which is used from both C# and F#.

The console apps are, as mentioned, meant to be used for batch training. The command line must specify the model to be used. In the case of MNIST, there are two data sets -- the original 'MNIST' as well as the harder 'Fashion MNIST'.

The repo contains no actual data sets. You have to download them manually and, in some cases, extract the data from archives.

Data Sets

The MNIST model uses either the original 'MNIST' data set or the harder 'Fashion MNIST' data set.

Both sets are 28x28 grayscale images, archived in .gz files.

The AlexNet, ResNet*, MobileNet, and VGG* models use the CIFAR10 data set. Instructions on how to download it are available in the CIFAR10 source files.

SequenceToSequence uses the WikiText2 dataset. It's kept in a regular .zip file.

TextClassification uses the AG_NEWS dataset, a CSV file.

Tutorials

We have started work on tutorials, but they are not ready yet. They will mostly be based on .NET Interactive notebooks. If you haven't tried that environment yet, it's worth playing around with it inside VS Code.
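
Loading TorchSharp into a notebook is just a NuGet directive, for example with the CPU bundle used in this repository (pin whichever version you want):

    #r "nuget: TorchSharp-cpu, 0.97.5"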

Contributing

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

There are two main things we would like help with:

  1. Adding completely new examples. File an issue and assign it to yourself, so we can track it.

  2. Picking up an issue from the 'Issues' list. For example, the examples are currently set up to run on Windows, picking up data from under the 'Downloads' folder. If you have thoughts on the best way to do this on MacOS or Linux, please help with that.

If you add a new example, please adjust it to work on a mainstream CUDA GPU. This means making sure that it trains on a card with 8GB of memory, with sufficient invocations of the garbage collector to reclaim the native memory held by temporary tensors.
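
As a rough sketch of what that can look like (illustrative only; 'trainLoader' and 'TrainOneBatch' are placeholders, not names from the examples):

    var batchIndex = 0;
    foreach (var (input, target) in trainLoader)
    {
        TrainOneBatch(input, target);

        // Periodically force a collection so finalizable native tensors are reclaimed;
        // tune the interval to the GPU's memory.
        if (++batchIndex % 10 == 0)
            GC.Collect();
    }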

A Useful Tip for Contributors

A useful tip from the Tensorflow.NET repo:

After you fork, add dotnet/TorchSharpExamples as 'upstream' to your local repo ...

git remote add upstream https://github.com/dotnet/TorchSharpExamples.git

This makes it easy to keep your fork up to date by regularly pulling and merging from upstream.

Assuming that you do all your development off your main branch, keep your main updated with these commands:

git checkout main
git pull upstream main
git push origin main

Then, you merge onto your dev branch:

git checkout <<your dev branch>>
git merge main

Contributors

dsyme, lanayx, lqdev, microsoft-github-operations[bot], microsoftopensource, niklasgustafsson


torchsharpexamples's Issues

str function doesn't appear to perform optional argument resolution as described.

Hi 👋

A few tutorials make use of the str method.
It appears a recent change to TorchSharp breaks optional argument resolution.

These two tutorials make use of str
https://github.com/dotnet/TorchSharpExamples/blob/main/tutorials/FSharp/tutorial2.ipynb
https://github.com/dotnet/TorchSharpExamples/blob/main/tutorials/CSharp/tutorial2.ipynb

A minimal example is included as a screenshot in the original issue.

I've also opened an issue in the TorchSharp repo.
dotnet/TorchSharp#628

How to load local TorchSharp from Notebook?

@NiklasGustafsson

D:\project
│   README.md
│
└───libtorch-cpu
│   
└───libtorch-cuda-11.3
│   
└───TorchSharp
│   
└───TorchSharp.Notebooks
│   │   tutorial1.ipynb
│   │   tutorial2.ipynb

Questions

Within e.g. tutorial2.ipynb, how do you define a #i directive to load e.g. TorchSharp-cpu or TorchSharp-cuda-windows from the local (git clone) TorchSharp folder,

instead of loading them from the user's default NuGet packages folder via:

#r "nuget: TorchSharp-cpu"
#r "nuget: TorchSharp-cuda-windows"

Contribute, refactoring suggestions, modernize C#, DataSet, DataLoader

Hi @NiklasGustafsson, I would like to contribute to TorchSharp as we are looking at using it to replace our CNTK usage in our full end-to-end machine learning pipelines written in C#. I have been looking at this example repo in that regard, which is a great starting place. I've mainly worked with image models, so I've been looking at the CIFAR10 example. I understand that these examples have been created quickly and that they are bare-bones; I would like to improve them. :)

For example, I have a few issues with the readers, like CIFAR10Reader, and how they randomize data by pre-defining randomized batches, which is not normally how you would do this; you create unique random batches for each epoch. Similarly, an epoch would usually (nothing is standardized here and you really can do anything you'd like, so this is just IMHO) be defined by iterating over the samples of the dataset once, not by appending transformed copies and hence multiplying the epoch size, like:

        public IEnumerable<(Tensor, Tensor)> Data()
        {
            for (var i = 0; i < data.Count; i++) {
                yield return (data[i], labels[i]);

                foreach (var tfrm in _transforms) {
                    yield return (tfrm.forward(data[i]), labels[i]);
                }
            }
        }

Also, you wouldn't "transform" or augment the data if it is test data; of course, you can then just not set the transforms.

Anyway, I was thinking my first contribution could be to refactor the readers to implement concepts similar to PyTorch's Dataset and DataLoader. I have worked with this API but am not an expert, nor am I necessarily a fan of the Python APIs, but it seems you'd like TorchSharp to be similar to PyTorch, so basing it on that makes sense. Would that be of any interest?
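
Purely as an illustrative sketch (every name below is invented; this is neither TorchSharp API nor part of the original proposal), that split might look roughly like this:

    // Illustrative only; all names are invented for this sketch.
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using static TorchSharp.torch;

    public interface IDataset
    {
        int Count { get; }
        (Tensor Input, Tensor Label) this[int index] { get; }
    }

    public static class DataLoader
    {
        // Yields the dataset in fixed-size batches; shuffle 'order' once per epoch
        // (e.g. with the Fisher-Yates sketch further down) to randomize batches.
        public static IEnumerable<(Tensor Inputs, Tensor Labels)> Batches(
            IDataset dataset, int batchSize)
        {
            var order = Enumerable.Range(0, dataset.Count).ToArray();
            for (var start = 0; start < order.Length; start += batchSize)
            {
                var batch = order.Skip(start).Take(batchSize)
                                 .Select(idx => dataset[idx]).ToArray();
                yield return (stack(batch.Select(b => b.Input).ToArray()),
                              stack(batch.Select(b => b.Label).ToArray()));
            }
        }
    }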

Before doing this, I would very much like to migrate this example repo to .NET 6 and C# 10, follow standard C# code guidelines, and use modern language features, to really make the examples shine with regard to C#. Since performance is my passion, I'd also like the examples to at least minimally try to be efficient about what happens, even in cases where it does not matter so much.

Just an example: I would replace the line below with a proper Fisher-Yates shuffle, which is easy to implement.

Enumerable.Range(0, count).OrderBy(c => rnd.Next()).ToArray();
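
A seeded Fisher-Yates shuffle along those lines might look like this (a sketch reusing 'count' from the line above, not code from the repo):

    // In-place, seeded Fisher-Yates shuffle of the index array.
    var rnd = new Random(42);                    // fixed seed for reproducibility (42 is arbitrary)
    var indices = Enumerable.Range(0, count).ToArray();
    for (var i = indices.Length - 1; i > 0; i--)
    {
        var j = rnd.Next(i + 1);                 // uniform in [0, i]
        (indices[i], indices[j]) = (indices[j], indices[i]);
    }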

Reproducibility is important too, so all random stuff should be seeded.

Sorry, I am sure you know all this, but I wanted to at least ask whether such changes are of interest first? If you guys agree with them?

To recap I propose:

  • Migrate to .NET 6 and C# 10
  • Refactor readers based on a DataSet and DataLoader concept (rough draft)
    • Address various minor issues as part of this

And we can take it from there.

[Suggestion] Improving TorchSharp experience in Notebook with Torchsharp.Summary

Is there interest in porting torch-summary to a TorchSharp.Summary package?

from torchsummary import summary

model = ConvNet()
summary(model, (1, 28, 28))

So that, in a TorchSharp notebook, we get:

==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
├─Conv2d: 1-1                            [-1, 10, 24, 24]          260
├─Conv2d: 1-2                            [-1, 20, 8, 8]            5,020
├─Dropout2d: 1-3                         [-1, 20, 8, 8]            --
├─Linear: 1-4                            [-1, 50]                  16,050
├─Linear: 1-5                            [-1, 10]                  510
==========================================================================================
Total params: 21,840
Trainable params: 21,840
Non-trainable params: 0
==========================================================================================
Input size (MB): 0.00
Forward/backward pass size (MB): 0.05
Params size (MB): 0.08
Estimated Total Size (MB): 0.14
==========================================================================================

Trying Chinese translation

Hello, I am trying to translate this project into Chinese while studying. How can I contribute the results? Or create a new repository?

Loading Python Exported Model into TorchSharp

Originally posted in dotnet/TorchSharp by @jimquittenton:

dotnet/TorchSharp#586

The naming scheme for layers is different between the ResNet example model found in this repo and the ResNet models found in TorchVision, which prevents a model saved from Python from being loaded into TorchSharp using this example code.

Original post:


Hi,
I'm new to TorchSharp and am having trouble loading a python trained ResNet18 model. I've been following this article: https://github.com/dotnet/TorchSharp/blob/main/docfx/articles/saveload.md and have exported my python model using the 'save_state_dict' function in this script: https://github.com/dotnet/TorchSharp/blob/main/src/Python/exportsd.py .

In TorchSharp I have copied the ResNet model from https://github.com/dotnet/TorchSharpExamples/blob/main/src/CSharp/Models/ResNet.cs and then call the following:

int numClasses = 3;
ResNet myModel = ResNet.ResNet18(numClasses);
myModel.to(DeviceType.CPU);
myModel.load(mPath);

The load() line throws an exception with the message: Mismatched module state names: the target modules does not have a submodule or buffer named 'conv1.weight'.

If I examine the state_dict from 'myModel' prior to load(), it contains entries like:

{[layers.conv2d-first.weight, {TorchSharp.Modules.Parameter}]}
{[layers.bnrm2d-first.weight, {TorchSharp.Modules.Parameter}]}
{[layers.bnrm2d-first.bias, {TorchSharp.Modules.Parameter}]}
{[layers.bnrm2d-first.running_mean, {TorchSharp.torch.Tensor}]}
{[layers.bnrm2d-first.running_var, {TorchSharp.torch.Tensor}]}
{[layers.bnrm2d-first.num_batches_tracked, {TorchSharp.torch.Tensor}]}
{[layers.blck-64-0.layers.blck-64-0-conv2d-1.weight, {TorchSharp.Modules.Parameter}]}
{[layers.blck-64-0.layers.blck-64-0-bnrm2d-1.weight, {TorchSharp.Modules.Parameter}]}
{[layers.blck-64-0.layers.blck-64-0-bnrm2d-1.bias, {TorchSharp.Modules.Parameter}]}
whereas the corresponding entries prior to saving from python are:

conv1.weight torch.Size([64, 3, 7, 7])
bn1.weight torch.Size([64])
bn1.bias torch.Size([64])
bn1.running_mean torch.Size([64])
bn1.running_var torch.Size([64])
bn1.num_batches_tracked torch.Size([])
layer1.0.conv1.weight torch.Size([64, 64, 3, 3])
layer1.0.bn1.weight torch.Size([64])
layer1.0.bn1.bias torch.Size([64])
I tried amending the ResNet.cs code to reflect the python names, but could not get them to exactly match.

I also tried calling load() with strict=false, i.e. myModel.load(mPath, false). This seemed to get past the mismatched-names exception, but it throws another exception with the message: Too many bytes in what should have been a 7 bit encoded Int32.

I've been struggling with this for a couple of days now so would really appreciate any help you guys could offer.

Thanks
Jim
