
deepnet's Introduction

Deep.Net

Deep learning library for F#. Provides tensor functionality, symbolic model differentiation, automatic differentiation and compilation to CUDA GPUs. It includes optimizers and model blocks used in deep learning.

.NET Standard 2.0 port

Deep.Net is currently being ported to .NET Standard 2.0.

In the process, the API is being streamlined and the documentation improved. The port of the numeric Tensor library is complete; the port of the symbolic libraries is still in progress.

Go to the README of the Tensor library.
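
As a quick taste of the ported Tensor library, here is a minimal host-only sketch. It uses only HostTensor.init, element-wise arithmetic and the Shape property as they appear in the issue reports further down this page; consult the Tensor README for the full and authoritative API.

open Tensor

// a 3x4 host tensor with element [i; j] set to 4*i + j
let a = HostTensor.init [3L; 4L] (fun [|i; j|] -> 4.0 * float i + float j)

// element-wise arithmetic on tensors of equal shape
let b = a + a

printfn "a has shape %A; b = %A" a.Shape b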

deepnet's People

Contributors

mbasalla, surban


deepnet's Issues

Discussion thread on Tensor etc.

Hi @surban et al.

We're looking at doing some work on DiffSharp, and I'm wondering if I could discuss DeepNet with you.

I'd be happy to discuss

Thanks
@dsyme

PTXCache IO Exception

...
Model has been built.

Unhandled Exception: System.IO.IOException: The process cannot access the file 'C:\Users\jlanger\AppData\Local\DeepNet\PTXCache\0000000052a0c316\efa4b128-b65e-46e3-865d-dcc795951afc\code.dat' because it is being used by another process.
at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost)
at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost)
at System.IO.File.InternalWriteAllBytes(String path, Byte[] bytes, Boolean checkHost)
at DiskMap.DiskBinaryMap.Set(Byte[] key, Byte[] value) in Z:\DeepPrivate\DeepNet\SymTensorCuda\DiskMap.fs:line 69
at SymTensor.Compiler.Cuda.Compile.ptx$cont@120(String modPath, CudaRuntimeCompiler cmplr, FSharpList`1 cmplrArgs, ModCacheKey cacheKey, Unit unitVar) in Z:\DeepPrivate\DeepNet\SymTensorCuda\CudaExec.fs:line 138
at SymTensor.Compiler.Cuda.Compile.loadKernelCode(String modCode, IEnumerable`1 krnlNames) in Z:\DeepPrivate\DeepNet\SymTensorCuda\CudaExec.fs:line 120
at SymTensor.Compiler.Cuda.CudaExprWorkspaceTypes.CudaExprWorkspace..ctor(CudaRecipeT recipe) in Z:\DeepPrivate\DeepNet\SymTensorCuda\CudaExec.fs:line 311
at SymTensor.Compiler.Cuda.CudaExprWorkspaceTypes.CudaExprWorkspace.execCalls(IEnumerable`1 calls) in Z:\DeepPrivate\DeepNet\SymTensorCuda\CudaExec.fs:line 470
at SymTensor.Compiler.Cuda.CudaEval.cudaEvaluator(CompileEnvT compileEnv, FSharpList`1 uexprs) in Z:\DeepPrivate\DeepNet\SymTensorCuda\CudaEval.fs:line 41
at SymTensor.Func.tryCompile$cont@296(FSharpList`1 baseExprGens, FSharpSet`1 neededVars, CompileEnvT compileEnv, IUExprCompiler compileSpec_0, Unit unitVar) in Z:\DeepPrivate\DeepNet\SymTensor\Function.fs:line 319
at SymTensor.Func.tryCompile@255(FSharpList`1 baseExprGens, IUExprCompiler compileSpec_0, CompileEnvT compileEnv, Boolean failIfImpossible) in Z:\DeepPrivate\DeepNet\SymTensor\Function.fs:line 341
at [email protected](FSharpMap`2 varEnv) in Z:\DeepPrivate\DeepNet\SymTensor\Function.fs:line 394
at [email protected](FSharpMap`2 varEnv) in Z:\DeepPrivate\DeepNet\SymTensor\Function.fs:line 417
at Models.Error.msceMultiple[a,b,c](a batchSize, Dataset`1 fullDatasetSub, FSharpFunc`2 predFn)
at Models.Error.errorDataset[a](Int64 batchSize, TrnValTst`1 fullDataset, a predFn, Measures errorMeasure, FSharpFunc`2 errorSubset)
at Models.Error.error(Measures errorMeasure, Int64 batchSize, TrnValTst`1 fullDataset, FSharpFunc`2 predFn)
at Models.Main.main(String[] argv)

Error context is destroyed during shutdown

Unhandled Exception: ManagedCuda.CudaException: ErrorContextIsDestroyed: This error indicates that the context current to the calling thread has been destroyed using ::cuCtxDestroy, or is a primary context which has not yet been initialized.
at ManagedCuda.CudaRegisteredHostMemory`1.Unregister()
at ArrayNDNS.CudaRegMemTypes.f@1-15(CudaRegMemHnd this, Unit unitVar0)
at ArrayNDNS.CudaRegMemTypes.CudaRegMemHnd.System-IDisposable-Dispose()

Loss function result differs from training Loss

Loss evaluated with a separate function:
let lossFn = mi.Func<single> (loss) |> arg2 input target

differs from the loss evaluated during training:
let trainFn () = Train.train trainable fullDataset trainCfg

When this option is set:
SymTensor.Debug.DisableCombineIntoElementsOptimization <- true
the results are consistent again!
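
For context, here is how the two snippets above fit together with the workaround. This is reconstructed purely from the code quoted in this report (mi, loss, input, target, trainable, fullDataset and trainCfg are the names used above), and the assumption that the flag must be set before the functions are compiled is mine:

// workaround from this report: disable the optimization, presumably before compiling
SymTensor.Debug.DisableCombineIntoElementsOptimization <- true

let lossFn = mi.Func<single> (loss) |> arg2 input target      // separate loss evaluation
let trainFn () = Train.train trainable fullDataset trainCfg   // training loss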

Tested via DeepPrivate:
Z:\DeepPrivate\****Models\bin\Release\****Models.exe Z:\DEVELOP\DeepPrivate\****Models\cfgs\COPYING\LSTM\80reluOutputTest\Config.fsx

Is the code optimized differently during training?

Tested with [band 2277e5e]

Is im2col possible?

Is the Tensor type suited to implementing an im2col operation? I tried, but only succeeded with nested loops, which of course are bad for CUDA.

In the end, I want to arrive at an efficient convolution. Would that be possible with the current API surface?
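
For what it's worth, here is a rough sketch of how im2col could be expressed with a handful of whole-slice operations instead of per-element loops. It is only an illustration under assumptions, not a statement about what Deep.Net ships: it presumes int64 range slicing (x.[a .. b, c .. d]) and functions named Tensor.reshape and Tensor.concat roughly as described in the Tensor README, so the exact names and signatures should be checked there.

open Tensor

// Sketch (assumed API, see note above): im2col for a single-channel [H; W]
// input with kernel [kH; kW], stride 1 and valid padding. The result has
// shape [kH*kW; outH*outW].
let im2col (kH: int64) (kW: int64) (x: Tensor<float>) =
    let h, w = x.Shape.[0], x.Shape.[1]
    let outH, outW = h - kH + 1L, w - kW + 1L
    [ for di in 0L .. kH - 1L do
        for dj in 0L .. kW - 1L do
            // one shifted view per kernel offset: kH*kW slice operations in
            // total, no loop over individual elements
            yield x.[di .. di + outH - 1L, dj .. dj + outW - 1L]
                  |> Tensor.reshape [1L; outH * outW] ]
    |> Tensor.concat 0

With the patches laid out as columns, the convolution itself reduces to a matrix product of the flattened kernel with the im2col result, which avoids the nested per-element loops entirely.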

Cannot apply element-wise operator Add

I examined the DeepNet sample code referenced at http://www.deepml.net/model.html.

However, when I executed my code I got a System.Exception with the following additional information:

cannot apply element-wise operation Add to unequal shapes ["nHidden"; "nBatch"] and ["nHidden"; "nHidden"]

The exception occurred at the line
let hiddenAct = hiddenWeights .* input.T + hiddenBias
I expected hiddenBias to be broadcast to shape [nHidden; nBatch], but it is broadcast to [nHidden; nHidden].

My complete code is as follows:

open Tensor
open Datasets
open SymTensor

[<EntryPoint>]
let main argv = 
    printfn "%A" argv

    let a = HostTensor.init [7L; 5L] (fun [|i; j|] -> 5.0 * float i + float j) 

    /// MNIST dataset
    let mnist = Mnist.load(__SOURCE_DIRECTORY__ + "../../MNIST") 0.0 |> TrnValTst.toHost
    
    printfn "MNIST training set: images have shape %A and labels have shape %A" mnist.Trn.All.Input.Shape mnist.Trn.All.Target.Shape   
    printfn "MNIST test set    : images have shape %A and labels have shape %A" mnist.Tst.All.Input.Shape mnist.Tst.All.Target.Shape

    /// Define the neural network model
    let mb = ModelBuilder<single> "NeuralNetModel"

    // Define size symbols
    let nBatch  = mb.Size "nBatch"
    let nInput  = mb.Size "nInput"
    let nClass  = mb.Size "nClass"
    let nHidden = mb.Size "nHidden"

    // Model parameters
    let hiddenWeights = mb.Param ("hiddenWeights", [nHidden; nInput])
    let hiddenBias    = mb.Param ("hiddenBias"   , [nHidden])
    let outputWeights = mb.Param ("outputWeights", [nClass; nHidden])

    // Model variables
    let input  = mb.Var<single> "Input"  [nBatch; nInput]
    let target = mb.Var<single> "Target" [nBatch; nClass]

    // Set symbolic sizes and instantiate the model
    mb.SetSize nInput mnist.Trn.All.Input.Shape.[1]
    mb.SetSize nClass mnist.Trn.All.Target.Shape.[1]
    mb.SetSize nHidden 100L

    let mi = mb.Instantiate DevHost

    // Model: input -> hidden layer
    let hiddenAct = hiddenWeights .* input.T + hiddenBias // <--------- Exception occurs here!!!
    let hiddenVal = tanh hiddenAct

    // Model: hidden -> output layer
    let outputAct = outputWeights .* hiddenVal
    let classProb = exp outputAct / Expr.sumKeepingAxis 0 (exp outputAct)

    // Loss function
    let smplLoss = - Expr.sumAxis 0 (target.T * log classProb)
    let loss     = Expr.mean smplLoss

    // Compile
    let lossFn   = mi.Func loss |> arg2 input target

    // Initialize parameters with a fixed seed
    mi.InitPars 123

    // test
    let tstLossUntrained = lossFn mnist.Tst.All.Input mnist.Tst.All.Target |> Tensor.value

    printfn "Test loss (untrained): %.4f" tstLossUntrained

    System.Console.ReadKey() |> ignore
    0 // exit code

My environment is as follows:

  • Windows 7 (64bit)
  • Visual Studio 2015
  • Installed DeepNet via Nuget
  • Installed FSharp.Core(F# 4.1) via Nuget

I'm sorry if I'm misunderstanding something about your sophisticated library.
Could you please let me know how to fix this problem?

Intel MKL usage is always forced

I have been trying to use OpenBLAS for several days, and it keeps failing.

It always asks me to use MKL and reports errors about missing libraries, which cannot be installed on an AMD CPU.

I even added

  <PropertyGroup>
...
    <DefaultBLAS>OpenBLAS</DefaultBLAS>
  </PropertyGroup>

to my csproj, yet it still fails to override the Intel MKL usage.

Checkpointing interrupted too early

The 30 seconds that the HPC tool grants before killing the job are not enough to finish saving the checkpoint, which leaves a corrupted checkpoint.

Saving checkpoint to ...

Problem with HDF5 error reporting (fixed in upstream, but not in NETStandard fork)

This is actually a report regarding a small part of the HDF.PInvoke.NETStandard package used by DeepNet, but that package does not have its own issue tracker.

I hit a problem with the error handling causing heap corruption and process exit when building an error message.

I tracked the problem down to a missing attribute on one of the PInvoke calls (HDFGroup/HDF.PInvoke#154), and it has been fixed in commit 819d3d2 of the upstream HDF.PInvoke source and NuGet packages.
