nanmath.jl's Introduction

NaNMath

Implementations of basic math functions which return NaN instead of throwing a DomainError.

Example:

import NaNMath
NaNMath.log(-100) # NaN
NaNMath.pow(-1.5,2.3) # NaN
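
To illustrate the idea, here is a minimal sketch of a NaN-returning logarithm written against Base only. The name `nanlog` is hypothetical and this is not necessarily how NaNMath implements `log`; it only shows the domain check that replaces the DomainError.

```julia
# Hypothetical helper, not part of NaNMath: return NaN instead of
# throwing a DomainError for negative real input.
nanlog(x::Real) = x < 0 ? oftype(float(x), NaN) : log(x)

nanlog(-100)   # NaN, where log(-100) would throw a DomainError
nanlog(1.0)    # 0.0, same as Base.log
```

Because the NaN is created with `oftype(float(x), NaN)`, the result keeps the floating-point type of the input (for example `Float32` in, `Float32` out).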

In addition, this package provides functions that aggregate arrays while ignoring NaN elements. The following functions are implemented:

sum
maximum
minimum
extrema
mean
median
var
std
min
max

Example:

using NaNMath; nm=NaNMath
nm.sum([1., 2., NaN]) # result: 3.0
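
For comparison, the same behaviour can be sketched in plain Julia by filtering out NaNs before summing. The name `nansum_sketch` is an assumption made for illustration; the package's own implementation may differ (for instance, it may avoid the temporary array that `filter` allocates).

```julia
# Hypothetical helper: sum the elements of x that are not NaN.
# `filter` allocates a temporary array, which keeps the sketch short.
nansum_sketch(x::AbstractArray{<:AbstractFloat}) = sum(filter(!isnan, x))

nansum_sketch([1.0, 2.0, NaN])   # 3.0
```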

nanmath.jl's People

Contributors

andreasnoack, anowacki, ararslan, briochemc, danielvandh, dependabot[bot], devmotion, dilumaluthge, femtocleaner[bot], juliatagbot, mkborregaard, mlubin, moelf, mzgubic, rafaqz, ranocha, torfjelde, ufechner, viralbshah, ysimillides, yuyichao

nanmath.jl's Issues

support dims

Support sum(arr, dims=1), as in the standard Julia sum, to reduce along a given axis.
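
One way the requested behaviour could look, sketched on top of Base's `sum(f, A; dims)`. The name `nansum_dims` is hypothetical and is not an existing NaNMath function; NaNs are mapped to zero so they do not contribute to the reduction.

```julia
# Hypothetical helper: NaN-ignoring sum along the given dimension(s).
# Each NaN is replaced by zero of the same type before summing.
nansum_dims(A::AbstractArray{<:AbstractFloat}; dims) =
    sum(v -> isnan(v) ? zero(v) : v, A; dims=dims)

A = [1.0 NaN;
     3.0 4.0]

nansum_dims(A; dims=1)   # 1×2 matrix: column sums, NaN skipped
nansum_dims(A; dims=2)   # 2×1 matrix: row sums, NaN skipped
```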

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

Faster `mean`

julia> function my_mean_count(x::AbstractArray{T}) where T<:AbstractFloat
           z = zero(eltype(x))
           sum = z
           count = 0
           @simd for i in x
               count += ifelse(isnan(i), 0, 1)
               sum += ifelse(isnan(i), z, i)
           end
           result = sum / count
           return (result, count)
       end

This runs 2x faster (on an AMD CPU, so this is expected). Is this a reasonable update to what we already have?

@vectorize_1arg

I just downloaded julia and I'm getting the following warning messages.

WARNING: `@vectorize_1arg` is deprecated in favor of compact broadcast syntax. Instead of `@vectorize_1arg`'ing function `f` and calling `f(arg)`, call `f.(arg)`.
julia> versioninfo()
Julia Version 0.6.0-dev.734
Commit 413ed79 (2016-09-21 08:29 UTC)
Platform Info:
  System: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU E5-2667 v2 @ 3.30GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.7.1 (ORCJIT, sandybridge)

Add more nan functions

As discussed in JuliaLang/julia#12563 (comment),
it would be useful to have a Julia equivalent of the following numpy functions:

nanargmax nanargmin nanmax nanmin
nansum nanmean nanstd nanvar

These functions calculate the maximum etc. of an array of values, ignoring any NaNs
(instead of propagating them, which Julia will do shortly, probably in the 0.4 release).

Please comment, if this is the right package to add these functions!

Array of Numbers + Mean of Absolute Values

For a personal project, I ended up lifting your mean (thanks!) and doing the following, for two reasons:

  1. support arrays of integers
  2. calculate the mean of absolute values

Maybe there was a better way, but this is what I came up with. For the record.

# array_arithmetic.jl

"""
`mean(AbstractArray{T})`

Returns the arithmetic mean of all elements in the array, ignoring NaNs.
"""
function mean(x::AbstractArray{T}) where T<:Number
    return mean_count(x)[1]
end

"""
`mean_abs(AbstractArray{T})`

Returns the arithmetic mean of the absolute values of all elements in the array, ignoring NaNs.
"""
function mean_abs(x::AbstractArray{T}) where T<:Number
    return mean_count(x, absolute = true)[1]
end

"""
Returns a tuple of the arithmetic mean of all elements in the array, ignoring NaNs,
and the number of non-NaN values in the array.
"""
function mean_count(x::AbstractArray{T}; absolute = false) where T<:Number
    z = zero(eltype(x))
    sum = z
    count = 0
    @simd for i in x
        count += ifelse(isnan(i), 0, 1)
        if absolute; i = abs(i); end
        sum += ifelse(isnan(i), z, i)
    end
    result = sum / count
    return (result, count)
end


A = [1 2 3 4 5; 6 7 8 9 10]
mean(A)
mean_abs(A)

mean(convert.(Float64, A))
mean_abs(convert.(Float64, A))

B = [1 -2 3 -4 5; -6 7 -8 9 -10]
mean(B)
mean_abs(B)

proposal: findNaNmax

how do people feel about a PR with

julia> function findNaNmax(x::AbstractArray{T}) where T<:AbstractFloat
           result = convert(eltype(x), NaN)
           indmax = 0
           for (i,v) in enumerate(x)
               if !isnan(v)
                   if (isnan(result) || v > result)
                       result = v
                       indmax = i
                   end
               end
           end
           return (indmax,result)
       end
findNaNmax (generic function with 1 method)

julia> Y1 = rand(10_000_000);

julia> Y3 = ifelse.(rand(length(Y1)) .< 0.9, Y1, NaN);

julia> @btime NaNMath.maximum(Y3);
  21.103 ms (1 allocation: 16 bytes)

julia> @btime findNaNmax(Y3);
  23.513 ms (1 allocation: 32 bytes)

?

or maybe even

function findNaNmax2(x::AbstractArray{T}) where T<:AbstractFloat
    result = convert(eltype(x), NaN)
    indmax = 0
    @inbounds @simd for i in eachindex(x)
        v = x[i]
        if !isnan(v)
            if (isnan(result) || v > result)
                result = v
                indmax = i
            end
        end
    end
    return (indmax,result)
end

julia> @btime findNaNmax2(Y3);
  22.294 ms (1 allocation: 32 bytes)

Missing values support

I tested out this package on 0.7 with missings and got the following error:

I think this is because Statistics is not imported by NaNMath, so the function calls to NaNMath.mean can't fall back on Statistics.mean.

I think the solution here is to add import Statistics and add fallbacks for arguments of type Any, so that those are called when there are missing values. Does that seem feasible?

nm.mean([1.,2., missing])
> ERROR: WARNING: Base.mean is deprecated: it has been moved to the standard library package `Statistics`.
Add `using Statistics` to your imports.
 in module Base
MethodError: no method matching mean(::Array{Union{Missing, Float64},1})
You may have intended to import Base.mean
Closest candidates are:
  mean(::AbstractArray{T<:AbstractFloat,N} where N) where T<:AbstractFloat at /Users/peterdeffebach/.julia/packages/NaNMath/pEda/src/NaNMath.jl:167
Stacktrace:
 [1] top-level scope at none:0
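
Until such fallbacks exist, a user-side workaround could look like the sketch below. It assumes the Statistics standard library is available; `nanmean_skipmissing` is a hypothetical name, not a NaNMath function, and the sketch assumes at least one element is neither missing nor NaN.

```julia
using Statistics

# Hypothetical helper: drop `missing` first, then drop NaN, then average.
# Errors if nothing is left to average.
nanmean_skipmissing(x) = mean(Iterators.filter(!isnan, skipmissing(x)))

nanmean_skipmissing([1.0, 2.0, missing])        # 1.5
nanmean_skipmissing([1.0, 2.0, missing, NaN])   # 1.5
```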

Build error

Reporting as suggested by the error message:

Screen Shot 2019-11-20 at 20 04 22

Incorrectly tagged breaking change

I think that tagging this as a breaking change was a mistake because it is impossible for dropping a Julia version to result in broken code. cf. https://github.com/SciML/ColPrac#dropping-support-for-earlier-versions-of-julia.

This is a problem because NaNMath has a lot of dependents and they now need to update their compat entries for NaNMath if they wish to receive future updates (e.g. the changes made in v1.0.1).

One possible resolution would be to yank 1.0.0 and 1.0.1 from the registry and tag master as 0.3.8, the next breaking change could then be tagged as either 0.4.0 or 2.0.0.

Another possible resolution would be to update every package that depends on NaNMath to declare compatibility with 1.0.0. Concretely, I am aware of 35 packages that depend on NaNMath, 13 of which do not declare compatibility with v1.0.1:

StaticOptim
ClimateTools
Winston
HetaSimulator
HyperDualNumbers
RvSpectMLBase
GraphRecipes
YAAD
EAGO
ReverseDiffSparse
McCormick
LineSearches
RvSpectML

Resolution 1 requires fewer changes (there is no need to change the 22 registered packages that already declare support for NaNMath v1 or the 13 packages that do not declare support for v1) and eliminates an erroneous breaking change.

Resolution 2 lets this package keep the visually appealing 1.x version numbers.

cc @mlubin

Functions for arbitrary-dimension arrays

These functions are also often applied to e.g. Matrix{<:AbstractFloat} with NaNs, but currently there are only defined methods for Vectors. Are there plans to make them general?
I guess it could be done with @generated functions using @nloops/@nref?
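
As a rough sketch of an alternative that needs no @generated code: plain iteration over an `AbstractArray` visits every element regardless of dimensionality, so a single method can cover vectors, matrices and higher-dimensional arrays. The name `nanmaximum` is hypothetical and this is not the package's implementation.

```julia
# Hypothetical helper: NaN-ignoring maximum for arrays of any dimension.
# Returns NaN if every element is NaN (or the array is empty).
function nanmaximum(x::AbstractArray{T}) where {T<:AbstractFloat}
    result = T(NaN)
    for v in x                      # iterates all elements, any ndims
        if !isnan(v) && (isnan(result) || v > result)
            result = v
        end
    end
    return result
end

M = [1.0 NaN;
     3.0 2.0]
nanmaximum(M)            # 3.0
nanmaximum(fill(NaN, 2, 2, 2))   # NaN
```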

StatsBase.mean ~= NaNMath.mean

NaNMath.mean and StatsBase.mean do not produce the same result

using NaNMath
using StatsBase
temp = rand(100)
NaNMath.mean(temp) == StatsBase.mean(temp)

This evaluates to false.
The discrepancy seems to appear only after digit 15:
round(mean(temp),digits=16) == round(NaNMath.mean(temp),digits=16)
Any idea why?
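
One plausible explanation, not confirmed in this thread, is that the two implementations add the elements in a different order. Floating-point addition is not associative, so a different summation order can change the last digits of the result. A minimal illustration using Base only:

```julia
# Same three numbers, two different groupings.
a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)

a == b           # false: the results differ in the last bit
isapprox(a, b)   # true: they agree to within floating-point tolerance
```

If that is the cause here, comparing the two means with `isapprox` (or `≈`) rather than `==` would be the appropriate check.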

Performance compared to other implementations & supported types

Hi,

I noticed that if I implement my own log function where I take care of the branch-cut by hand, it performs better than the NaNMath.log:

julia> using NaNMath

julia> using BenchmarkTools

julia> function log_nan(x::T)::T where {T<:Real}
           x <= T(0) && return T(NaN)
           return log(x)
       end
log_nan (generic function with 1 method)

julia> X = rand(Float64, 50_000_000) .* 2 .- 1;

julia> @btime NaNMath.log.(X);
  594.144 ms (5 allocations: 381.47 MiB)

julia> @btime log_nan.(X);
  436.285 ms (5 allocations: 381.47 MiB)

This shows that NaNMath.log is about 36% slower than log_nan. (This issue was discovered in the discussion here [1].)

An implementation like log_nan above also provides support for Float16 and BigFloat, both of which give a StackOverflowError with NaNMath.log:

julia> NaNMath.log(Float16(0.123))
ERROR: StackOverflowError:
Stacktrace:
 [1] log(x::Float16) (repeats 79984 times)
   @ NaNMath ~/.julia/packages/NaNMath/fmhcd/src/NaNMath.jl:10

julia> NaNMath.log(BigFloat(0.123))
ERROR: StackOverflowError:
Stacktrace:
 [1] log(x::BigFloat) (repeats 79984 times)
   @ NaNMath ~/.julia/packages/NaNMath/fmhcd/src/NaNMath.jl:10

I have only tested log, other functions in NaNMath may have similar issues.

[1] MilesCranmer/SymbolicRegression.jl#109

Support median function

It would be useful for the package to support the median function as well. I will give it a go.

no methods implemented for integer arrays

Maybe this is a specific case of #17, but I don't understand everything in that issue. All users (even myself!) will understand this issue:

julia> mean([1 2 3])
2.0

julia> NaNMath.mean([1 2 3])
ERROR: MethodError: no method matching mean(::Array{Int64,2})
You may have intended to import Base.mean
Closest candidates are:
  mean(::AbstractArray{T<:AbstractFloat,N} where N) where T<:AbstractFloat at /home/mhu027/.julia/v0.6/NaNMath/src/NaNMath.jl:163

julia> NaNMath.mean([1 2 π])
2.047197551196598

It would be a useful feature if the NaNMath routines accepted more types, such as integers.

I am using Julia 0.6.2 and NaNMath 0.3.0.

WARNING: both NaNMath and Base export "log"; uses of it in module Main must be qualified

I'm using the NaNMath package for its mean and standard deviation functions, but after I write
using NaNMath; nm = NaNMath
I first get a warning that says "WARNING: using NaNMath.log in module Main conflicts with an existing identifier." Then, when I try to use the log function (the usual log from Base), I get an error saying log is not defined.
How can I fix this issue?
Thanks!
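
A common way around this kind of name clash is to load the module with `import` rather than `using`, so that only the module name is brought into scope and every call is qualified. The sketch below uses a small stand-in module, `DemoNaN` (hypothetical), instead of NaNMath itself so that it runs without installing anything; the same pattern would be `import NaNMath; const nm = NaNMath`.

```julia
# Hypothetical stand-in for a package that defines its own `log`.
module DemoNaN
log(x::Real) = x < 0 ? NaN : Base.log(x)
end

import .DemoNaN        # binds only the module name, not `log`
const dn = DemoNaN     # short alias, like `nm` in the README

dn.log(-1.0)   # NaN, the module's version
log(2.0)       # Base.log is still available unqualified
```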

Functions not compatible with all Base-compatible types

We've tried replacing Base.maximum with NaNMath.maximum in Plots, as we very often deal with Vectors with NaNs in them. NaNs are e.g. used as placeholders for breaks in line segments.
However, this causes the package to fail, as sometimes maximum will be called on something that is not a Vector{AbstractFloat}:

ERROR: MethodError: no method matching maximum(::Base.Generator{Array{String,1},Base.#length})
You may have intended to import Base.maximum
Closest candidates are:
  maximum(::Array{T<:AbstractFloat,1}) where T<:AbstractFloat at /Users/michael/.julia/v0.6/NaNMath/src/NaNMath.jl:85

AFAICS we cannot guarantee in advance what type gets passed to maximum. Is there a workaround for this?
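
One possible workaround on the caller's side, sketched under the assumption that a thin wrapper is acceptable: dispatch to a NaN-skipping method for floating-point arrays and fall back to Base's `maximum` for everything else. `safe_maximum` is a hypothetical name, not part of NaNMath or Plots.

```julia
# Specialized method: skip NaNs in floating-point arrays.
# Errors if every element is NaN, because nothing is left to reduce.
safe_maximum(x::AbstractArray{<:AbstractFloat}) =
    maximum(Iterators.filter(!isnan, x))

# Generic fallback: anything else goes straight to Base.maximum.
safe_maximum(x) = maximum(x)

safe_maximum([1.0, NaN, 3.0])                   # 3.0
safe_maximum(length(s) for s in ["a", "bcd"])   # 3, via the fallback
```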

NaNMath functions should fall back to the default method

julia> asin(1+1im)
0.6662394324925153 + 1.0612750619050357im

julia> using NaNMath

julia> NaNMath.asin(1+1im)
ERROR: MethodError: no method matching asin(::Complex{Int64})
You may have intended to import Base.asin

Closest candidates are:
  asin(::DualNumbers.Dual)
   @ DualNumbers ~/.julia/packages/DualNumbers/5knFX/src/dual.jl:327
  asin(::PolyForm{<:Number})
   @ SymbolicUtils ~/src/julia/SymbolicUtils/src/methods.jl:87
  asin(::SymbolicUtils.BasicSymbolic{<:Number})
   @ SymbolicUtils ~/src/julia/SymbolicUtils/src/methods.jl:87
  ...

Stacktrace:
 [1] top-level scope
   @ REPL[7]:1
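
The requested behaviour could be sketched with two methods: a NaN-returning one for real arguments and a pass-through for every other number type. `nanasin` is a hypothetical name used only for illustration and does not describe NaNMath's actual method table.

```julia
# Real input: return NaN outside [-1, 1] instead of throwing a DomainError.
nanasin(x::Real) = abs(x) > 1 ? oftype(float(x), NaN) : asin(x)

# Any other number (e.g. Complex): defer to Base.asin unchanged.
nanasin(x::Number) = asin(x)

nanasin(2.0)       # NaN
nanasin(1 + 1im)   # same value as asin(1 + 1im)
```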

Glib_jll downgrade

I just wanted to point out that Pkg.add("NaNMath") induces a downgrade of Glib_jll from v2.68 to v2.59, which was released only 6 days ago. Probably not a big deal, but thought I'd mention it.

pow errors out for vector call

This

NaNMath.pow([1.], 3.)

causes a stack overflow. It should probably either work, or throw a method not defined error, right?
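
For element-wise use, the usual Julia idiom is to keep the function scalar and apply it with broadcasting (the dot syntax). The sketch below defines a hypothetical scalar `nanpow` on top of Base's `^`; it is not NaNMath's implementation and does not explain the stack overflow, it only shows the broadcast call.

```julia
# Hypothetical scalar helper: NaN for a negative base with a
# non-integer exponent, where Base's ^ would throw a DomainError.
function nanpow(x::Real, y::Real)
    xf, yf = promote(float(x), float(y))
    return (xf < 0 && yf != round(yf)) ? oftype(xf, NaN) : xf^yf
end

nanpow(-1.5, 2.3)            # NaN
nanpow.([1.0, -1.5], 2.3)    # element-wise: second entry is NaN
```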
