flashweave.jl's People

Contributors

femtocleaner[bot], jtackm

flashweave.jl's Issues

learn_network defaults

The learn_network doc shows:

help?> learn_network
search: learn_network

  learn_network(data_path::AbstractString, meta_data_path::AbstractString) -> FWResult{<:Integer}

  Works like learn_network(data::AbstractArray{<:Real, 2}), but instead of a data
  matrix takes file paths to an OTU table and optionally a meta data table as an
  input.

    •  data_path - path to a file storing an OTU count matrix (and JLD2 meta
       data)

    •  meta_data_path - optional path to a file with meta data

    •  *_key - HDF5 keys to access data sets with OTU counts, Meta variables and
       variable names in a JLD2 file. If a data item is absent the corresponding
       key should be 'nothing'. See '?load_data' for additional information.

    •  verbose - print progress information

    •  transposed - if true, rows of data are variables and columns are samples

    •  kwargs... - additional keyword arguments passed to
       learn_network(data::AbstractArray{<:Real, 2})

  ────────────────────────────────────────────────────────────────────────────────────

  learn_network(data::AbstractArray{<:Real, 2}) -> FWResult{<:Integer}

  Learn an interaction network from a data matrix (including OTUs and optionally meta
  variables).

    •  data - data matrix with information on OTU counts and (optionally) meta
       variables

    •  header - names of variable columns in data

    •  meta_mask - true/false mask indicating which variables are meta variables

  Algorithmic parameters

    •  heterogeneous - enable heterogeneous mode for multi-habitat or -protocol
       data with at least thousands of samples (FlashWeaveHE)

    •  sensitive - enable fine-grained association prediction (FlashWeave-S,
       FlashWeaveHE-S), sensitive=false results in the fast modes (FlashWeave-F,
       FlashWeaveHE-F)

    •  max_k - maximum size of conditioning sets, high values can lead to the
       removal of more spurious edgens, but may also strongly increase runtime
       and reduce statistical power. max_k=0 results in no conditioning
       (univariate mode)

    •  alpha - statistical significance threshold at which individual edges are
       accepted

    •  conv - convergence threshold, e.g. if conv=0.01 assume convergence if the
       number of edges increased by only 1% after 100% more runtime (checked in
       intervals)

    •  feed_forward - enable feed-forward heuristic

    •  fast_elim - enable fast-elimiation heuristic

    •  max_tests - maximum number of conditional tests that is performed on a
       variable pair before association is assumed

    •  hps - reliability criterion for statistical tests when sensitive=false

    •  FDR - perform False Discovery Rate correction (Benjamini-Hochberg method)
       on pairwise associations

    •  n_obs_min - don't compute associations between variables having less
       reliable samples (non-zero samples if heterogeneous=true) than this
       number. -1: automatically choose a threshold.

    •  time_limit - if feed-forward heuristic is active, determines the interval
       (seconds) at which neighborhood information is updated

  General parameters

    •  normalize - automatically choose and perform data normalization method
       (based on sensitive and heterogeneous)

    •  track_rejections - store for each discarded edge, which variable set lead
       to its exclusion (can be memory intense for large networks)

    •  verbose - print progress information

    •  transposed - if true, rows of data are variables and columns are samples

    •  prec - precision in bits to use for calculations (16, 32, 64 or 128)

    •  make_sparse - use a sparse data representation (should be left at true in
       almost all cases)

    •  make_onehot - create one-hot encodings for meta data variables with more
       than two categories (should be left at true in almost all cases)

    •  update_interval - if verbose=true, determines the interval (seconds) at
       which network stat updates are printed

What are the defaults for these parameters (e.g., prec)?
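
On the FDR entry in the docstring quoted above: the Benjamini-Hochberg adjustment it names can be sketched in a few lines. This is a generic illustration of the method, not FlashWeave's own implementation; the function name `benjamini_hochberg` is made up for this example.

```julia
# Generic Benjamini-Hochberg adjustment (illustrative; not taken from FlashWeave).
# Returns adjusted p-values in the same order as the input.
function benjamini_hochberg(pvals::AbstractVector{<:Real})
    m = length(pvals)
    order = sortperm(pvals)              # indices from smallest to largest p-value
    adjusted = similar(pvals, Float64)
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity can be enforced.
    for rank in m:-1:1
        idx = order[rank]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    end
    return adjusted
end
```

An edge would then be kept if its adjusted p-value is at most alpha.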

LightGraphs.betweenness_centrality() runs indefinitely when passed a graph with negative values

I found another peculiar bug.

LightGraphs.betweenness_centrality() keeps running until RAM is full and then crashes when its Graph object is created from a FlashWeave object, but not when it is created from a GML file.

using BenchmarkTools
using FlashWeave
using ParserCombinator
using GraphIO.GML
using LightGraphs

ID                  = 1001
ROOT                = "/home/jonas/Repos/Thesis/data/"
data_path           = "$(ROOT)$(ID)/processed_data/1_otu_table.biom"
gml_no_meta_path    = "$(ROOT)$(ID)/network_no_meta.gml"

netw_results_no_meta = FlashWeave.learn_network(data_path,
                                        sensitive     = true,
                                        heterogeneous = false)

save_network(gml_no_meta_path, netw_results_no_meta)

# {6558, 8484} undirected simple Int64 graph with Float64 weights
G = graph(netw_results_no_meta)

# {6558, 8484} undirected simple Int64 graph
g = loadgraph(gml_no_meta_path, GMLFormat())

@time centrality_arr  = betweenness_centrality(g)
#12.467820 seconds (45.30 M allocations: 7.531 GiB, 2.73% gc time)

@time centrality_arr  = betweenness_centrality(G)
# steadily increasing RAM usage until RAM is full and julia crashes.

pkg> status
Status `/mnt/nvme0n1p4/.julia/environments/v1.5/Project.toml`
  ...
  [2be3f83a] FlashWeave v0.18.0
  ...
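
The issue title attributes the hang to negative values. Dijkstra-type shortest-path routines assume non-negative edge weights, so one possible workaround is to check for negative weights and strip the sign before running a centrality analysis. The sketch below works on a plain sparse weight matrix; the helper names are hypothetical, and how to obtain such a matrix from a FlashWeave result is left open here. Note that taking absolute values changes the interpretation of negative associations.

```julia
using SparseArrays

# Hypothetical helper: true if any stored edge weight is negative.
has_negative_weights(W::SparseMatrixCSC{<:Real}) = any(<(0), nonzeros(W))

# Hypothetical helper: same sparsity pattern, absolute values as weights.
function nonnegative_weights(W::SparseMatrixCSC{<:Real})
    return abs.(W)
end
```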

Singularity/Docker

Hi,
Do you have any plans to release a Singularity/Docker version of FlashWeave?
Thanks

FlashWeave v0.19.0 fails biom_hdf5 test

After adding FlashWeave, I ran the test suite (pkg> test FlashWeave), which produced

Error in 'load_biom'. File data/HMP_SRA_gut/HMP_SRA_gut_tiny_hdf5.biom seems not to be valid .biom

and (after all the stacktrace) the table

Test Summary:  | Pass  Error  Total  Time
table data     |   16      1     17  5.1s
  tsv          |    4             4  1.7s
  tsv_rownames |    4             4  0.1s
  csv          |    4             4  0.0s
  biom_json    |    4             4  1.5s
  biom_hdf5    |           1      1  1.7s

Package version:

status FlashWeave
Status `~/.julia/environments/v1.8/Project.toml`
  [2be3f83a] FlashWeave v0.19.0

Julia and OS info

versioninfo()

Julia Version 1.8.2
Commit 36034abf260 (2022-09-29 15:21 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 12 × Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
  Threads: 1 on 12 virtual cores

The same error is produced by

julia> using FlashWeave

julia> load_data("Builds/FlashWeave.jl/test/data/HMP_SRA_gut/HMP_SRA_gut_tiny_hdf5.biom")

ERROR: Error in 'load_biom'. File Builds/FlashWeave.jl/test/data/HMP_SRA_gut/HMP_SRA_gut_tiny_hdf5.biom seems not to be valid .biom
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:35
 [2] load_biom(data_path::String, meta_path::Nothing)
   @ FlashWeave ~/.julia/packages/FlashWeave/sniAe/src/io.jl:229
 [3] load_data(data_path::String, meta_path::Nothing; transposed::Bool, otu_data_key::String, meta_data_key::String, otu_header_key::String, meta_header_key::String)
   @ FlashWeave ~/.julia/packages/FlashWeave/sniAe/src/io.jl:46
 [4] load_data (repeats 2 times)
   @ ~/.julia/packages/FlashWeave/sniAe/src/io.jl:29 [inlined]
 [5] top-level scope
   @ REPL[3]:1

caused by: Unexpected character
Line: 0
Around: ...HDF     ...
            ^

Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] _error(message::String, ps::JSON.Parser.MemoryParserState)
    @ JSON.Parser ~/.julia/packages/JSON/NeJ9k/src/Parser.jl:140
  [3] parse_jsconstant(#unused#::JSON.Parser.ParserContext{Dict{String, Any}, Int64, true, nothing}, ps::JSON.Parser.MemoryParserState)
    @ JSON.Parser ~/.julia/packages/JSON/NeJ9k/src/Parser.jl:193
  [4] parse_value(pc::JSON.Parser.ParserContext{Dict{String, Any}, Int64, true, nothing}, ps::JSON.Parser.MemoryParserState)
    @ JSON.Parser ~/.julia/packages/JSON/NeJ9k/src/Parser.jl:170
  [5] parse(str::String; dicttype::Type, inttype::Type{Int64}, allownan::Bool, null::Nothing)
    @ JSON.Parser ~/.julia/packages/JSON/NeJ9k/src/Parser.jl:450
  [6] (::JSON.Parser.var"#4#5"{DataType, DataType, Nothing, Bool, Bool, Int64})(io::IOStream)
    @ JSON.Parser ~/.julia/packages/JSON/NeJ9k/src/Parser.jl:511
  [7] open(f::JSON.Parser.var"#4#5"{DataType, DataType, Nothing, Bool, Bool, Int64}, args::String; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Base ./io.jl:384
  [8] open
    @ ./io.jl:381 [inlined]
  [9] #parsefile#3
    @ ~/.julia/packages/JSON/NeJ9k/src/Parser.jl:509 [inlined]
 [10] parsefile
    @ ~/.julia/packages/JSON/NeJ9k/src/Parser.jl:502 [inlined]
 [11] load_biom_json(data_path::String)
    @ FlashWeave ~/.julia/packages/FlashWeave/sniAe/src/io.jl:195
 [12] load_biom(data_path::String, meta_path::Nothing)
    @ FlashWeave ~/.julia/packages/FlashWeave/sniAe/src/io.jl:227
 [13] load_data(data_path::String, meta_path::Nothing; transposed::Bool, otu_data_key::String, meta_data_key::String, otu_header_key::String, meta_header_key::String)
    @ FlashWeave ~/.julia/packages/FlashWeave/sniAe/src/io.jl:46
 [14] load_data (repeats 2 times)
    @ ~/.julia/packages/FlashWeave/sniAe/src/io.jl:29 [inlined]
 [15] top-level scope
    @ REPL[3]:1

caused by: MethodError: no method matching read(::Vector{Int64})
Closest candidates are:
  read(::Union{Base.DevNull, Core.CoreSTDERR, Core.CoreSTDOUT}, ::Type{UInt8}) at coreio.jl:23
  read(::Union{HDF5.Attribute, HDF5.Dataset}) at ~/.julia/packages/HDF5/NqGY2/src/readwrite.jl:36
  read(::Union{HDF5.Attribute, HDF5.Dataset}, ::Type{String}, ::Any...) at ~/.julia/packages/HDF5/NqGY2/src/readwrite.jl:61
  ...
Stacktrace:
 [1] load_biom_hdf5(data_path::String)
   @ FlashWeave ~/.julia/packages/FlashWeave/sniAe/src/io.jl:212
 [2] load_biom(data_path::String, meta_path::Nothing)
   @ FlashWeave ~/.julia/packages/FlashWeave/sniAe/src/io.jl:224
 [3] load_data(data_path::String, meta_path::Nothing; transposed::Bool, otu_data_key::String, meta_data_key::String, otu_header_key::String, meta_header_key::String)
   @ FlashWeave ~/.julia/packages/FlashWeave/sniAe/src/io.jl:46
 [4] load_data (repeats 2 times)
   @ ~/.julia/packages/FlashWeave/sniAe/src/io.jl:29 [inlined]
 [5] top-level scope
   @ REPL[3]:1
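
The JSON parser in the trace above stops at "HDF" on line 0, which is consistent with the file starting with the HDF5 signature. That would suggest the file itself is HDF5 and the failure lies in the HDF5 reading branch. A minimal check of the first eight bytes, as a sketch (the function name is hypothetical, and only a signature at offset 0 is checked):

```julia
# The 8-byte HDF5 format signature: \x89 H D F \r \n \x1a \n
const HDF5_SIGNATURE = UInt8[0x89, 0x48, 0x44, 0x46, 0x0d, 0x0a, 0x1a, 0x0a]

# Hypothetical helper: does the file start with the HDF5 signature?
function looks_like_hdf5(path::AbstractString)
    open(path, "r") do io
        header = read(io, length(HDF5_SIGNATURE))  # reads at most 8 bytes
        return header == HDF5_SIGNATURE
    end
end
```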

Failed to install FlashWeave

I am just starting to learn Julia. I would like to use FlashWeave to analyze my data, but the installation failed.
The error is: Unable to automatically install "OpenSpecFun" from C:.julia\packages\OpenSpecFun_jll\G7uzt\Artifacts.toml
How can I solve this error?
Thanks for your help

Fix hangs in interleaved parallelism (Julia 1.2+)

Interleaved parallel computations with many cores can occasionally get stuck on Julia 1.2 and 1.3, possibly due to internal changes in the implementation of Julia Channels. Likely solved by removing the dependency of StackChannels on these internals.

Interpretation of high amount of edges where weight = 0

Hello,

When generating a network with homogeneous and sensitive FlashWeave, the majority of edges have a weight of zero. I am having trouble interpreting this result, as I would think a weight of zero means no edge should have been created in the first place. Am I misunderstanding the result? Or can it be due to insufficiencies in the dataset? When generating the network I get several instances of the following warning:

Postprocessing..
┌ Warning: Opposite signs for edge 828 <-> 1178 detected. Arbitarily choosing one

which possibly indicates the dataset is not analyzed properly, but I am unable to figure out the exact problem.

I apologize if this is trivial, I can provide more info if needed. Thanks in advance!

Add metadata support for BIOM

Support reading more fields from HDF5-based BIOM, including metadata, as a performant replacement for JLD2. This may turn out to be a separate package, though.

integer as category instead of numeric

Hi,

When I use FlashWeave with metadata and a variable such as "age" is an integer, FlashWeave treats it as categorical and one-hot encodes it. How can I indicate which metadata variables are numeric and which are categorical?

This is related to issue #12, but the solution there did not work for me.

thank you!
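
To illustrate what treating an integer column as categorical implies, here is a generic one-hot encoding sketch. It is not FlashWeave's code, and the function name `onehot` is made up for this example. Each distinct age becomes its own true/false column, so the ordering between ages is lost.

```julia
# Generic one-hot encoding (illustrative; not taken from FlashWeave).
# Returns the sorted distinct levels and a samples x levels Boolean matrix.
function onehot(values::AbstractVector)
    levels = sort(unique(values))
    encoded = falses(length(values), length(levels))
    for (i, v) in enumerate(values)
        encoded[i, findfirst(==(v), levels)] = true
    end
    return levels, encoded
end

# Four samples with three distinct ages give three indicator columns.
levels, encoded = onehot([23, 35, 23, 61])
```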

Prepare for Julia 1.0

Most dependencies are 1.0-ready now; I will hopefully manage the update within the next few weeks.

This should allow

  • quicker & easier installation due to Pkg3
  • smoother experience for people using Julia for the first time
  • performance improvements

visualisation

Hello,
I managed to run FlashWeave with 9 samples (and 8 metadata environmental factors) and hundreds of OTUs.
I exported the network in .edgelist as well as .gml format and am trying to visualise it but I can't find a good tutorial anywhere. I tried opening in Cytoscape and .edgelist doesn't work and .gml just gives me what I think is one node (???), a solid rectangle and that's it.
In my FlashWeave output .gml file it says mv 0 or 1 for all nodes (I don't even know what that means?) and different weight edges further down. Can someone please explain how to visualise the FlashWeave network output in Cytoscape or elsewhere? I'm pretty lost. Thanks so much.

---My .gml file looks like this:---
graph [
directed 0
node [
id 1
label "unassigned"
mv 0
]
node [
id 2
label "SAR86 cluster bacterium SAR86B [ref_mOTU_v31_07136]"
mv 0
(...)
]
node [
id 621
label "8"
mv 1
]
edge [
source 3
target 8
weight 0.9157193303108215
]
edge [
source 2
target 11
weight 0.9653017520904541
]

---And my .edgelist file looks like this:---

header unassigned,SAR86 cluster bacterium SAR86B [ref_mOTU_v31_07136],Rhodobacteraceae species incertae sedis [ext_mOTU_v31_15985],alpha proteobacterium HIMB5 [ref_mOTU_v31_11972],Pelagibacteraceae species incertae sedis [meta_mOTU_v31_12642],alpha proteobacterium HIMB59 [ref_mOTU_v31_11871],Candidatus Pelagibacter ubique [ref_mOTU_v31_10655],Rhodobacteraceae bacterium HIMB11 [ref_mOTU_v31_08050],Flavobacteriaceae species incertae sedis [meta_mOTU_v31_13920],Synechococcus sp. [ref_mOTU_v31_01920],Proteobacteria species incertae sedis ext_mOTU_v31_17108
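
One way to get such a file into a form that table-based importers accept (Cytoscape, for example, can import delimited tables as networks) is to flatten the GML into a source/target/weight table. The sketch below is a minimal line-based reader that only handles GML laid out like the excerpt above, with one key-value pair per line; the function names are hypothetical. The "mv" field presumably flags meta variables, since the FlashWeave run log abbreviates them as "MVs", but that is an assumption.

```julia
# Minimal reader for GML laid out as in the excerpt above (one key-value pair per line).
# Returns a vector of (source label, target label, weight) tuples.
function gml_to_edge_table(gml_text::AbstractString)
    labels = Dict{Int,String}()
    edges = Tuple{Int,Int,Float64}[]
    block = ""                         # "node", "edge", or "" when outside both
    fields = Dict{String,String}()
    for raw in split(gml_text, '\n')
        line = strip(raw)
        if line == "node [" || line == "edge ["
            block = String(first(split(line)))
            empty!(fields)
        elseif line == "]" && block != ""
            if block == "node"
                labels[parse(Int, fields["id"])] = strip(fields["label"], '"')
            else
                push!(edges, (parse(Int, fields["source"]),
                              parse(Int, fields["target"]),
                              parse(Float64, fields["weight"])))
            end
            block = ""
        elseif block != "" && occursin(' ', line)
            key, value = split(line, ' '; limit=2)
            fields[key] = value
        end
    end
    # Replace numeric node ids by their labels where a label is known.
    return [(get(labels, s, string(s)), get(labels, t, string(t)), w) for (s, t, w) in edges]
end

# Write the table tab-separated, since labels may contain commas.
function write_edge_table(io::IO, rows)
    println(io, "source\ttarget\tweight")
    for (s, t, w) in rows
        println(io, s, '\t', t, '\t', w)
    end
end
```

Usage would be along the lines of `write_edge_table(stdout, gml_to_edge_table(read("network.gml", String)))`, where the file name is a placeholder.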

new bug introduced with fix for #21

Hey Janko,

I'm afraid your fix for #21 introduced a bug.

before

(before) pkg> status
Project before v0.1.0
Status `~/before/Project.toml`
  [2be3f83a] FlashWeave v0.18.0

julia> using Distributed

julia> addprocs(1)
1-element Array{Int64,1}:
 2

julia> @show Distributed.procs()
Distributed.procs() = [1, 2]
2-element Array{Int64,1}:
 1
 2

julia> @everywhere using FlashWeave
[ Info: Precompiling FlashWeave [2be3f83a-7913-5748-9f20-7d448995b934]

julia> ID          = 1001
1001

julia> ROOT        = "/home/jonas/Repos/Thesis/data/"
"/home/jonas/Repos/Thesis/data/"

julia> data_path   = "$(ROOT)$(ID)/processed_data/1_otu_table.biom"
"/home/jonas/Repos/Thesis/data/1001/processed_data/1_otu_table.biom"

julia> netw_results  = FlashWeave.learn_network(data_path,
                                               sensitive     = true,
                                               heterogeneous = false)

### Loading data ###

Inferring network with FlashWeave - sensitive (conditional)

	Run information:
	sensitive - true
	heterogeneous - false
	max_k - 3
	alpha - 0.01
	sparse - true
	workers - 1
	OTUs - 6558
	MVs - 0

### Normalizing ###

Removing variables with 0 variance (or equivalently 1 level) and samples with 0 reads
Discarded 0 samples and 0 variables.

Normalization
┌ Warning: adaptive pseudo-counts for 3 samples were lower than machine precision due to insufficient counts, removing them
└ @ FlashWeave ~/.julia/packages/FlashWeave/464SQ/src/preprocessing.jl:125

### Learning interactions ###

Setting 'time_limit' to 13.0 s.
Automatically setting 'n_obs_min' to 20 for enhanced reliability.
Computing univariate associations..

Univariate degree stats:
Summary Stats:
Length:         6558
Missing Count:  0
Mean:           114.928637
Minimum:        0.000000
1st Quartile:   37.000000
Median:         124.000000
3rd Quartile:   179.000000
Maximum:        356.000000



Starting conditioning search..

Preparing workers..

Done. Starting inference..
Starting convergence checks at 7733 edges.
Latest convergence step change: 0.83625

Postprocessing..
Complete.

Finished inference. Total time taken: 19.211s

Mode:
FlashWeave - sensitive (conditional)

Network:
8491 interactions between 6558 variables (6558 OTUs and 0 MVs)

Unfinished variables:
none

Rejections:
not tracked


julia>

after

(after) pkg> status
Project after v0.1.0
Status `~/after/Project.toml`
  [2be3f83a] FlashWeave v0.18.0 `https://github.com/meringlab/FlashWeave.jl.git#master`

julia> using Distributed

julia> addprocs(1)
1-element Array{Int64,1}:
 2

julia> @show Distributed.procs()
Distributed.procs() = [1, 2]
2-element Array{Int64,1}:
 1
 2

julia> @everywhere using FlashWeave

julia> ID          = 1001
1001

julia> ROOT        = "/home/jonas/Repos/Thesis/data/"
"/home/jonas/Repos/Thesis/data/"

julia> data_path   = "$(ROOT)$(ID)/processed_data/1_otu_table.biom"
"/home/jonas/Repos/Thesis/data/1001/processed_data/1_otu_table.biom"

julia> netw_results  = FlashWeave.learn_network(data_path,
                                               sensitive     = true,
                                               heterogeneous = false)

### Loading data ###

### Normalizing ###

Removing variables with 0 variance (or equivalently 1 level) and samples with 0 reads
	-> no samples or variables discarded

Normalization
┌ Warning: adaptive pseudo-counts for 3 samples were lower than machine precision due to insufficient counts, removing them
└ @ FlashWeave ~/.julia/packages/FlashWeave/9pt8o/src/preprocessing.jl:125

### Learning interactions ###

Inferring network with FlashWeave - sensitive (conditional)

	Run information:
	sensitive - true
	heterogeneous - false
	max_k - 3
	alpha - 0.01
	sparse - false
	workers - 1
	OTUs - 6558
	MVs - 0

Setting 'time_limit' to 13.0 s.
Automatically setting 'n_obs_min' to 20 for enhanced reliability
Computing univariate associations
ERROR: On worker 2:
UndefVarError: #55#56 not defined
deserialize_datatype at /opt/julia/usr/share/julia/stdlib/v1.5/Serialization/src/Serialization.jl:1252
handle_deserialize at /opt/julia/usr/share/julia/stdlib/v1.5/Serialization/src/Serialization.jl:826
deserialize at /opt/julia/usr/share/julia/stdlib/v1.5/Serialization/src/Serialization.jl:773
handle_deserialize at /opt/julia/usr/share/julia/stdlib/v1.5/Serialization/src/Serialization.jl:833
deserialize at /opt/julia/usr/share/julia/stdlib/v1.5/Serialization/src/Serialization.jl:773 [inlined]
deserialize_msg at /opt/julia/usr/share/julia/stdlib/v1.5/Distributed/src/messages.jl:99
#invokelatest#1 at ./essentials.jl:710 [inlined]
invokelatest at ./essentials.jl:709 [inlined]
message_handler_loop at /opt/julia/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:185
process_tcp_streams at /opt/julia/usr/share/julia/stdlib/v1.5/Distributed/src/process_messages.jl:142
#99 at ./task.jl:356
Stacktrace:
 [1] #remotecall_fetch#143 at /opt/julia/usr/share/julia/stdlib/v1.5/Distributed/src/remotecall.jl:394 [inlined]
 [2] remotecall_fetch(::Function, ::Distributed.Worker) at /opt/julia/usr/share/julia/stdlib/v1.5/Distributed/src/remotecall.jl:386
 [3] remotecall_fetch(::Function, ::Int64; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /opt/julia/usr/share/julia/stdlib/v1.5/Distributed/src/remotecall.jl:421
 [4] remotecall_fetch at /opt/julia/usr/share/julia/stdlib/v1.5/Distributed/src/remotecall.jl:421 [inlined]
 [5] workers_all_local() at /home/jonas/.julia/packages/FlashWeave/9pt8o/src/misc.jl:96
 [6] prepare_univar_results(::Array{Float32,2}, ::String, ::Float64, ::Int64, ::Int64, ::Bool, ::Array{Int32,1}, ::String, ::Array{Float32,2}, ::Bool, ::Bool, ::String) at /home/jonas/.julia/packages/FlashWeave/9pt8o/src/learning.jl:86
 [7] LGL(::Array{Float32,2}; test_name::String, max_k::Int64, alpha::Float64, hps::Int64, n_obs_min::Int64, max_tests::Int64, convergence_threshold::Float64, FDR::Bool, parallel::String, fast_elim::Bool, no_red_tests::Bool, weight_type::String, edge_rule::String, nonsparse_cond::Bool, verbose::Bool, update_interval::Float64, edge_merge_fun::typeof(FlashWeave.maxweight), tmp_folder::String, debug::Int64, time_limit::Float64, header::Array{String,1}, meta_variable_mask::Nothing, dense_cor::Bool, recursive_pcor::Bool, cache_pcor::Bool, correct_reliable_only::Bool, feed_forward::Bool, track_rejections::Bool, all_univar_nbrs::Nothing, kill_remote_workers::Bool, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/jonas/.julia/packages/FlashWeave/9pt8o/src/learning.jl:229
 [8] macro expansion at ./timing.jl:310 [inlined]
 [9] learn_network(::SparseArrays.SparseMatrixCSC{Int64,Int64}; sensitive::Bool, heterogeneous::Bool, max_k::Int64, alpha::Float64, conv::Float64, header::Array{String,1}, meta_mask::BitArray{1}, feed_forward::Bool, fast_elim::Bool, normalize::Bool, track_rejections::Bool, verbose::Bool, transposed::Bool, prec::Int64, make_sparse::Bool, make_onehot::Bool, max_tests::Int64, hps::Int64, FDR::Bool, n_obs_min::Int64, cache_pcor::Bool, time_limit::Float64, update_interval::Float64, parallel_mode::String, extra_data::Nothing) at /home/jonas/.julia/packages/FlashWeave/9pt8o/src/learning.jl:526
 [10] learn_network(::String, ::Nothing; otu_data_key::String, otu_header_key::String, meta_data_key::String, meta_header_key::String, verbose::Bool, transposed::Bool, kwargs::Base.Iterators.Pairs{Symbol,Bool,Tuple{Symbol,Symbol},NamedTuple{(:sensitive, :heterogeneous),Tuple{Bool,Bool}}}) at /home/jonas/.julia/packages/FlashWeave/9pt8o/src/learning.jl:335
 [11] top-level scope at REPL[10]:1

julia> 

Export ERROR

Thank you for providing this useful tool.
When I try to export the network, save_network throws an error:

julia> save_network("/home/ubuntu/network_output.gml", netw_results, detailed=true)
ERROR: type FWResult has no field rejections
Stacktrace:
 [1] getproperty(::Any, ::Symbol) at ./sysimg.jl:18
 [2] save_rejections(::String, ::FlashWeave.FWResult{Int64}) at /home/ubuntu/.julia/packages/FlashWeave/F6Bpf/src/io.jl:228
 [3] #save_network#14(::Bool, ::Function, ::String, ::FlashWeave.FWResult{Int64}) at /home/ubuntu/.julia/packages/FlashWeave/F6Bpf/src/io.jl:76
 [4] (::getfield(FlashWeave, Symbol("#kw##save_network")))(::NamedTuple{(:detailed,),Tuple{Bool}}, ::typeof(save_network), ::String, ::FlashWeave.FWResult{Int64}) at ./none:0
 [5] top-level scope at none:0

However, the file is created.

graph() function does not open up

Hello,

I'm using FlashWeave with julia-1.5.0. When executing:

data_path = "/home/dmerges/davos_networks/HMP_SRA_gut_small.tsv"
netw_results = learn_network(data_path, sensitive=true, heterogeneous=false)
graph(netw_results)

All I get is: {50, 50} undirected simple Int64 graph with Float64 weights

The same happens with my own OTU table.

Any ideas are very welcome.

Many thanks and cheers,
Dominik

relationship about parameters heterogeneous=true and make_sparse

Hi,
I have two questions about the parameters of FlashWeave.
Q1: Based on the learn_network() help page, "make_sparse - use a sparse data representation (should be left at true in almost all cases)", sparse should always be set to true. However, it has been observed that even when make_sparse=true, sparse is still false in Run information when heterogeneous=false. On the other hand, when heterogeneous=true, sparse is automatically true. If the sparse parameter is bound to heterogeneous, what is its use for non-heterogeneous data?

Q2: There are only about 100 samples in my OTU matrix (seawater), so I set heterogeneous=false according to the help page (far fewer than thousands of samples). As a result, the degree distribution of this network approximates a Poisson distribution, similar to a random network. But the node degrees follow a power-law distribution when heterogeneous=true. Therefore, I would like to know what the basic requirements for heterogeneous data are.
The following three figures show the degree distribution of the network generated by different parameters/methods.
[three images of degree distributions, not reproduced]

Can you help me?

metadata is not masked when data is masked due to too few data points

Hello. I'm using FlashWeave for my Master Thesis, and I think I ran into a bug.

I discovered it while familiarizing myself with your software by practicing on a small dataset from Qiita: [ID 1001].

I posted the error message below:

netw_results = FlashWeave.learn_network(data_path, metadata_path,
                                        verbose       = true,
                                        sensitive     = true,  # causes an error if true
                                        heterogeneous = false)

ArgumentError: number of rows of each array must match (got (23, 26))
in top-level scope at Repos/Thesis/src/scripts/1001/1001makegraph.jl:82
in  at FlashWeave/464SQ/src/learning.jl:293
in #learn_network#123 at FlashWeave/464SQ/src/learning.jl:316
in  at FlashWeave/464SQ/src/learning.jl:383
in #learn_network#124 at FlashWeave/464SQ/src/learning.jl:439
in normalize_data##kw at FlashWeave/464SQ/src/preprocessing.jl:515 
in #normalize_data#159 at FlashWeave/464SQ/src/preprocessing.jl:532
in  at FlashWeave/464SQ/src/preprocessing.jl:461
in #preprocess_data_default#158 at FlashWeave/464SQ/src/preprocessing.jl:466
in preprocess_data##kw at FlashWeave/464SQ/src/preprocessing.jl:326 
in #preprocess_data#153 at FlashWeave/464SQ/src/preprocessing.jl:437
in hcat at stdlib/v1.5/SparseArrays/src/sparsevector.jl:1078
in typed_hcat at base/abstractarray.jl:1391 
in _typed_hcat at base/abstractarray.jl:1404

I read your code and I think I understand what is going wrong. adaptive_pseudocount!() masks and removes samples from the matrix. This function is called indirectly in preprocess_data() (via adaptive_clr!() and clrnorm()). After this point, however, the masked matrix is merged with the unmasked metadata. Because the two matrices no longer have the same number of rows, an error is thrown.

Preemptively dropping the affected samples from the metadata doesn't work either:

netw_results = FlashWeave.learn_network(data_path, meta_data_with_dropped_cols,
                                        verbose       = true,
                                        sensitive     = true,  # causes an error if true
                                        heterogeneous = false)

AssertionError: observations of data do not fit meta_data: 26 vs. 23
in top-level scope at Repos/Thesis/src/scripts/1001/1001makegraph.jl:82
in  at FlashWeave/464SQ/src/learning.jl:293
in #learn_network#123 at FlashWeave/464SQ/src/learning.jl:303
in  at FlashWeave/464SQ/src/misc.jl:13
in #check_data#46 at FlashWeave/464SQ/src/misc.jl:13

I haven't thought of an (easy) fix yet, but if you like, I don't mind spending some time on it.

I can provide you with my exact code if you'd like to reproduce the error.
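
The idea behind a fix could look roughly like this: whatever mask removes samples from the OTU matrix is applied to the metadata as well before the two are concatenated. This is only a sketch of the principle with a hypothetical function name, not a patch against FlashWeave's preprocessing code.

```julia
# Sketch: apply one sample mask to both matrices before merging them column-wise.
# `keep[i]` is true if sample (row) i survives normalization.
function drop_samples_consistently(otu::AbstractMatrix, meta::AbstractMatrix,
                                   keep::AbstractVector{Bool})
    size(otu, 1) == size(meta, 1) == length(keep) ||
        throw(DimensionMismatch("OTU data, metadata and mask must cover the same samples"))
    return hcat(otu[keep, :], meta[keep, :])
end
```

With 26 samples of which 3 are masked, both parts end up with 23 rows, so the hcat no longer fails.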

BIOM v2.1 from Qiime2

Hi,
I am trying to analyse microbiome data by using biom data generated by the most recent Qiime2 pipeline (qiime2-2020.8) but I get an error which I could not resolve so far.

julia> netw_results = learn_network(data_path, meta_data_path, sensitive=true, heterogeneous=false)

Loading data

ERROR: MethodError: no method matching zero(::SubString{String})
Closest candidates are:
zero(::Type{Missing}) at missing.jl:103
zero(::Type{Dates.Time}) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Dates\src\types.jl:406
zero(::Type{Dates.DateTime}) at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Dates\src\types.jl:404
...

Does anybody have an idea what this could mean?
I have attached the biom file created by Qiime2. (original name filtered_table.qza, but renamed to allow upload)

filtered_table_gza.zip

Unknown function

Thanks for developing this tool. I am having some trouble using it.
I tried to replicate the "Basic usage" instructions with the test data and got this:

Loading data

Normalizing

Removing variables with 0 variance (or equivalently 1 level) and samples with 0 reads
-> discarded 0 samples and 5 variables

Normalization
Illegal inttoptr
%42 = ptrtoint double addrspace(13)* %41 to i64
Illegal inttoptr
%57 = inttoptr i64 %56 to i8 addrspace(13)*

signal (6): Aborted
in expression starting at REPL[4]:1
gsignal at /usr/bin/../lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /usr/bin/../lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x7f0426e93d04)
_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE at /usr/bin/../lib/x86_64-linux-gnu/libLLVM-8.so.1 (unknown line)
_ZN4llvm13FPPassManager11runOnModuleERNS_6ModuleE at /usr/bin/../lib/x86_64-linux-gnu/libLLVM-8.so.1 (unknown line)
_ZN4llvm6legacy15PassManagerImpl3runERNS_6ModuleE at /usr/bin/../lib/x86_64-linux-gnu/libLLVM-8.so.1 (unknown line)
unknown function (ip: 0x7f0426f7dac1)
unknown function (ip: 0x7f0426f802d8)
unknown function (ip: 0x7f0426f808cd)
unknown function (ip: 0x7f0426ebbb4a)
unknown function (ip: 0x7f0426eed082)
unknown function (ip: 0x7f0426f1284b)
jl_apply_generic at /usr/bin/../lib/x86_64-linux-gnu/libjulia.so.1 (unknown line)
adaptive_pseudocount! at /home/pamela/.julia/packages/FlashWeave/rTWlo/src/preprocessing.jl:121
adaptive_clr! at /home/pamela/.julia/packages/FlashWeave/rTWlo/src/preprocessing.jl:170
clrnorm at /home/pamela/.julia/packages/FlashWeave/rTWlo/src/preprocessing.jl:289
unknown function (ip: 0x7f03fecdc57e)
#preprocess_data#170 at /home/pamela/.julia/packages/FlashWeave/rTWlo/src/preprocessing.jl:413
preprocess_data##kw at /home/pamela/.julia/packages/FlashWeave/rTWlo/src/preprocessing.jl:373 [inlined]
#preprocess_data_default#175 at /home/pamela/.julia/packages/FlashWeave/rTWlo/src/preprocessing.jl:522
preprocess_data_default##kw at /home/pamela/.julia/packages/FlashWeave/rTWlo/src/preprocessing.jl:517
#normalize_data#178 at /home/pamela/.julia/packages/FlashWeave/rTWlo/src/preprocessing.jl:629
normalize_data##kw at /home/pamela/.julia/packages/FlashWeave/rTWlo/src/preprocessing.jl:612 [inlined]
#learn_network#132 at /home/pamela/.julia/packages/FlashWeave/rTWlo/src/learning.jl:484
learn_network##kw at /home/pamela/.julia/packages/FlashWeave/rTWlo/src/learning.jl:434
#learn_network#130 at /home/pamela/.julia/packages/FlashWeave/rTWlo/src/learning.jl:335
unknown function (ip: 0x7f03fecabce6)
learn_network##kw at /home/pamela/.julia/packages/FlashWeave/rTWlo/src/learning.jl:330
unknown function (ip: 0x7f03fecaba35)
learn_network##kw at /home/pamela/.julia/packages/FlashWeave/rTWlo/src/learning.jl:330
unknown function (ip: 0x7f03fecab885)
unknown function (ip: 0x7f0426f2575b)
unknown function (ip: 0x7f0426f25389)
unknown function (ip: 0x7f0426f258f0)
unknown function (ip: 0x7f0426f269c8)
unknown function (ip: 0x7f0426f27616)
unknown function (ip: 0x7f0426f3fe08)
unknown function (ip: 0x7f0426f403c8)
jl_toplevel_eval_in at /usr/bin/../lib/x86_64-linux-gnu/libjulia.so.1 (unknown line)
eval at ./boot.jl:331
eval_user_input at /build/julia-98cBbp/julia-1.4.1+dfsg/usr/share/julia/stdlib/v1.4/REPL/src/REPL.jl:86
macro expansion at /build/julia-98cBbp/julia-1.4.1+dfsg/usr/share/julia/stdlib/v1.4/REPL/src/REPL.jl:118 [inlined]
#26 at ./task.jl:358
unknown function (ip: 0x7f0426f2acbb)
unknown function (ip: (nil))
Allocations: 86638321 (Pool: 86621688; Big: 16633); GC: 85
Aborted (core dumped)

I also tried a Jupyter console; after a few seconds I get this message:

Kernel Restarting
The kernel appears to have died. It will restart automatically.

I am working with Ubuntu 20.04 (kernel version 5.4.0-67-generic), Julia version 1.4.1, and FlashWeave v0.18.0 (#master).

insufficient number of observations

Hi Janko,

I keep getting "Automatically setting 'n_obs_min' to 20 for enhanced reliability
ERROR: Dataset has an insufficient number of observations, need at least 20 ('n_obs_min') for reliable tests" and nothing else when running:

julia> data_path = "flashweave_abundance.csv"
julia> meta_data_path = "flashweave_metadata.csv"
julia> netw_results = learn_network(data_path, meta_data_path, sensitive=true, heterogeneous=false)

I managed to address the "Try using a smaller 'max_k' parameter (at the cost of higher numbers of indirect associations)" message by setting max_k=0, but I'm not sure what I would be losing if I stick to the univariate mode.

I followed the main recommendations from other issues:

]up FlashWeave
]test FlashWeave

and confirmed that I'm working with the latest version of the tool and that tests passed.

My dataset is small: six samples (rows in both files) with 47 OTUs (columns in data_path) and 17 environmental measurements (columns in meta_data_path; these can be integer, float, or categorical).

I tried toggling the 'heterogeneous', 'sensitive', and 'header' booleans, with no better outcome. I also tried reducing n_obs_min to 6 (the number of rows) and to -1 (for automatic threshold choice), but it returns the same error.

Could you point me to a parameter I could look into to overcome this issue?

Thanks,
constanza

Non Categorical Meta Variables

Is it possible to use FlashWeave with non-categorical meta variables such as temperature?
When I provide a column like the one below, the variable is one-hot encoded.

Meta_Temp
40.000
42.000
43.000
43.000
43.000
36.000
36.000

If I provide it without the decimal point, it is not one-hot encoded.

first test with test set

I get this error when I try FlashWeave. Any idea how to fix it?

julia> data_path = "HMP_SRA_gut_tiny.tsv"
"HMP_SRA_gut_tiny.tsv"

julia> meta_data_path = "HMP_SRA_gut_tiny_meta.tsv"
"HMP_SRA_gut_tiny_meta.tsv"

julia> netw_results = learn_network(data_path, meta_data_path, sensitive=true, heterogeneous=false)

### Loading data ###

ERROR: MethodError: Cannot `convert` an object of type SubString{String} to an object of type Float64
Closest candidates are:
  convert(::Type{T}, ::T) where T<:Number at number.jl:6
  convert(::Type{T}, ::Number) where T<:Number at number.jl:7
  convert(::Type{T}, ::Base.TwicePrecision) where T<:Number at twiceprecision.jl:250
  ...
Stacktrace:
 [1] setindex!(::Array{Float64,2}, ::SubString{String}, ::Int64) at ./array.jl:826
 [2] copyto!(::Array{Float64,2}, ::Array{Any,2}) at ./multidimensional.jl:962
 [3] Array{Float64,2}(::Array{Any,2}) at ./array.jl:541
 [4] load_dlm(::String, ::String; transposed::Bool, type_data::Bool) at /home/julien/.julia/packages/FlashWeave/HO8wt/src/io.jl:172
 [5] load_data(::String, ::String; transposed::Bool, otu_data_key::String, meta_data_key::String, otu_header_key::String, meta_header_key::String) at /home/julien/.julia/packages/FlashWeave/HO8wt/src/io.jl:38
 [6] learn_network(::String, ::String; otu_data_key::String, otu_header_key::String, meta_data_key::String, meta_header_key::String, verbose::Bool, transposed::Bool, kwargs::Base.Iterators.Pairs{Symbol,Bool,Tuple{Symbol,Symbol},NamedTuple{(:sensitive, :heterogeneous),Tuple{Bool,Bool}}}) at /home/julien/.julia/packages/FlashWeave/HO8wt/src/learning.jl:297
 [7] top-level scope at REPL[4]:1


Interpreting track_rejections output

I want to explore the edges discarded from my network to determine which ones result from environmental variables, so I enabled generation of a track_rejections file when running FlashWeave (see the head of the file below). I've gathered that the Edge column gives the IDs of the two nodes connected by the rejected edge, but I am looking for clarification on the values in the Rejecting_set column. Is it correct to assume that the numbers there also represent node IDs? Any clarification appreciated!

Edge           Rejecting_set   Stat  P_value  Num_tests  Perc_tested  Df  SuffPower
4986 <-> 4657  1400,3067,1106  0.0   1.0      1          0.14286      0   true
4986 <-> 4805  1400,3067,1106  0.0   1.0      1          0.14286      0   true
4986 <-> 3626  1400,3067,1106  0.0   1.0      1          0.14286      0   true
4986 <-> 4636  1400,3067,1106  0.0   1.0      1          0.14286      0   true
4986 <-> 2802  1400,3067,1106  0.0   1.0      1          0.14286      0   true

How to extract the environmentally driven edges identified by FlashWeave?

Hi,
I'm stuck with the following two issues.
Q1: Which edges in the FlashWeave results are environmentally driven? Neither the learn_network() help nor the results give a clear indication.
We can get the direct network by setting max_k=3, or the univariate network (including direct and indirect edges) by setting max_k=0. As I understand it, the edges that do not appear in the direct network should include both spurious co-occurrences between microbes and environmentally driven associations. However, I could not find any information on how to distinguish between these two groups in the corresponding article or the method documentation. Is it possible for FlashWeave to extract environmentally driven edges?

Q2: FlashWeave runs normally when 'detailed' is not set, but I get the following error when setting 'detailed=true'.
`julia> netw_results = learn_network(data_path, meta_path, sensitive=true, heterogeneous=false, alpha=0.001,max_k=3,detailed=true)

Can you help me? Thanks a lot.

Loading data

Normalizing

Removing variables with 0 variance (or equivalently 1 level) and samples with 0 reads
-> no samples or variables discarded

Normalization

Removing meta variables with 0 variance (or equivalently 1 level)
-> no samples or variables discarded

Learning interactions

Inferring network with FlashWeave - sensitive (conditional)

Run information:
sensitive - true
heterogeneous - false
max_k - 3
alpha - 0.001
sparse - false
workers - 1
OTUs - 42344
MVs - 18

Automatically setting 'n_obs_min' to 20 for enhanced reliability
Computing univariate associations

Univariate degree stats:
Summary Stats:
Length: 42362
Missing Count: 0
Mean: 738.558425
Minimum: 3.000000
1st Quartile: 382.000000
Median: 586.000000
3rd Quartile: 855.000000
Maximum: 4754.000000

Starting conditioning search

Preparing workers..

Done. Starting inference..

Exception occurred on worker 1:
MethodError: no method matching check_candidate!(::Int64, ::Int64, ::Matrix{Float32}, ::Vector{Int64}, ::OrderedCollections.OrderedDict{Int64, Tuple{Float64, Float64}}, ::FlashWeave.FzTestCond{Float32, FlashWeave.NoNz}, ::Int64, ::Float64, ::Int64, ::Int64, ::Int64, ::Int64, ::Dict{Int64, Tuple{Tuple{Int64, Vararg{Int64}}, FlashWeave.TestResult, Tuple{Int64, Float64}}}, ::Bool, ::Vector{Int32}, ::Char, ::Bool, ::OrderedCollections.OrderedDict{Int64, Tuple{Float64, Float64}}; detailed=true)
Closest candidates are:
check_candidate!(::Int64, ::Int64, ::AbstractMatrix{ElType}, ::Vector{Int64}, ::OrderedCollections.OrderedDict{Int64, Tuple{Float64, Float64}}, ::TestType, ::Integer, ::AbstractFloat, ::Integer, ::Integer, ::Integer, ::Integer, ::Dict{Int64, Tuple{Tuple{Int64, Vararg{Int64}}, FlashWeave.TestResult, Tuple{Int64, Float64}}}, ::Bool, ::Vector{DiscType}, ::Char, ::Bool, ::OrderedCollections.OrderedDict{Int64, Tuple{Float64, Float64}}; bnb, cut_test_branches) where {ElType<:Real, DiscType<:Integer, TestType<:FlashWeave.AbstractTest} at ~/.julia/packages/FlashWeave/UprmH/src/hiton.jl:80 got unsupported keyword argument "detailed"
Stacktrace:
[1] kwerr(::NamedTuple{(:detailed,), Tuple{Bool}}, ::Function, ::Int64, ::Int64, ::Matrix{Float32}, ::Vector{Int64}, ::OrderedCollections.OrderedDict{Int64, Tuple{Float64, Float64}}, ::FlashWeave.FzTestCond{Float32, FlashWeave.NoNz}, ::Int64, ::Float64, ::Int64, ::Int64, ::Int64, ::Int64, ::Dict{Int64, Tuple{Tuple{Int64, Vararg{Int64}}, FlashWeave.TestResult, Tuple{Int64, Float64}}}, ::Bool, ::Vector{Int32}, ::Char, ::Bool, ::OrderedCollections.OrderedDict{Int64, Tuple{Float64, Float64}})
@ Base ./error.jl:165
[2] hiton_backend(T::Int64, candidates::Vector{Int64}, data::Matrix{Float32}, test_obj::FlashWeave.FzTestCond{Float32, FlashWeave.NoNz}, max_k::Int64, alpha::Float64, hps::Int64, n_obs_min::Int64, max_tests::Int64, prev_accepted_dict::OrderedCollections.OrderedDict{Int64, Tuple{Float64, Float64}}, candidates_unchecked::Vector{Int64}, time_limit::Float64, start_time::Float64, debug::Int64, whitelist::Set{Int64}, blacklist::Set{Int64}, rej_dict::Dict{Int64, Tuple{Tuple{Int64, Vararg{Int64}}, FlashWeave.TestResult, Tuple{Int64, Float64}}}, track_rejections::Bool, z::Vector{Int32}, phase::Char; fast_elim::Bool, no_red_tests::Bool, support_dict::OrderedCollections.OrderedDict{Int64, Tuple{Float64, Float64}}, kwargs::Base.Pairs{Symbol, Bool, Tuple{Symbol}, NamedTuple{(:detailed,), Tuple{Bool}}})
@ FlashWeave ~/.julia/packages/FlashWeave/UprmH/src/hiton.jl:138
[3] interleaving_phase(::Int64, ::Vararg{Any}; add_initial_candidate::Bool, univar_nbrs::OrderedCollections.OrderedDict{Int64, Tuple{Float64, Float64}}, kwargs::Base.Pairs{Symbol, Bool, Tuple{Symbol}, NamedTuple{(:detailed,), Tuple{Bool}}})
@ FlashWeave ~/.julia/packages/FlashWeave/UprmH/src/hiton.jl:155
[4] si_HITON_PC(T::Int64, data::Matrix{Float32}, levels::Vector{Int32}, max_vals::Vector{Int32}, cor_mat::Matrix{Float32}; test_name::String, max_k::Int64, alpha::Float64, hps::Int64, n_obs_min::Int64, max_tests::Int64, fast_elim::Bool, no_red_tests::Bool, FDR::Bool, weight_type::String, whitelist::Set{Int64}, blacklist::Set{Int64}, univar_nbrs::OrderedCollections.OrderedDict{Int64, Tuple{Float64, Float64}}, prev_state::FlashWeave.HitonState{Int64}, debug::Int64, time_limit::Float64, track_rejections::Bool, cache_pcor::Bool, kwargs::Base.Pairs{Symbol, Bool, Tuple{Symbol}, NamedTuple{(:detailed,), Tuple{Bool}}})
@ FlashWeave ~/.julia/packages/FlashWeave/UprmH/src/hiton.jl:338
[5] interleaved_worker(data::Matrix{Float32}, levels::Vector{Int32}, max_vals::Vector{Int32}, cor_mat::Matrix{Float32}, edge_rule::String, nonsparse_cond::Bool, shared_job_q::Distributed.RemoteChannel{Channel{Tuple}}, shared_result_q::Distributed.RemoteChannel{Channel{Tuple}}, GLL_args::Dict{Symbol, Any})
@ FlashWeave ~/.julia/packages/FlashWeave/UprmH/src/interleaved.jl:30
[6] (::FlashWeave.var"#122#126"{String, Bool, Matrix{Float32}, Vector{Int32}, Vector{Int32}, Matrix{Float32}, Dict{Symbol, Any}, Distributed.RemoteChannel{Channel{Tuple}}, Distributed.RemoteChannel{Channel{Tuple}}})()
@ FlashWeave /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/Distributed/src/macros.jl:83
[7] #invokelatest#2
@ ./essentials.jl:729 [inlined]
[8] invokelatest
@ ./essentials.jl:726 [inlined]
[9] #153
@ /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/Distributed/src/remotecall.jl:425 [inlined]
[10] run_work_thunk(thunk::Distributed.var"#153#154"{FlashWeave.var"#122#126"{String, Bool, Matrix{Float32}, Vector{Int32}, Vector{Int32}, Matrix{Float32}, Dict{Symbol, Any}, Distributed.RemoteChannel{Channel{Tuple}}, Distributed.RemoteChannel{Channel{Tuple}}}, Tuple{}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, print_error::Bool)
@ Distributed /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:70
[11] run_work_thunk(rv::Distributed.RemoteValue, thunk::Function)
@ Distributed /Applications/Julia-1.8.app/Contents/Resources/julia/share/julia/stdlib/v1.8/Distributed/src/process_messages.jl:79
[12] (::Distributed.var"#100#102"{Distributed.RemoteValue, Distributed.var"#153#154"{FlashWeave.var"#122#126"{String, Bool, Matrix{Float32}, Vector{Int32}, Vector{Int32}, Matrix{Float32}, Dict{Symbol, Any}, Distributed.RemoteChannel{Channel{Tuple}}, Distributed.RemoteChannel{Channel{Tuple}}}, Tuple{}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}})()
@ Distributed ./task.jl:484

ERROR: "Interleaved error (see stacktrace above)"
Stacktrace:
[1] interleaved_backend(target_vars::Vector{Int64}, data::Matrix{Float32}, all_univar_nbrs::Dict{Int64, OrderedCollections.OrderedDict{Int64, Tuple{Float64, Float64}}}, levels::Vector{Int32}, max_vals::Vector{Int32}, cor_mat::Matrix{Float32}, GLL_args::Dict{Symbol, Any}; update_interval::Float64, variable_ids::Vector{String}, meta_variable_mask::Nothing, convergence_threshold::Float64, conv_check_start::Float64, conv_time_step::Float64, parallel::String, edge_rule::String, edge_merge_fun::typeof(FlashWeave.maxweight), nonsparse_cond::Bool, verbose::Bool, workers_local::Bool, feed_forward::Bool, kill_remote_workers::Bool)
@ FlashWeave ~/.julia/packages/FlashWeave/UprmH/src/interleaved.jl:161
[2] infer_conditional_neighbors(target_vars::Vector{Int64}, data::Matrix{Float32}, all_univar_nbrs::Dict{Int64, OrderedCollections.OrderedDict{Int64, Tuple{Float64, Float64}}}, levels::Vector{Int32}, max_vals::Vector{Int32}, cor_mat::Matrix{Float32}, parallel::String, nonsparse_cond::Bool, recursive_pcor::Bool, cont_type::DataType, verbose::Bool, hiton_kwargs::Dict{Symbol, Any}, interleaved_kwargs::Dict{Symbol, Any})
@ FlashWeave ~/.julia/packages/FlashWeave/UprmH/src/learning.jl:148
[3] learn_graph_structure(target_vars::Vector{Int64}, data::Matrix{Float32}, all_univar_nbrs::Dict{Int64, OrderedCollections.OrderedDict{Int64, Tuple{Float64, Float64}}}, levels::Vector{Int32}, max_vals::Vector{Int32}, cor_mat::Matrix{Float32}, parallel::String, recursive_pcor::Bool, cont_type::DataType, time_limit::Float64, nonsparse_cond::Bool, verbose::Bool, track_rejections::Bool, hiton_kwargs::Dict{Symbol, Any}, interleaved_kwargs::Dict{Symbol, Any})
@ FlashWeave ~/.julia/packages/FlashWeave/UprmH/src/learning.jl:176
[4] LGL(data::Matrix{Float32}; test_name::String, max_k::Int64, alpha::Float64, hps::Int64, n_obs_min::Int64, max_tests::Int64, convergence_threshold::Float64, FDR::Bool, parallel::String, fast_elim::Bool, no_red_tests::Bool, weight_type::String, edge_rule::String, nonsparse_cond::Bool, verbose::Bool, update_interval::Float64, edge_merge_fun::typeof(FlashWeave.maxweight), tmp_folder::String, debug::Int64, time_limit::Float64, header::Vector{String}, meta_variable_mask::Nothing, dense_cor::Bool, recursive_pcor::Bool, cache_pcor::Bool, correct_reliable_only::Bool, feed_forward::Bool, track_rejections::Bool, all_univar_nbrs::Nothing, kill_remote_workers::Bool, workers_local::Bool, kwargs::Base.Pairs{Symbol, Bool, Tuple{Symbol}, NamedTuple{(:detailed,), Tuple{Bool}}})
@ FlashWeave ~/.julia/packages/FlashWeave/UprmH/src/learning.jl:255
[5] macro expansion
@ ./timing.jl:463 [inlined]
[6] learn_network(data::Matrix{Any}; sensitive::Bool, heterogeneous::Bool, max_k::Int64, alpha::Float64, conv::Float64, header::Vector{String}, meta_mask::BitVector, feed_forward::Bool, fast_elim::Bool, normalize::Bool, track_rejections::Bool, verbose::Bool, transposed::Bool, prec::Int64, make_sparse::Bool, make_onehot::Bool, max_tests::Int64, hps::Int64, FDR::Bool, n_obs_min::Int64, cache_pcor::Bool, time_limit::Float64, update_interval::Float64, parallel_mode::String, extra_data::Nothing, share_data::Bool, experimental_kwargs::Base.Pairs{Symbol, Bool, Tuple{Symbol}, NamedTuple{(:detailed,), Tuple{Bool}}})
@ FlashWeave ~/.julia/packages/FlashWeave/UprmH/src/learning.jl:559
[7] learn_network(data_path::String, meta_data_path::String; otu_data_key::String, otu_header_key::String, meta_data_key::String, meta_header_key::String, verbose::Bool, transposed::Bool, kwargs::Base.Pairs{Symbol, Real, NTuple{5, Symbol}, NamedTuple{(:sensitive, :heterogeneous, :alpha, :max_k, :detailed), Tuple{Bool, Bool, Float64, Int64, Bool}}})
@ FlashWeave ~/.julia/packages/FlashWeave/UprmH/src/learning.jl:347
[8] top-level scope
@ REPL[39]:1

julia>`

ArgumentError: number of rows of each array must match (got (46, 47))

Hi,

Thank you for developing this tool.

I ran into this error during the normalization step:

┌ Warning: 1 factor variable with more than two categories were detected (cul_type), splitting it into separate dummy variables (One Hot)
└ @ FlashWeave C:\Users\Xiaoping.julia\packages\FlashWeave\16aV0\src\preprocessing.jl:63
┌ Warning: adaptive pseudo-counts for 1 samples were lower than machine precision due to insufficient counts, removing them
└ @ FlashWeave C:\Users\Xiaoping.julia\packages\FlashWeave\16aV0\src\preprocessing.jl:125

ArgumentError: number of rows of each array must match (got (46, 47)).

Shouldn't the program handle this internally?

The call I used: learn_network(otu, meta, sensitive=true, heterogeneous=false, normalize=true, alpha=0.05, FDR=true)

Thanks

insufficient number of observations error

Hello,
I'm testing FlashWeave with a small subset of my data: two samples, each with over 400 OTUs, plus 8 metadata variables. I keep getting the error "ERROR: Dataset has an insufficient number of observations, need at least 20 ('n_obs_min') for reliable tests. Try using a smaller 'max_k' parameter (at the cost of higher numbers of indirect associations)."
My Run information is:
sensitive - true
heterogeneous - false
max_k - 3
alpha - 0.01
sparse - false
workers - 1
OTUs - 2
MVs - 8
So it reports 2 OTUs instead of 2 samples and hundreds of OTUs. I tried transposed=true, I tried manually transposing my data, and I tried adjusting max_k and n_obs_min as the message suggests, but I keep getting the same error.
I generated the relative OTU counts table with mOTUs2.
Can someone help me please? Thanks so much.
E
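For reports like this one, a quick look at the table's dimensions before calling learn_network can help tell an orientation problem from a sample-count problem. A minimal sketch, assuming a tab-separated table with a header row and one row per sample (per the learn_network help, rows are samples unless transposed=true); the toy file and its contents are made up for illustration and do not come from the issue:

```julia
using DelimitedFiles

# Hypothetical toy table: one row per sample, one column per OTU,
# OTU names in the header row.
path = joinpath(mktempdir(), "toy_otus.tsv")
write(path, "OTU1\tOTU2\tOTU3\n5\t0\t2\n1\t7\t0\n")

# Read the numeric block and the header separately
counts, header = readdlm(path, '\t', Float64; header=true)

# With rows as samples, the first dimension is the number of observations
n_samples, n_otus = size(counts)
println("samples (rows): ", n_samples, ", OTUs (columns): ", n_otus)
```

If the first number is the OTU count rather than the sample count, the table is stored the other way round. Note that with only two samples the number of observations stays below the 'n_obs_min' of 20 quoted in the error message, whichever orientation is used.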

Using metadata with normalize = false fails

Hi!
I'm trying to run a custom CLR normalization on my data and then trying to use FlashWeave to infer a network. It works, except that when I try to add the metadata it fails. I was able to reproduce the issue using the test data in the repo:

using FlashWeave
data_global = "test/data/HMP_SRA_gut/HMP_SRA_gut_tiny.tsv"
meta_path = "test/data/HMP_SRA_gut/HMP_SRA_gut_tiny_meta.tsv"

net_res_global = learn_network(data_global, meta_path, n_obs_min=3, normalize=true)  # works
net_res_global = learn_network(data_global, meta_path, n_obs_min=3, normalize=false) # fails

Any ideas on the issue?

metadata mapping by sample name

Although not explicit in the README, it appears that the metadata table must have the same sample order as the sample x feature table. For the sake of users who want to make sure that the correct samples are mapped to the correct metadata, it would be helpful to allow users to include a Sample column in the metadata table that will map the metadata to the correct sample in the sample x feature table.
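Until something like this is supported, the reordering can be done by hand before the tables are passed on. A minimal sketch in plain Julia, assuming the sample names of both tables are available as string vectors; align_metadata and the toy data are hypothetical and not part of FlashWeave:

```julia
# Hypothetical helper: reorder metadata rows so they follow the sample
# order of the sample x feature table.
function align_metadata(otu_samples::Vector{String},
                        meta_samples::Vector{String},
                        meta::AbstractMatrix)
    size(meta, 1) == length(meta_samples) ||
        throw(ArgumentError("expected one metadata row per sample name"))

    # Map each metadata sample name to its row index
    row_of = Dict(name => i for (i, name) in enumerate(meta_samples))
    length(row_of) == length(meta_samples) ||
        throw(ArgumentError("duplicate sample names in metadata"))

    # Refuse to continue if a sample has no metadata row
    missing_names = [s for s in otu_samples if !haskey(row_of, s)]
    isempty(missing_names) ||
        throw(ArgumentError("samples without metadata: $(join(missing_names, ", "))"))

    # Select the metadata rows in the order of the OTU table
    return meta[[row_of[s] for s in otu_samples], :]
end

# Toy example: the metadata rows are stored in a different order
otu_samples  = ["S1", "S2", "S3"]
meta_samples = ["S3", "S1", "S2"]
meta = ["c" 30.0; "a" 10.0; "b" 20.0]   # rows follow meta_samples

aligned = align_metadata(otu_samples, meta_samples, meta)
```

The helper throws instead of silently dropping samples, so a mismatch between the two tables is noticed before network inference starts.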

MethodError: no method matching nv(::SimpleWeightedGraphs.SimpleWeightedGraph{Int64, Float64})

Hi all,

I'm getting an error while testing FlashWeave 0.18 on Julia 1.6.2. I downloaded HMP_SRA_gut_small.tsv and simply ran

julia> using FlashWeave
julia> data_path = "HMP_SRA_gut_small.tsv"
julia> net_learn = learn_network(data_path)

The inference finished, but displaying the resulting network failed. Any ideas what might be going on?

Best,
Marko

PS. The full Stacktrace:

Network:
Error showing value of type FlashWeave.FWResult{Int64}:
ERROR: MethodError: no method matching nv(::SimpleWeightedGraphs.SimpleWeightedGraph{Int64, Float64})
Closest candidates are:
  nv(::LightGraphs.SimpleGraphs.AbstractSimpleGraph{T}) where T at /root/.julia/packages/LightGraphs/IgJif/src/SimpleGraphs/SimpleGraphs.jl:57
  nv(::LightGraphs.AbstractGraph) at /root/.julia/packages/LightGraphs/IgJif/src/interface.jl:141
Stacktrace:
  [1] show(io::IOContext{Base.TTY}, result::FlashWeave.FWResult{Int64})
    @ FlashWeave ~/.julia/packages/FlashWeave/464SQ/src/types.jl:253
  [2] show(io::IOContext{Base.TTY}, #unused#::MIME{Symbol("text/plain")}, x::FlashWeave.FWResult{Int64})
    @ Base.Multimedia ./multimedia.jl:47
  [3] (::REPL.var"#38#39"{REPL.REPLDisplay{REPL.LineEditREPL}, MIME{Symbol("text/plain")}, Base.RefValue{Any}})(io::Any)
    @ REPL /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:220
  [4] with_repl_linfo(f::Any, repl::REPL.LineEditREPL)
    @ REPL /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:462
  [5] display(d::REPL.REPLDisplay, mime::MIME{Symbol("text/plain")}, x::Any)
    @ REPL /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:213
  [6] display(d::REPL.REPLDisplay, x::Any)
    @ REPL /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:225
  [7] display(x::Any)
    @ Base.Multimedia ./multimedia.jl:328
  [8] #invokelatest#2
    @ ./essentials.jl:708 [inlined]
  [9] invokelatest
    @ ./essentials.jl:706 [inlined]
 [10] print_response(errio::IO, response::Any, show_value::Bool, have_color::Bool, specialdisplay::Union{Nothing, AbstractDisplay})
    @ REPL /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:247
 [11] (::REPL.var"#40#41"{REPL.LineEditREPL, Pair{Any, Bool}, Bool, Bool})(io::Any)
    @ REPL /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:231
 [12] with_repl_linfo(f::Any, repl::REPL.LineEditREPL)
    @ REPL /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:462
 [13] print_response(repl::REPL.AbstractREPL, response::Any, show_value::Bool, have_color::Bool)
    @ REPL /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:229
 [14] (::REPL.var"#do_respond#61"{Bool, Bool, REPL.var"#72#82"{REPL.LineEditREPL, REPL.REPLHistoryProvider}, REPL.LineEditREPL, REPL.LineEdit.Prompt})(s::REPL.LineEdit.MIState, buf::Any, ok::Bool)
    @ REPL /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:798
 [15] #invokelatest#2
    @ ./essentials.jl:708 [inlined]
 [16] invokelatest
    @ ./essentials.jl:706 [inlined]
 [17] run_interface(terminal::REPL.Terminals.TextTerminal, m::REPL.LineEdit.ModalInterface, s::REPL.LineEdit.MIState)
    @ REPL.LineEdit /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/REPL/src/LineEdit.jl:2441
 [18] run_frontend(repl::REPL.LineEditREPL, backend::REPL.REPLBackendRef)
    @ REPL /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/REPL/src/REPL.jl:1126
 [19] (::REPL.var"#44#49"{REPL.LineEditREPL, REPL.REPLBackendRef})()
    @ REPL ./task.jl:411

Use of relative abundances

I have a table generated by the mOTUs program with relative abundances. Can I use FlashWeave or FlashWeaveHE with relative abundances (range 0 to 1)?

KeyError: key FlashWeave [2be3f83a-7913-5748-9f20-7d448995b934] not found

If I run the example code in a script on the HMP_SRA_gut_small.tsv dataset (with a couple of random test metadata columns: HMP_SRA_gut_small_meta.tsv), I get the following error when running learn_network() with more than one process set via addprocs():

### Loading data ###

Inferring network with FlashWeave - sensitive (conditional)

	Run information:
	sensitive - true
	heterogeneous - false
	max_k - 3
	alpha - 0.01
	sparse - false
	workers - 4
	OTUs - 50
	MVs - 2

### Normalizing ###

Removing variables with 0 variance (or equivalently 1 level) and samples with 0 reads
Discarded 5 samples and 0 variables.

Normalization

### Learning interactions ###

Setting 'time_limit' to 6.0 s.
Automatically setting 'n_obs_min' to 20 for enhanced reliability.
Computing univariate associations..
ERROR: LoadError: On worker 2:
KeyError: key FlashWeave [2be3f83a-7913-5748-9f20-7d448995b934] not found
Stacktrace:
  [1] getindex
    @ ./dict.jl:482 [inlined]
  [2] root_module
    @ ./loading.jl:957 [inlined]
  [3] deserialize_module
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:962
  [4] handle_deserialize
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:864
  [5] deserialize
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:782
  [6] deserialize_datatype
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:1287
  [7] handle_deserialize
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:835
  [8] deserialize
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:782
  [9] handle_deserialize
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:842
 [10] deserialize
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:782 [inlined]
 [11] deserialize_msg
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/messages.jl:87
 [12] #invokelatest#2
    @ ./essentials.jl:708 [inlined]
 [13] invokelatest
    @ ./essentials.jl:706 [inlined]
 [14] message_handler_loop
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/process_messages.jl:169
 [15] process_tcp_streams
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/process_messages.jl:126
 [16] #99
    @ ./task.jl:411
Stacktrace:
  [1] #remotecall_fetch#143
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/remotecall.jl:394 [inlined]
  [2] remotecall_fetch(::Function, ::Distributed.Worker)
    @ Distributed /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/remotecall.jl:386
  [3] remotecall_fetch(::Function, ::Int64; kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Distributed /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/remotecall.jl:421
  [4] remotecall_fetch
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/remotecall.jl:421 [inlined]
  [5] workers_all_local()
    @ FlashWeave ~/.julia/packages/FlashWeave/464SQ/src/misc.jl:90
  [6] prepare_univar_results(data::Matrix{Float32}, test_name::String, alpha::Float64, hps::Int64, n_obs_min::Int64, FDR::Bool, levels::Vector{Int32}, parallel::String, cor_mat::Matrix{Float32}, correct_reliable_only::Bool, verbose::Bool, tmp_folder::String)
    @ FlashWeave ~/.julia/packages/FlashWeave/464SQ/src/learning.jl:86
  [7] LGL(data::Matrix{Float32}; test_name::String, max_k::Int64, alpha::Float64, hps::Int64, n_obs_min::Int64, max_tests::Int64, convergence_threshold::Float64, FDR::Bool, parallel::String, fast_elim::Bool, no_red_tests::Bool, weight_type::String, edge_rule::String, nonsparse_cond::Bool, verbose::Bool, update_interval::Float64, edge_merge_fun::typeof(FlashWeave.maxweight), tmp_folder::String, debug::Int64, time_limit::Float64, header::Vector{String}, meta_variable_mask::Nothing, dense_cor::Bool, recursive_pcor::Bool, cache_pcor::Bool, correct_reliable_only::Bool, feed_forward::Bool, track_rejections::Bool, all_univar_nbrs::Nothing, kill_remote_workers::Bool, kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ FlashWeave ~/.julia/packages/FlashWeave/464SQ/src/learning.jl:229
  [8] macro expansion
    @ ./timing.jl:368 [inlined]
  [9] learn_network(data::Matrix{Any}; sensitive::Bool, heterogeneous::Bool, max_k::Int64, alpha::Float64, conv::Float64, header::Vector{String}, meta_mask::BitVector, feed_forward::Bool, fast_elim::Bool, normalize::Bool, track_rejections::Bool, verbose::Bool, transposed::Bool, prec::Int64, make_sparse::Bool, make_onehot::Bool, max_tests::Int64, hps::Int64, FDR::Bool, n_obs_min::Int64, cache_pcor::Bool, time_limit::Float64, update_interval::Float64, parallel_mode::String)
    @ FlashWeave ~/.julia/packages/FlashWeave/464SQ/src/learning.jl:447
 [10] learn_network(data_path::String, meta_data_path::String; otu_data_key::String, otu_header_key::String, meta_data_key::String, meta_header_key::String, verbose::Bool, transposed::Bool, kwargs::Base.Iterators.Pairs{Symbol, Bool, Tuple{Symbol, Symbol}, NamedTuple{(:sensitive, :heterogeneous), Tuple{Bool, Bool}}})
    @ FlashWeave ~/.julia/packages/FlashWeave/464SQ/src/learning.jl:316
 [11] top-level scope
    @ /ebio/abt3_projects/software/dev/FlashWeave.jl/sandbox.jl:48
in expression starting at /ebio/abt3_projects/software/dev/FlashWeave.jl/sandbox.jl:48

This appears to occur if I call addprocs() after the 'using FlashWeave' statement has already run.
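Given that observation, the ordering that should avoid the KeyError is to start the workers first and only then load the package on every process. A minimal sketch, assuming FlashWeave is installed in the active environment; the worker count of 2 is arbitrary:

```julia
using Distributed

addprocs(2)                   # 1) start the worker processes first
@everywhere using FlashWeave  # 2) then load the package on the master and on all workers

# learn_network(...) can be called from here on; workers added before the
# package was loaded everywhere should be able to resolve the FlashWeave module.
```

Workers that are added after FlashWeave was loaded on the master alone do not have the module available, which would be consistent with the "key FlashWeave [...] not found" message raised on worker 2 above.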
