
linearsegmentation.jl's People

Contributors

shayandavoodii, stelmo

linearsegmentation.jl's Issues

Failing Case with Simple Data

Run the following script:

# External
using LinearSegmentation;
using StableRNGs;
using UnicodePlots;

## Constants & Configuration

oRng = StableRNG(123);

## Functions

function Conv1D( vA :: Vector{T}, vB :: Vector{T}; convMode :: String = "full" ) :: Vector{T} where {T <: Real}

    lenA = length(vA);
    lenB = length(vB);

    if (convMode == "full")
        startIdx    = 1;
        endIdx      = lenA + lenB - 1;
    elseif (convMode == "same")
        startIdx    = 1 + floor(Int, lenB / 2);
        endIdx      = startIdx + lenA - 1;
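    elseif (convMode == "valid")
        startIdx    = lenB;
        endIdx      = lenA;
    else
        # Fail early instead of hitting an undefined `startIdx` below
        throw(ArgumentError("convMode must be \"full\", \"same\" or \"valid\""));
    end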

    vO = zeros(T, lenA + lenB - 1);

    for idxB in 1:lenB
        @simd for idxA in 1:lenA
            @inbounds vO[idxA + idxB - 1] += vA[idxA] * vB[idxB];
        end
    end

    return vO[startIdx:endIdx];
end

## Parameters
# Data
numSamples    = 500;
ampFiltSize   = 20;
phaseFiltSize = 50;

vSeg = [1, 151, 301, 401, 501];

# Model
minSegLen = 30.0;
maxRmse   = 0.12;

## Load / Generate Data

vAmp    = rand(oRng, numSamples);
vAmp    = 0.2 * Conv1D(vAmp, ones(ampFiltSize) / ampFiltSize; convMode = "same");
vPhase  = 0.2 * rand(oRng, numSamples);
vPhase  = Conv1D(vPhase, ones(phaseFiltSize) / phaseFiltSize; convMode = "same");
vPhase  = cumsum(vPhase);

vX = LinRange(0, numSamples - 1, numSamples);

vC = vAmp .* cos.(2 * pi * vPhase);
vL = zeros(numSamples);
vL[vSeg[1]:(vSeg[2] - 1)] .= 0;
vL[vSeg[2]:(vSeg[3] - 1)] .= 1;
vL[vSeg[3]:(vSeg[4] - 1)] .= collect(LinRange(0.5, 1.0, 100));
vL[vSeg[4]:(vSeg[5] - 1)] .= collect(LinRange(1.0, 0.4, 100));

vY = vC .+ vL;

## Display Data

ii = 1;
vIdx = vSeg[ii]:(vSeg[ii + 1] - 1)

hP = scatterplot(vX[vIdx], vY[vIdx], width = 90, height = 8, xlim = (vX[1], vX[end]), ylim = (minimum(vY), maximum(vY)));

for ii in 2:(length(vSeg) - 1)
    local vIdx = vSeg[ii]:(vSeg[ii + 1] - 1)
    scatterplot!(hP, vX[vIdx], vY[vIdx]);
end

title!(hP, "Input Data");
xlabel!(hP, "Index");
ylabel!(hP, "Value");
display(hP);

## Analysis
segs = shortest_path(vX, vY; min_segment_length = minSegLen, fit_function = :rmse, fit_threshold = maxRmse);

# Remove the 1st item which is shared
for ii in 2:length(segs)
    deleteat!(segs[ii][1], 1);
end

## Display Results
hP = scatterplot(segs[1][1], vY[segs[1][1]], width = 90, height = 8, xlim = (vX[1], vX[end]), ylim = (minimum(vY), maximum(vY)));

for ii in 2:length(segs)
    scatterplot!(hP, segs[ii][1], vY[segs[ii][1]]);
end

title!(hP, "Linear Segmentation");
xlabel!(hP, "Index");
ylabel!(hP, "Value");
display(hP);

This is the data:

[image: plot of the input data]

Basically a harmonic signal riding on a DC level and on linear functions.
This should be an easy case.

The output is:

[image: plot of the segmentation output]

Look at the bottom left of the first linear (rising) section: do you see the magenta points?
The segments are:

Segment 1 is 1:150
Segment 2 is 151:301
Segment 3 is 302:394
Segment 4 is 395:500

Some overlap between segments 3 and 4 makes some sense, but between 2 and 3?
Sample 301 is the first sample of the rising section, yet it is assigned to segment 2.
Under no RMSE should that be an improvement.

I think it has to do with the shortest-path formulation vs. the original idea of interval partitioning.
Yet I'm not sure.
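One way to probe the 2/3 boundary is to compare the fit error of the expected segment (151:300) with the reported one (151:301). The sketch below does this on the noise-free trend only (the harmonic component is left out), and `segment_rmse` is a hypothetical helper, not a function of LinearSegmentation.jl. It assumes RMSE is the root of the mean squared residual of an ordinary least-squares line; the package may define it differently.

```julia
# Noise-free trend from the script above (harmonic component omitted).
vX = collect(0.0:499.0)
vL = zeros(500)
vL[151:300] .= 1.0
vL[301:400] .= LinRange(0.5, 1.0, 100)
vL[401:500] .= LinRange(1.0, 0.4, 100)

# Hypothetical helper: RMSE of an ordinary least-squares line over `idx`.
function segment_rmse(x, y, idx)
    A = [ones(length(idx)) x[idx]]   # design matrix: intercept and slope columns
    beta = A \ y[idx]                # least-squares solution
    res = y[idx] .- A * beta         # residuals of the fitted line
    return sqrt(sum(abs2, res) / length(idx))
end

rmseExpected = segment_rmse(vX, vL, 151:300)   # flat section only
rmseReported = segment_rmse(vX, vL, 151:301)   # flat section plus sample 301

println("RMSE 151:300 = ", rmseExpected)
println("RMSE 151:301 = ", rmseReported)
```

If the second value is larger than the first but still below `maxRmse`, the threshold alone does not rule out absorbing sample 301 into segment 2, which may be part of what is observed here.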

Segments Share Sample Indices

Looking at the generated indices of the segments, I can see that they share indices (the last index of a segment is the first index of the following segment).

I'd assume the segments should be mutually exclusive.
If not, could such an option be added?
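Until such an option exists, the shared index can be dropped on the caller's side. The following is a minimal sketch, assuming the segments are available as a list of index vectors; `make_exclusive` is a hypothetical name, and assigning the shared sample to the earlier segment is only one possible convention.

```julia
# Hypothetical helper: make consecutive index sets disjoint by dropping the
# first index of a segment when it equals the last index of the previous one.
function make_exclusive(segments::AbstractVector{<:AbstractVector{<:Integer}})
    out = [collect(first(segments))]
    for seg in segments[2:end]
        s = collect(seg)
        if !isempty(s) && !isempty(out[end]) && first(s) == last(out[end])
            s = s[2:end]   # the shared sample stays with the earlier segment
        end
        push!(out, s)
    end
    return out
end

shared    = [1:3, 3:5, 5:7]
exclusive = make_exclusive(shared)
println(exclusive)
```

This returns new vectors and leaves the input untouched, unlike the `deleteat!` loop in the script above.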

Remove GLM.jl dependency

Currently, segments are tuples of indices and fitted linear models from GLM.jl. This adds a rather large dependency, which is not, strictly speaking, necessary. Consider removing it, or making it a weak dependency that is only loaded if the user manually loads GLM. Since weak dependencies require Julia 1.9, this would also justify the 1.9+ compat requirement of this package...
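For a straight-line fit, the closed-form least-squares formulas are enough, so a small struct could stand in for the GLM model. This is only a sketch under that assumption: `LineFit`, `fit_line` and `predict` are hypothetical names and not part of the package.

```julia
# Hypothetical lightweight replacement for a fitted GLM model.
struct LineFit
    intercept::Float64
    slope::Float64
end

# Closed-form simple linear regression: slope = Sxy / Sxx.
function fit_line(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
    n = length(x)
    n == length(y) || throw(DimensionMismatch("x and y must have the same length"))
    n >= 2 || throw(ArgumentError("need at least two points"))
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum(abs2, x .- mx)
    sxx > 0 || throw(ArgumentError("x must not be constant"))
    sxy = sum((x .- mx) .* (y .- my))
    slope = sxy / sxx
    return LineFit(my - slope * mx, slope)
end

# Evaluate the fitted line at a scalar or a vector of x values.
predict(f::LineFit, x) = f.intercept .+ f.slope .* x

lineFit = fit_line(0:4, [1.0, 3.0, 5.0, 7.0, 9.0])
println(lineFit)
```

A segment could then be stored as a tuple of indices and a `LineFit`, keeping the current shape of the result without the heavy dependency.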

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Add Option for Scoring based on R2

Currently, the score for the quality of the fit is the RMSE.
It would be great to have an option to use the R2 score as well.

Being normalized, it should make tweaking the hyperparameter much easier.
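As a rough illustration of the requested score, R2 can be computed from the residual and total sums of squares. The helper name `r2_score` is hypothetical, and returning `NaN` for a constant target is an assumption made here, since R2 is undefined in that case (which matters for flat segments like those in the failing case above).

```julia
# Hypothetical helper: coefficient of determination, R2 = 1 - SS_res / SS_tot.
function r2_score(y::AbstractVector{<:Real}, yhat::AbstractVector{<:Real})
    length(y) == length(yhat) || throw(DimensionMismatch("y and yhat must have the same length"))
    my = sum(y) / length(y)
    ssRes = sum(abs2, y .- yhat)   # residual sum of squares
    ssTot = sum(abs2, y .- my)     # total sum of squares around the mean
    ssTot == 0 && return NaN       # constant target: R2 is undefined (assumed convention)
    return 1 - ssRes / ssTot
end

println(r2_score([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
println(r2_score([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))
```

Unlike RMSE, a higher value is better, so a threshold on this score would act as a lower bound rather than an upper bound.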
