
Comments (7)

IanMayo avatar IanMayo commented on September 3, 2024

@dominikschauer - I've had a go at updating the description. Hopefully it makes more sense now.

from knimeinvestigation.

dmschauer avatar dmschauer commented on September 3, 2024

@IanMayo If I understand the issue correctly, it can be solved by taking a simple random sample from either usa.csv or nzl.csv and pulling the matching time stamps from the other data set. The next step would be to compute the average speed of the two vehicles in the intervals between the randomly chosen time stamps. The final step would be a chart with two line graphs (time series) showing the speeds over time.
I will implement this now, hoping it is what was asked for. I would also suggest using fixed intervals of equal length instead of random ones.
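To make the fixed-interval suggestion concrete, here is a rough sketch in plain Python rather than KNIME nodes. It assumes each record is a (seconds, speed) pair; the function name, the 60-second interval and the sample values are made up for illustration and are not taken from usa.csv or nzl.csv.

```python
from collections import defaultdict


def mean_speed_per_interval(records, interval_s):
    """Average the speed inside fixed-length time buckets.

    records: iterable of (seconds_since_start, speed) pairs.
    Returns {bucket_start_seconds: mean_speed}, ordered by bucket start.
    """
    buckets = defaultdict(list)
    for t, speed in records:
        # Integer division maps every time stamp onto the start of its bucket.
        buckets[(t // interval_s) * interval_s].append(speed)
    return {start: sum(v) / len(v) for start, v in sorted(buckets.items())}


# Hypothetical sample values, only to show the shape of the result.
usa = [(0, 10.0), (20, 12.0), (65, 14.0), (110, 16.0)]
usa_means = mean_speed_per_interval(usa, 60)
```

Running the same function over both data sets with the same interval would give two series on a shared time axis, which is what the line graph needs.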

from knimeinvestigation.

dmschauer avatar dmschauer commented on September 3, 2024

Here is a KNIME workflow that, as I understand it, does what this issue requests.

KNIME_UPWORK_sampling_and_line_graphs.zip

from knimeinvestigation.

IanMayo avatar IanMayo commented on September 3, 2024

@dominikschauer - I haven't encountered random sampling as a way to solve this kind of issue before. My guess would have been to choose the more frequently sampled dataset and use that as the "time-master", producing interpolated values for the other dataset.

In this instance I believe they're of equal frequency, so we could use either as the master.

from knimeinvestigation.

dmschauer avatar dmschauer commented on September 3, 2024

@IanMayo Yes, I think the idea of random sampling came from my poor understanding of your requirements. I thought that was what you were describing; otherwise I would not have considered it.

So, do you want to "fill in" the speed at time stamps where one of the two data sets has no record? For example, when nzl.csv has a speed for time 08:00:00 but usa.csv does not, I would take the nearest recorded measurements in usa.csv and compute their average. Like: speed_usa(08:00:00) = (speed_usa(07:59:59) + speed_usa(08:00:01) ) / 2. Then I would use this average as the interpolated value for the missing measurement and compare it to speed_nzl(08:00:00) in a line graph.

I think it would be bad practice to implement this before you give your okay. So is this what you have in mind?
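The averaging described above could look roughly like this in Python. This is only a sketch: the dictionary layout (seconds of day mapped to speed), the function name and the speed values are assumptions made for the example.

```python
def neighbour_average(series, t):
    """Estimate the value at time t from the nearest records on either side.

    series: dict mapping time (seconds of day) to speed.
    Returns the recorded value if t is present, the mean of the nearest
    earlier and later records otherwise, or None if t has no neighbour
    on one side.
    """
    if t in series:
        return series[t]
    earlier = [k for k in series if k < t]
    later = [k for k in series if k > t]
    if not earlier or not later:
        return None
    return (series[max(earlier)] + series[min(later)]) / 2


# 07:59:59 -> 28799 s, 08:00:01 -> 28801 s; the speeds are made-up values.
usa = {28799: 10.0, 28801: 12.0}
speed_usa_0800 = neighbour_average(usa, 28800)
```

A plain average only matches linear interpolation when the two neighbours are equally far from the missing time stamp, as they are in the 08:00:00 example.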

from knimeinvestigation.

IanMayo avatar IanMayo commented on September 3, 2024

@dinkoivanov - aah, now I see where the confusion about "random" came from. I was referring to the preparation of some smaller, temporary datasets for use in this issue. I suggested you produce custom versions of the data-files that contain random lines from the originals. It is these data-files that would be used to test the time-synced comparison.
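For illustration, cutting down a data-file to random lines might be sketched like this. It assumes the first line is a header row, which the thread does not confirm, and the function name and sample rows are invented for the example.

```python
import random


def sample_lines(lines, k, seed=None):
    """Keep the header plus k randomly chosen data lines, in original order.

    lines: list of text lines, with the first one assumed to be a header.
    seed: optional seed so the same subset can be reproduced.
    """
    header, rows = lines[0], lines[1:]
    rng = random.Random(seed)
    # Sample row positions, then sort them so time order is preserved.
    keep = sorted(rng.sample(range(len(rows)), k))
    return [header] + [rows[i] for i in keep]


# Made-up rows standing in for one of the original files.
original = ["time,speed"] + [f"08:00:{s:02d},{10 + s}" for s in range(10)]
subset = sample_lines(original, 4, seed=1)
```

Sampling each file separately with a different seed would leave time stamps that appear in one file but not the other, which is the situation the comparison has to handle.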

Yes your interpolation strategy is fine.

Oops, I've missed an answer. No, let's not produce a value at a time for which there isn't a measurement in both datasets. Let's just interpolate values for times that are present in one (the master) but not the other. Hope this is clear. :-)
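As a sketch of that rule, the snippet below produces values only at the master's time stamps and fills them from the other dataset. It uses linear interpolation, which is an assumption here (the thread describes a simple average of the two neighbours), and the function name and data are hypothetical.

```python
from bisect import bisect_left


def align_to_master(master_times, other):
    """Return one value per master time stamp, taken from the other dataset.

    other: list of (time, speed) pairs sorted by time.
    A recorded value is reused when the time stamps match; otherwise the
    value is linearly interpolated between the surrounding records.
    None marks master times outside the other dataset's range.
    """
    times = [t for t, _ in other]
    aligned = []
    for t in master_times:
        i = bisect_left(times, t)
        if i < len(times) and times[i] == t:
            aligned.append(other[i][1])
        elif 0 < i < len(times):
            (t0, v0), (t1, v1) = other[i - 1], other[i]
            aligned.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
        else:
            aligned.append(None)
    return aligned


# Hypothetical data: the master has four time stamps, the other dataset two.
master_times = [0, 10, 20, 30]
other = [(0, 5.0), (20, 9.0)]
aligned = align_to_master(master_times, other)
```

No extra time stamps are created: the result has exactly one entry per master time.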

from knimeinvestigation.

dmschauer avatar dmschauer commented on September 3, 2024

In this workflow USA.csv serves as the "time-master" and all missing values for NZL.csv are interpolated as described above. I found that this interpolation strategy doesn't work when the first or last value is missing. For these leading or trailing missing entries in NZL.csv, the first non-missing and last non-missing values are used respectively. For example "NA NA 2 3 4 NA NA NA" becomes "2 2 2 3 4 4 4 4".
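Outside KNIME, the edge handling can be sketched in a few lines of Python. None stands in for "NA" and the function name is made up; interior gaps are left alone here because they are covered by the interpolation step.

```python
def fill_edges(values):
    """Fill leading gaps with the first known value and trailing gaps with the last.

    values: list in which None marks a missing entry.
    Interior gaps are left untouched.
    """
    present = [i for i, v in enumerate(values) if v is not None]
    if not present:
        return list(values)
    first, last = present[0], present[-1]
    filled = list(values)
    for i in range(first):
        filled[i] = values[first]
    for i in range(last + 1, len(values)):
        filled[i] = values[last]
    return filled


# The example from the comment above, with None in place of "NA".
filled = fill_edges([None, None, 2, 3, 4, None, None, None])
```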

KNIME_UPWORK_non_time_synced_data.zip

from knimeinvestigation.
