
Comments (4)

fjxmlzn commented on July 29, 2024

Sorry for the confusion about features and attributes. You can refer to the paper for detailed explanations.

DATA FORMULATION

A sample contains features and attributes. Attributes are the values associated with the entire sample. Features are the values that occur over time.

As for your data, you can treat the data from each user as one sample, the date of the user's first record as the attribute, and the amount and tag as the features.

So,

The attribute of A0 is:

0 (assuming 2020-01-01 is the first day so that we can use integers to represent the date)

The features of A0 are:

Amount, Tag
200, green
300, blue

The attribute of A2 is:

0

The features of A2 are:

Amount, Tag
218, red
242, pink

The attribute of A3 is:

3

The features of A3 are:

Amount, Tag
38, red

Here I assume that each user's records always fall on consecutive days, so we only need to model the first day. If that is not the case, you can add the day difference between consecutive records of a user as an additional feature.
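A minimal sketch of the day-difference idea, in case it helps. The dates below are hypothetical (invented to show a gap between records), and the names `records`, `origin`, and `day_diffs` are only for illustration:

```python
from datetime import date

# Hypothetical records for one user: (date, amount, tag).
# The second date is invented to create a non-consecutive gap.
records = [
    (date(2020, 1, 1), 200, "green"),
    (date(2020, 1, 4), 300, "blue"),
]

origin = date(2020, 1, 1)  # assumed to be day 0

# Attribute: offset of the user's first record from day 0.
start_day = (records[0][0] - origin).days

# Extra feature: days elapsed since the previous record (0 for the first one).
day_diffs = [0] + [
    (curr[0] - prev[0]).days for prev, curr in zip(records, records[1:])
]

# Each time step now carries (amount, tag, day difference).
features = [
    (amount, tag, diff) for (_, amount, tag), diff in zip(records, day_diffs)
]
print(start_day, features)
```

Using 0 as the day difference of the first record is a choice made here for the sketch; any fixed convention should work as long as it is applied consistently.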

NUMPY FORMAT

Store the data in data_train.npz so that DoppelGANger can read it:

The data_attribute field in this case is a 3x1 matrix of values [[0], [0], [3]].

The data_feature field is a 3x2x5 matrix of values

[[[200, 1, 0, 0, 0], [300, 0, 1, 0, 0]], (use one-hot encoding to represent tags, in the order green, blue, red, pink)
 [[218, 0, 0, 1, 0], [242, 0, 0, 0, 1]],
 [[38, 0, 0, 1, 0], [0, 0, 0, 0, 0]]] (zero-padding after the time-series ends)

The data_gen_flag field is a 3x2 matrix of values [[1, 1], [1, 1], [1, 0]] (1 marks a real record, 0 marks padding).
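As a rough sketch, the three fields for this example could be built and saved with NumPy as follows. The field names and the file name come from the description above; the `samples` dictionary, the `one_hot` helper, and the float32 dtype are assumptions made for illustration. The values are still raw (not normalized):

```python
import numpy as np

# One-hot order used in the example: green, blue, red, pink.
tags = ["green", "blue", "red", "pink"]

def one_hot(tag):
    """Return a one-hot list for a tag (hypothetical helper)."""
    vec = [0.0] * len(tags)
    vec[tags.index(tag)] = 1.0
    return vec

# user -> (start day, list of (amount, tag) records), taken from the example.
samples = {
    "A0": (0, [(200, "green"), (300, "blue")]),
    "A2": (0, [(218, "red"), (242, "pink")]),
    "A3": (3, [(38, "red")]),
}

n = len(samples)
max_len = max(len(recs) for _, recs in samples.values())
dim = 1 + len(tags)  # amount + one-hot tag

# Arrays start as zeros, so untouched time steps are already zero-padded.
data_attribute = np.zeros((n, 1), dtype=np.float32)
data_feature = np.zeros((n, max_len, dim), dtype=np.float32)
data_gen_flag = np.zeros((n, max_len), dtype=np.float32)

for i, (start_day, recs) in enumerate(samples.values()):
    data_attribute[i, 0] = start_day
    for t, (amount, tag) in enumerate(recs):
        data_feature[i, t] = [amount] + one_hot(tag)
        data_gen_flag[i, t] = 1.0  # this time step holds a real record

np.savez(
    "data_train.npz",
    data_attribute=data_attribute,
    data_feature=data_feature,
    data_gen_flag=data_gen_flag,
)
print(data_feature.shape, data_gen_flag.tolist())
```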

Here I use the raw values of the amount and the start date for illustration. You will need to normalize each of them into the range [0, 1] or [-1, 1]. The README also contains some explanations of these fields.
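One possible way to do the normalization is plain min-max scaling, sketched below. The `min_max_scale` helper is hypothetical and not part of DoppelGANger; other normalization schemes may be equally valid:

```python
import numpy as np

def min_max_scale(values, low=0.0, high=1.0):
    """Linearly map values into [low, high] (hypothetical helper)."""
    values = np.asarray(values, dtype=np.float64)
    v_min, v_max = values.min(), values.max()
    if v_max == v_min:
        # Constant input: map everything to the lower bound.
        return np.full_like(values, low)
    return low + (values - v_min) * (high - low) / (v_max - v_min)

# Raw values from the example: amounts of all real records, and start days.
amounts = np.array([200, 300, 218, 242, 38])
start_days = np.array([0, 0, 3])

amounts_01 = min_max_scale(amounts)                 # scaled into [0, 1]
start_days_pm1 = min_max_scale(start_days, -1, 1)   # scaled into [-1, 1]
print(amounts_01, start_days_pm1)
```

In this sketch the minimum and maximum are computed over real records only, so the zero-padded steps do not distort the range. Keeping `v_min` and `v_max` around would let you map generated values back to the original scale.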


Let me know if anything is still unclear.

adamFinastra commented on July 29, 2024

Thank you!

AlexPars commented on July 29, 2024

Question for @fjxmlzn

In your answer you wrote "The data_attribute field is a 3x2x5 matrix of values".

Is that a typo? Should it read "The data_feature field is a 3x2x5 matrix of values"?

fjxmlzn commented on July 29, 2024

@AlexPars You are right. Thanks for identifying that! Just fixed it.
