Comments (4)
Sorry for the confusion about features and attributes. You can refer to the paper for detailed explanations.
DATA FORMULATION
A sample contains features and attributes. Attributes are the values associated with the entire sample. Features are the values that occur over time.
For your data, you can treat the records of each user as one sample, the date of the user's first record as the attribute, and the amount and tag as the features.
So,
The attribute of A0 is:
0 (assuming 2020-01-01 is the first day so that we can use integers to represent the date)
The features of A0 are:
Amount, Tag
200, green
300, blue
The attribute of A2 is:
0
The features of A2 are:
Amount, Tag
218, red
242, pink
The attribute of A3 is:
3
The features of A3 are:
Amount, Tag
38, red
Here I assume that for each user, the records always fall on consecutive days, so we only need to model the first day. If that is not the case, you can add the day difference between consecutive records of a user as an additional feature.
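The grouping described above can be sketched in plain Python. This is only an illustration: the `records` list, the calendar dates, and the names `origin`, `samples`, and `day_gaps` are hypothetical, chosen so that the result matches the attribute and feature values listed for A0, A2, and A3.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw records: (user, date, amount, tag).
# The dates are assumptions picked to reproduce the example above.
records = [
    ("A0", date(2020, 1, 1), 200, "green"),
    ("A0", date(2020, 1, 2), 300, "blue"),
    ("A2", date(2020, 1, 1), 218, "red"),
    ("A2", date(2020, 1, 2), 242, "pink"),
    ("A3", date(2020, 1, 4), 38, "red"),
]
origin = date(2020, 1, 1)  # day 0, so dates become integers

# Group records by user, ordered by date within each user.
by_user = defaultdict(list)
for user, day, amount, tag in sorted(records, key=lambda r: (r[0], r[1])):
    by_user[user].append((day, amount, tag))

samples = {}
for user, rows in by_user.items():
    days = [d for d, _, _ in rows]
    samples[user] = {
        # Attribute: one value for the whole sample (first-record day offset).
        "attribute": (days[0] - origin).days,
        # Features: one (amount, tag) pair per time step.
        "features": [(amount, tag) for _, amount, tag in rows],
        # Optional extra feature: days since the previous record (0 for the first).
        "day_gaps": [0] + [(b - a).days for a, b in zip(days, days[1:])],
    }

for user, sample in samples.items():
    print(user, sample)
```

The `day_gaps` list is only needed when a user's records are not on consecutive days.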
NUMPY FORMAT
Store the data in data_train.npz so that DoppelGANger can read it:
The data_attribute field in this case is a 3x1 matrix of values [[0], [0], [3]].
The data_feature field is a 3x2x5 matrix of values
[[[200, 1, 0, 0, 0], [300, 0, 1, 0, 0]], (one-hot encoding for tags, in the order green, blue, red, pink)
[[218, 0, 0, 1, 0], [242, 0, 0, 0, 1]],
[[38, 0, 0, 1, 0], [0, 0, 0, 0, 0]]] (zero-padding after the time-series ends)
The data_gen_flag field is a 3x2 matrix of values [[1, 1], [1, 1], [1, 0]].
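As a rough sketch, the three arrays can be built and saved with NumPy as follows. The field names and the file name data_train.npz come from the description above; the `samples` list, the tag order in `tags`, and the float32 dtype are assumptions made for illustration.

```python
import numpy as np

# Assumed tag order for the one-hot encoding.
tags = ["green", "blue", "red", "pink"]

# (attribute, [(amount, tag), ...]) for A0, A2, A3, using raw values.
samples = [
    (0, [(200, "green"), (300, "blue")]),
    (0, [(218, "red"), (242, "pink")]),
    (3, [(38, "red")]),
]

max_len = max(len(feats) for _, feats in samples)  # longest time series
dim = 1 + len(tags)                                # amount + one-hot tag

data_attribute = np.array([[attr] for attr, _ in samples], dtype=np.float32)
data_feature = np.zeros((len(samples), max_len, dim), dtype=np.float32)
data_gen_flag = np.zeros((len(samples), max_len), dtype=np.float32)

for i, (_, feats) in enumerate(samples):
    for t, (amount, tag) in enumerate(feats):
        data_feature[i, t, 0] = amount
        data_feature[i, t, 1 + tags.index(tag)] = 1.0
        data_gen_flag[i, t] = 1.0  # 1 while the series is active, 0 on padding

np.savez(
    "data_train.npz",
    data_attribute=data_attribute,
    data_feature=data_feature,
    data_gen_flag=data_gen_flag,
)
print(data_attribute.shape, data_feature.shape, data_gen_flag.shape)
```

Time steps after a series ends stay at zero in both data_feature and data_gen_flag, which gives the zero-padding shown above.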
Here I use the raw values of the amount and the start date for illustration. You will need to normalize each of them into the range [0, 1] or [-1, 1]. The README also contains some explanations of these fields.
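One possible way to do this normalization is plain min-max scaling, sketched below on the example values. Computing the minimum and maximum only over real (non-padded) time steps, and resetting padded steps to zero afterwards, are choices made for this sketch rather than something the thread prescribes; the variable names are hypothetical.

```python
import numpy as np

# Raw amounts (first feature column), generation flags, and attributes.
amount = np.array([[200, 300], [218, 242], [38, 0]], dtype=np.float32)
gen_flag = np.array([[1, 1], [1, 1], [1, 0]], dtype=np.float32)
attribute = np.array([[0], [0], [3]], dtype=np.float32)

# Min and max over valid time steps only, so padding does not affect the range.
valid = gen_flag == 1
a_min, a_max = amount[valid].min(), amount[valid].max()

amount_01 = (amount - a_min) / (a_max - a_min)  # scale to [0, 1]
amount_01[~valid] = 0.0                         # keep padded steps at zero

attr_min, attr_max = attribute.min(), attribute.max()
attribute_01 = (attribute - attr_min) / (attr_max - attr_min)

# For the range [-1, 1] instead, rescale the [0, 1] values.
attribute_pm1 = 2.0 * attribute_01 - 1.0

print(amount_01)
print(attribute_01.ravel())
```

Keep the minimum and maximum values around, since they are needed to map generated samples back to the original scale.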
Let me know if anything is still unclear.
from doppelganger.
Thank you!
Question for @fjxmlzn
In your answer you wrote: "The data_attribute field is a 3x2x5 matrix of values".
Is that a typo? Should it read "The data_feature field is a 3x2x5 matrix of values"?
@AlexPars You are right. Thanks for identifying that! Just fixed it.